Syllabus: R as a Research Tool (BIOL 6325)

Course Overview

Room and Time:

12:30-1:50 T TH, Biology 405 (Computer Lab)

Office hours:

Wednesday 2:30 – 3:30 PM or by appointment

Course Description

This is a workshop course in which I will teach the basics of the computer language “R”, an open-source, interactive software package specifically designed for scientific numerical computation. R is a language designed around a core set of statistical libraries and offers advanced statistical capabilities. The language also offers very good graphical capabilities for exploring data or preparing figures for presentations and publication.

This course will use the R language to teach basic computer programming principles that can apply to other computer languages such as Python or C++. This is not a statistics course, but is aimed at teaching you to use R for data science. Therefore, I will cover visualization, data shaping and model fitting. My goal is to provide graduate students with tools to become better users of computers — programming allows a scientist to use the computer to answer the questions that are most important to the researcher and not be limited by pre-packaged tools.

There are no prerequisites.

Expected Learning Outcomes

After completion of the course students will be able to:

  • Explain the basics of computer architecture
  • Use a text editor and produce a plain-text workflow
  • Install and find documentation for R functions and libraries. Search for and find domain-specific R packages.
  • Use and understand the R data types (vectors, matrices, data frames, strings)
  • Reshape data and use visual exploratory graphics. Practice good data management.
  • Write their own functions in R and break a problem into a set of functions.
  • Be fluent in programming concepts such as functional programming, code reuse, object-oriented programming, recursion, regular expressions, and split-transform-recombine data manipulation.
  • Engage in good code and data organization practices and use a consistent programming style.

Methods of Assessing Learning Outcomes

There will be no exams or final projects. Evaluations will be based solely on completion of weekly assignments. I will provide detailed feedback on the code students turn in each week.

Grading scale:

The final grade will be based on the average of the weekly programming assignments. Grading framework: A >= 90%; B >= 80%; C >= 70%; D >= 60%; F < 60%

Resources, required supplies

Software

I have installed R and RStudio on the machines in Biology 405. You will also want to install both on your own computer:

  1. Download and install R, https://www.r-project.org/.
  2. Download and install RStudio, http://www.rstudio.com/download.
  3. Install required packages:
     # Required packages for the course
     pkgs <- c(
       "dplyr", "ggmap", "Lahman", "lubridate", "maps", "nlme", "pbkrtest", "plyr",
       "RColorBrewer", "scales", "stringr", "tidyr")
     install.packages(pkgs)
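If some of these packages are already on your machine, reinstalling all of them is unnecessary. The following is a minimal sketch of one way to install only what is missing. The helper name `missing_packages` is hypothetical (it is not part of R or of the course materials), and the `interactive()` guard is an assumption made here so that a script run in batch mode never starts a download on its own.

```r
# Required packages, as listed in step 3 above.
pkgs <- c(
  "dplyr", "ggmap", "Lahman", "lubridate", "maps", "nlme", "pbkrtest", "plyr",
  "RColorBrewer", "scales", "stringr", "tidyr")

# Hypothetical helper: return the names in `wanted` that are not yet
# installed in any of the current library paths.
missing_packages <- function(wanted) {
  installed <- rownames(installed.packages())
  setdiff(wanted, installed)
}

to_install <- missing_packages(pkgs)

# Only download when a person is at the console (assumption, see above).
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install)
}
```

Running `missing_packages(pkgs)` again afterwards should return an empty character vector once everything has installed cleanly.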

Books

  • There is no textbook per se. Although R will be available on the lab computers in Biology 405, you are encouraged to download and install R and RStudio on your own laptop and bring your laptop to class.
  • You will NEED to use a reasonable text editor for this class. If you do not already have a favorite editing environment, I recommend RStudio, a full-fledged integrated development environment that runs on Windows, Mac and Linux. I do not use RStudio myself; I use Emacs. If you are interested in becoming an Emacs convert, I can give you information on an Emacs setup I’ve created for students (my student starter setup is at https://github.com/schwilklab/emacs-starter).
  • Many other books and web resources on R are available. In particular, books from the Springer Use R! series can be freely downloaded from the SpringerLink eBook site. (You need to be connected via the TTU network to be able to download Springer books).

    Some books that may be useful (none required):

Course Outline

Week 1 Basic R features; introduction to the main data types and visualization
Week 2 More on vectors and other data types
Week 3 Introduction to functions. More on lists and data frames
Week 4 Programming structures: relational and logical operations; flow control
Week 5 Environment and scope, more on data frames
Week 6 Math and simulations in R
Week 7 Debugging, introduction to strings and regular expressions
Week 8 Data shaping and transformation; split-transform-recombine
Week 9 Introduction to graphics
Week 10 The Grammar of Graphics
Week 11 Reshaping and tidying data, exploring large data sets
Week 12 Dates and times, statistical models in R
Week 13 Overview of main domain-specific libraries
Week 14 TBA

Assignments

Out-of-class assignments will be given on a weekly basis. For each assignment, please email me a well-documented R script. I may also ask for specific outputs, test run results, or graphs. Name each file with your last name, an underscore, your first name, a hyphen, and the assignment number, and use the “.R” extension for R scripts (e.g., Schwilk_Dylan-HW01.R). When emailing your assignment to me, please use a subject line with the following format: “R-research-tool: HW01”. In fact, please use “R-research-tool:” as a preface to the subject line for any email you send regarding the class.
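The naming convention above can be sketched in R with `sprintf()`. The function names `hw_filename` and `hw_subject` are hypothetical helpers written for illustration, and the two-digit, zero-padded assignment number is an assumption based on the “HW01” example.

```r
# Hypothetical helper: build a script name such as "Schwilk_Dylan-HW01.R".
hw_filename <- function(last, first, number) {
  # %02d pads the assignment number to two digits (assumed from "HW01").
  sprintf("%s_%s-HW%02d.R", last, first, number)
}

# Hypothetical helper: build the matching email subject line.
hw_subject <- function(number) {
  sprintf("R-research-tool: HW%02d", number)
}

hw_filename("Schwilk", "Dylan", 1)  # "Schwilk_Dylan-HW01.R"
hw_subject(1)                       # "R-research-tool: HW01"
```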

There will be 12 assignments. Each assignment will be worth ~20 points. I will drop the lowest grade so the total points available will be around 220.

  • Sept 5. Vectors and matrices
  • Sept 12. Introduction to data and visualization
  • Sept 19. Functions
  • Sept 26. Functions 2
  • Oct 3. Simulated evolution 1
  • Oct 10. Simulated evolution 2
  • Oct 17. Simulated evolution 3
  • Oct 24. Strings and regular expressions
  • Oct 31. Shaping data and using plyr
  • Nov 7. Exploring large data sets: US baby names
  • Nov 14. Tidying and reshaping data
  • Dec 2. NOTE DAY CHANGE. TBA: Own data project
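
The grading arithmetic described above (drop the lowest score, then apply the letter cutoffs from the grading scale) can be sketched as follows. This is an illustration only, not the official grade calculation: it assumes every assignment is worth exactly 20 points (the syllabus says roughly 20), the scores are made up, and the function names `final_percent` and `letter_grade` are hypothetical.

```r
# Made-up scores for 12 assignments, each assumed to be out of 20 points.
scores <- c(20, 18, 15, 19, 20, 17, 0, 20, 18, 16, 19, 20)

# Percent of available points earned after dropping the single lowest score.
final_percent <- function(scores, max_points = 20) {
  kept <- sort(scores)[-1]  # sort ascending, then remove the lowest score
  100 * sum(kept) / (max_points * length(kept))
}

# Map a percentage to a letter using the cutoffs 60, 70, 80 and 90.
letter_grade <- function(pct) {
  cutoffs <- c(60, 70, 80, 90)
  grades <- c("F", "D", "C", "B", "A")
  # findInterval() counts how many cutoffs are <= pct (0 to 4).
  grades[findInterval(pct, cutoffs) + 1]
}

pct <- final_percent(scores)  # 202 of 220 points remain after the drop
letter_grade(pct)             # "A"
```

With 12 assignments at 20 points each, dropping one leaves 11 × 20 = 220 points, which matches the total stated above.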

Special accommodation

ADA statement

Any student who, because of a disability, may require special arrangements in order to meet the course requirements should contact the instructor as soon as possible to make any necessary arrangements. Students should present appropriate verification from Student Disability Services during the instructor’s office hours. Please note instructors are not allowed to provide classroom accommodations to a student until appropriate verification from Student Disability Services has been provided. For additional information, you may contact the Student Disability Services office in 335 West Hall or 806-742-2405.

Statement about observance of religious holidays

A student who intends to observe a religious holy day should make that intention known in writing to the instructor prior to the absence. A student who is absent for the observance of a religious holy day shall be allowed to take an exam or complete an assignment scheduled for that day within a reasonable time after the absence.

Academic Honesty

It is the student’s responsibility to conduct him/herself in a civil manner while in the classroom. Please consult the university policies on academic honesty (OP 34.12) and civility.
