Syllabus: R as a Research Tool (BIOL 6325)

Course Overview

Room and Time:

11:00-12:20 T TH

Biology 405 Computer Lab

Office hours:

By appointment via Zoom or in person

Course Description

This is a workshop course in which I will teach the computer language “R”, an open-source, interactive software package specifically designed for scientific numerical computation. R is a language designed around a core set of statistical libraries and offers advanced statistical capabilities. The language also offers very good graphical capabilities for exploring data or preparing figures for presentations and publication. Additionally, the course will teach basic computer programming principles that can apply to other computer languages such as Python or C++. This is not a statistics course, but is aimed at teaching you to use R for data science. Therefore, I will cover visualization, data shaping, and model fitting in addition to the traditional programming subjects. My goal is to provide graduate students with tools to become better users of computers — programming allows a scientist to use the computer to answer the questions that are most important to the researcher and not be limited by pre-packaged tools.

There are no prerequisites.

Expected Learning Outcomes

The overarching goal of this course is to produce scientists who can efficiently and accurately investigate data and do so in an open and reproducible workflow.

After completion of the course students will be able to

  • use the console as a calculator and assign variables.
  • use and understand the R data types (vectors, matrices, dataframes, strings).
  • write functions in R and break a problem into a set of functions.
  • search for and install domain-specific R packages and use the documentation of such packages.
  • take a new data set and efficiently investigate it for patterns using data reshaping techniques and exploratory visualization.
  • produce publication-quality figures.
  • be fluent in some programming concepts especially important in data analysis.
  • have some familiarity but not necessarily fluency with other programming concepts such as functional programming style, object-oriented programming, recursion, and regular expressions.
  • engage in good code and data organization practices and use a consistent programming style.
  • use a text editor and produce a plain-text workflow.
  • understand the basics of reproducible research, including passing familiarity with version control and RMarkdown, so that the student can go on to learn more about these subjects easily.

Methods of Assessing Learning Outcomes

There will be no exams or final projects. Evaluations will be based solely on completion of weekly assignments. I will provide detailed feedback on the code students turn in each week.

Grading scale:

The final grade will be based on the average of the weekly programming assignments, computed after I drop the lowest score. Grading framework: A ≥ 90%; B ≥ 80%; C ≥ 70%; D ≥ 60%; F < 60%.
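
As a rough illustration of the grading arithmetic, the sketch below drops the lowest score, averages the rest, and maps the result to a letter using the cutoffs in the grading framework. The function names final_percent and letter_grade and the example scores are hypothetical and not part of the course materials. The sketch also assumes every assignment is scored out of the same number of points.

```r
# Hypothetical helper: percentage after dropping the single lowest score.
# Assumes all assignments are worth the same number of points.
final_percent <- function(scores, max_points = 20) {
  stopifnot(length(scores) >= 2)
  kept <- sort(scores)[-1]   # sort ascending, then remove the lowest score
  100 * mean(kept) / max_points
}

# Hypothetical helper: map a percentage to a letter grade.
# Each cutoff is the lowest percentage that earns that letter.
letter_grade <- function(pct) {
  cutoffs <- c(F = -Inf, D = 60, C = 70, B = 80, A = 90)
  names(cutoffs)[findInterval(pct, cutoffs)]
}

# Made-up scores for 13 assignments, each out of 20 points
scores <- c(18, 20, 12, 17, 19, 20, 16, 18, 20, 19, 17, 18, 20)
pct <- final_percent(scores)
letter_grade(pct)
```

With these made-up scores, the 12 is dropped and the remaining twelve scores are averaged before the letter is assigned.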

Resources, required supplies

Software

You will need to install on your own computer:

  1. Download and install R, https://www.r-project.org/.
  2. Download and install RStudio, http://www.rstudio.com/download.
  3. Install required packages:
# Packages used in the course
pkgs <- c("tidyverse", "Lahman", "lubridate", "maps", "nlme", "pbkrtest", "RColorBrewer",
          "scales")
install.packages(pkgs)

Books

Course Outline

Week 1 Basic R features
Week 2 Introduction to the main data types and visualization
Week 3 Introduction to functions. More on lists and data frames
Week 4 Programming structures: relational and logical operations; flow control
Week 5 Loops; more on data frames and factors
Week 6 Math and simulations in R
Week 7 Debugging, introduction to strings
Week 8 Strings and regular expressions
Week 9 The Grammar of Graphics
Week 10 Reshaping data, split-apply-recombine
Week 11 Tidy data, dates and times
Week 12 Color and figures, statistical models
Week 13 Markup languages and version control
Week 14 Advanced data sources, overview of domain-specific libraries

Assignments

Out-of-class assignments will be given on a weekly basis and are due each Monday. For each assignment, please turn in a well-documented R script via email. I may also ask for specific outputs, example results, or graphs. Name each file starting with your last name, an underscore, your first name, then a hyphen and the homework name. Use the “.R” extension for R scripts (e.g., if I were to submit the first assignment, I would name the file Schwilk_Dylan-HW01.R). When emailing your assignment to me, please use a subject line with the following format: “R-research-tool: HW01”. In fact, please use “R-research-tool:” as a prefix to the subject line of any email you send regarding the class.
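
The naming convention can also be written as a small R sketch. The helper names hw_filename and hw_subject are hypothetical and only illustrate the pattern; they are not part of any course package.

```r
# Hypothetical helper: build the file name "Last_First-HWnn.R".
hw_filename <- function(last, first, hw_number) {
  sprintf("%s_%s-HW%02d.R", last, first, hw_number)
}

# Hypothetical helper: build the email subject line for an assignment.
hw_subject <- function(hw_number) {
  sprintf("R-research-tool: HW%02d", hw_number)
}

hw_filename("Schwilk", "Dylan", 1)   # "Schwilk_Dylan-HW01.R"
hw_subject(1)                        # "R-research-tool: HW01"
```

The %02d format pads the homework number with a leading zero, so files sort in order from HW01 through HW13.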

There will be 13 assignments. Each assignment will be worth ~20 points. I will drop the lowest grade so the total points available will be around 240.

Sept 5 HW01 Vectors and matrices
Sept 12 HW02 Introduction to data and visualization
Sept 19 HW03 Functions 1
Sept 26 HW04 Functions 2
Oct 3 HW05 Simulated evolution 1: write functions
Oct 10 HW06 Simulated evolution 2: explain functions
Oct 17 HW07 Not for submission: Simulated evolution 3
Oct 24 HW08 Strings
Oct 31 HW09 Data visualization with ggplot2
Nov 7 HW10 Shaping data and dplyr
Nov 14 HW11 Exploring large data sets: US baby names
Nov 21 HW12 Tidying and reshaping data
Dec 5 HW13 Your own data project (you have 2 weeks to complete)

Academic Honesty

It is the student’s responsibility to conduct him/herself in a civil manner while in the classroom. Please consult the university policies on academic honesty (OP 34.12) and civility.

I do not tolerate any plagiarism. You must write your own code. If you use code from any other resource (a classmate, a friend, the internet), you must carefully cite that in your code. I will teach you how to use the internet to help you find solutions in the course, but do not blindly google! You are meant to complete the assignments in this course using what you have learned up to that point in the course.

TTU required Academic Integrity Statement

Academic integrity is taking responsibility for one’s own class and/or course work, being individually accountable, and demonstrating intellectual honesty and ethical behavior. Academic integrity is a personal choice to abide by the standards of intellectual honesty and responsibility. Because education is a shared effort to achieve learning through the exchange of ideas, students, faculty, and staff have the collective responsibility to build mutual trust and respect. Ethical behavior and independent thought are essential for the highest level of academic achievement, which then must be measured. Academic achievement includes scholarship, teaching, and learning, all of which are shared endeavors. Grades are a device used to quantify the successful accumulation of knowledge through learning. Adhering to the standards of academic integrity ensures grades are earned honestly. Academic integrity is the foundation upon which students, faculty, and staff build their educational and professional careers. [Texas Tech University (“University”) Quality Enhancement Plan, Academic Integrity Task Force, 2010].

Accommodations and illness

Illness and COVID-19

First: please contact me if you need to miss class or if you need flexibility in any assignment due dates. I will endeavor to help you out.

The University will continue to monitor CDC, State, and TTU System guidelines concerning COVID-19. Any changes affecting class policies or temporary changes to delivery modality will be in accordance with those guidelines and announced as soon as possible. Students will not be required to purchase specialized technology to support a temporary modality change, though students are expected to have access to a computer to access course content and course-specific messaging.

Students can find information about COVID testing, vaccinations, isolation, and quarantine at https://www.depts.ttu.edu/communications/emergency/coronavirus/.

If you test positive for COVID-19, report your positive test through TTU’s reporting system: https://www.depts.ttu.edu/communications/emergency/coronavirus/. Once you report a positive test, the portal will automatically generate a letter that you can distribute to your professors and instructors.

ADA statement

Any student who, because of a disability, may require special arrangements in order to meet the course requirements should contact the instructor as soon as possible to make any necessary arrangements. Students should present appropriate verification from Student Disability Services during the instructor’s office hours. Please note: instructors are not allowed to provide classroom accommodations to a student until appropriate verification from Student Disability Services has been provided. For additional information, please contact Student Disability Services in West Hall or call 806-742-2405.

Statement about observance of religious holidays

“Religious holy day” means a holy day observed by a religion whose places of worship are exempt from property taxation under Texas Tax Code §11.20. A student who intends to observe a religious holy day should make that intention known in writing to the instructor prior to the absence. A student who is absent from classes for the observance of a religious holy day shall be allowed to take an examination or complete an assignment scheduled for that day within a reasonable time after the absence. A student who is excused under section 2 may not be penalized for the absence; however, the instructor may respond appropriately if the student fails to complete the assignment satisfactorily.

Additional TTU recommended syllabus statements
