Data Science: Productivity Tools

Keep your projects organized and produce reproducible reports using GitHub, git, Unix/Linux, and RStudio.

In this online course taught by Harvard Professor Rafael Irizarry, learn to keep your projects organized and produce reproducible reports using GitHub, git, Unix/Linux, and RStudio.

Self-Paced
Length: 8 weeks, 1-2 hours a week
Certificate Price: $149

What You'll Learn

A typical data analysis project may involve several parts, each including several data files and different scripts with code. Keeping all this organized can be challenging.

Part of our Professional Certificate Program in Data Science, this course explains how to use Unix/Linux as a tool for managing files and directories on your computer and how to keep the file system organized. You will be introduced to the version control system git, a powerful tool for keeping track of changes in your scripts and reports. We also introduce you to GitHub and demonstrate how you can use this service to keep your work in a repository that facilitates collaboration.

Finally, you will learn to write reports in R Markdown, which lets you combine text and code in a single document. We'll put it all together using RStudio, a powerful integrated development environment.

The course is delivered via edX and connects learners around the world. By the end of the course, participants will know:

  • How to use Unix/Linux to manage your file system
  • How to perform version control with git
  • How to start a repository on GitHub
  • How to leverage the many useful features provided by RStudio

Your Instructors

Rafael Irizarry

Professor of Biostatistics at Harvard University

Ways to take this course

When you enroll in this course, you will have the option of pursuing a Verified Certificate or Auditing the Course.

A Verified Certificate costs $149 and provides unlimited access to full course materials, activities, tests, and forums. At the end of the course, learners who earn a passing grade can receive a certificate. 

Alternatively, learners can Audit the course for free and have access to select course material, activities, tests, and forums. Please note that this track does not offer a certificate for learners who earn a passing grade.

Introduction to Linear Models and Matrix Algebra

Perform matrix operations

Learn to use R programming to apply linear models to analyze data in life sciences.

Data Science: Inference and Modeling

Key concepts through a motivating case study

Learn inference and modeling: two of the most widely used statistical tools in data analysis.

Data Science: Linear Regression

Implement linear regression and adjust for confounding in practice using R.

Learn how to use R to implement linear regression, one of the most common statistical modeling approaches in data science.