Data Science Principles
Are you prepared for our data-driven world?
Data Science Principles is a Harvard Online course that gives you an overview of data science with a code- and math-free introduction to prediction, causality, data wrangling, privacy, and ethics.
4-5 hours per week
What You'll Learn
What is data science, and how can it help you make sense of the infinite data, metrics, and tools that are available today?
Data science is at the core of any growing modern business, from health care to government to advertising and more. Insights gained from collecting and analyzing data have the potential to increase the quality, effectiveness, and efficiency of work in professional and personal settings.
Data Science Principles makes the foundational topics in data science approachable and relevant by using real-world examples that prompt you to think critically about applying these understandings to your workplace. Get an overview of data science with a nearly code- and math-free introduction to prediction, causality, visualization, data wrangling, privacy, and ethics.
Data Science Principles is an introductory data science course for anyone who wants to positively impact outcomes and understand insights from their company’s data collection and analysis efforts. This online certificate course will prepare you to speak the language of data science and contribute to data-oriented discussions within your company and daily life. It is designed for beginners and managers who want to better understand what data science is and how to work with data scientists.
Data Science Principles is part of our Harvard on Digital Learning Path.
The Harvard on Digital course series provides the frameworks and methodologies to turn data into insight, technologies into strategy, and opportunities into value, and to lead responsibly with data-driven decision making.
The course will be delivered via HBS Online’s course platform and immerse learners in real-world examples from experts at industry-leading organizations. By the end of the course, participants will be able to:
- Understand the modern data science landscape and technical terminology for a data-driven world
- Recognize major concepts and tools in the field of data science and determine where they can be appropriately applied
- Appreciate the importance of curating, organizing, and wrangling data
- Explain uncertainty, causality, and data quality—and the ways they relate to each other
- Predict the consequences of data use and misuse and know when more data may be needed or when to change approaches
Dustin Tingley is a data scientist at Harvard University. He is Professor of Government and Deputy Vice Provost for Advances in Learning, and he helps to direct Harvard's education-focused data science and technology team. Professor Tingley has helped a variety of organizations use the tools of data science, and he has helped to develop machine learning algorithms and accompanying software for the social sciences. He has written on a variety of topics using data science techniques, including education, politics, and economics.
Real World Case Studies
Affiliations are listed for identification purposes only.
Listen to a Harvard professor and faculty member at Boston Children’s Hospital analyze Google Flu, its failures, and the lessons learned.
Explore the difficulties of keeping data anonymous and private with a Harvard professor and Director of the Data Privacy Lab in IQSS at Harvard.
Learn how Burning Glass Technologies uses text analysis to recommend job openings, skill development, and labor market trends.
Who Will Benefit
"This is a topic that people in any industry should have at least basic knowledge of in order to create more efficient and competitive businesses, tools, and resources."
Carlos E. Sapene
CEO, Chief Strategy Officer
"I found value in the real-world examples in Data Science Principles. With complicated topics and new terms, it's especially beneficial for learners to be able to tie back new or abstract concepts to ideas that we understand. This course helped me understand data in this context and what algorithms are actually trying to solve."
Financial Services Analyst
"Data Science Principles applies to many aspects of our daily lives. The course helps guide people in everyday life through decision making and process thinking."
Senior Director of Sales
Data Science Principles makes the fundamental topics in data science approachable and relevant by using real-world examples and prompting learners to think critically about applying these new understandings to their own workplace. Get an overview of data science with a nearly code- and math-free introduction to prediction, causality, visualization, data wrangling, privacy, and ethics.
- Study a flu detection case study alongside Professor Dustin Tingley and Mauricio Santillana, Assistant Professor at Harvard’s T.H. Chan School of Public Health.
- Explain why data collection is important.
- Identify factors that may affect data quality.
- Recognize that not all data is numerical.
- Explain how the organization of data can affect the information you are able to extract from it.
- Study a sepsis prediction case alongside Craig Umscheid, Vice President and Chief Quality and Innovation Officer, University of Chicago Medicine.
- Understand the basic structure of a predictive algorithm.
- Identify where human decisions shape predictive systems.
- Evaluate the success of a predictive system.
- Study The Google Tax Case.
- Explain why it is important to establish causal relationships.
- Identify barriers to establishing causal relationships in a variety of settings.
- Identify why randomization can help establish a causal relationship but also create other problems.
- Explore a privacy and facial recognition case study with Latanya Sweeney, Professor of the Practice of Government and Technology at the Harvard Kennedy School and in the Harvard Faculty of Arts and Sciences, director and founder of the Public Interest Tech Lab, and director and founder of the Data Privacy Lab.
- Explain why data privacy is important.
- Describe what can constitute a violation of privacy.
- Critique existing privacy policies.
- Create a set of ethical tenets to guide data work at their own organizations.
- Study the Burning Glass and Text Data case.
- Identify sources of non-numerical data.
- Explain why it would be useful to use non-numerical data.
- Describe the differences in approach for supervised and unsupervised learning.
- Identify use cases for neural networks.
- Explore a case study on reducing food waste with Shelf Engine.
- Describe some algorithms commonly used in data science.
- Understand basic workhorse algorithms in data science such as regression.
- Explain why and how such tools are made substantially more complex.
- Explain the crucial role humans have in overseeing and maintaining algorithms.
- Explain some of the trade-offs between more sophisticated algorithms, including the costs of running and evaluating their success.
- Learn about the Harvard Link case study.
- Explain the importance of data transformation and wrangling.
- List the common technologies used within data science ecosystems.
- Describe the connection between data science tasks, software tools, and hardware tools.
- Identify potential sources of bottlenecks in the data science process.
- Work on a health care prioritization case study.
- Recognize a problem that an algorithm might be able to solve.
- Recognize the challenges created by using data science tools in ways outside their intended use.
- Identify steps within the data science process that need auditing.