STATISTC/COMPSCI 190F: Foundations of Data Science


Course Description: The field of Data Science encompasses methods, processes, and systems that enable the extraction of useful knowledge from data. Foundations of Data Science introduces core data science concepts including computational and inferential thinking, along with core data science skills including computer programming and statistical methods. The course presents these topics in the context of hands-on analysis of real-world data sets, including economic data, document collections, geographical data, and social networks. The course also explores social issues surrounding data analysis such as privacy and design.

Intended Audience: Students should consult the Introductory Programming Course guide for more information on selecting between introductory programming courses. This course is designed to serve as an introductory programming course for a broad audience. A block of seats is specifically being held for entering freshmen, but upper-year students are also welcome in the course. Students who do not intend to take COMPSCI 186 or 187 should strongly consider taking this course for an introduction to programming instead of COMPSCI 121. The course is recommended for students whose primary major touches on numerical or statistical topics or involves aspects of data analysis including:

  • Biology
  • Biochemistry
  • Chemistry
  • Economics
  • Earth Systems
  • Environmental Science
  • Finance
  • Geography
  • Geology
  • Kinesiology
  • Linguistics
  • Management
  • Microbiology
  • Political Science
  • Public Health
  • Psychology
  • Sociology

Prerequisites and Eligibility: A strong mathematical background is required as evidenced by a score of 20 or higher on the math placement exam, or completion of the R1 general education requirement, or completion of any of Math 101 & 102, Math 104, 127, 128, 131, or 132. The course is open to students from all majors except for Computer Science, Informatics, and Mathematics & Statistics, who must take COMPSCI 121 (or an equivalent course).

Course Structure and Capacity: The course is listed as STATISTC 190F and COMPSCI 190F and will be co-taught by Prof. Patrick Flaherty from the Department of Mathematics & Statistics and Prof. Benjamin Marlin from the College of Information and Computer Sciences. Each primary section has a maximum enrollment of 100 students. The two primary course sections meet together for lectures, but each has multiple associated 30-student lab sections. Students should pick the lab section that fits their schedule and register for the corresponding primary course section.

Computing and Course Materials: The course provides a free, web-accessible platform for students to complete programming assignments and lab work. The lab sections COMPSCI 190F-01LQ and STATISTC 190F-01LQ also provide computer workstations for students to use during labs. Students registered for other lab sections will need to bring a laptop computer with a reasonably up-to-date web browser. No special software is required. The course uses a freely available online text, Computational and Inferential Thinking by Ani Adhikari and John DeNero, as the primary text.

Credits and Designations: The course is 4 credits and satisfies a “technical” course requirement for the Information Technology Minor.