Lecture and section information
INFO 1998, Fall 2020
Lecture time: Wed 5:30pm - 6:30pm EST
Lecture location: Hosted on Zoom starting Mar. 18
Staff and office hours
For the remainder of the semester, office hours will be held over Zoom. Office hour times are listed on the front page of the course website. Clicking on a time takes you to the corresponding Zoom meeting (Cornell authentication required).
If none of these office hours fit into your schedule, please email the course staff or post a private Piazza note, and we can set up additional office hours.
The goal of this course is to give you high-level exposure to a wide range of data science techniques and machine learning models so that you can solve real problems with machine learning. The course covers getting set up, manipulating and visualizing large datasets, building supervised and unsupervised machine learning models, and discussing the various applications of these methods in the real world. If you follow the course closely throughout the semester, you should come away with a high-level, intuitive understanding of how data problems can be tackled. You can apply the quick, implementation-oriented toolkit you develop to a variety of fields and problems.
If you are interested in a solid mathematical foundation for data science and machine learning, this class is not sufficient by itself. It should, however, serve as a head start.
No prerequisites. Basic Python experience (at the level of CS 1110) is encouraged.
- We will be working together on in-class assignments/exercises during lectures, so please bring a laptop (or tablet) to fully participate.
- You will need a conda environment and/or virtualenv set up with the necessary Python libraries.
- Please refer to the Getting Started page for more information.
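Once your environment is set up, a quick check like the one below can confirm that the libraries are importable. This is only a sketch: the syllabus does not list the required libraries, so the package names in `ASSUMED_PACKAGES` are an assumption (common data science libraries), and the Getting Started page remains the authoritative list.

```python
import importlib.util

# Hypothetical list: the syllabus does not name the required libraries.
# Replace these with the packages listed on the Getting Started page.
ASSUMED_PACKAGES = ["numpy", "pandas", "matplotlib", "sklearn"]


def missing_packages(names):
    """Return the top-level module names that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(ASSUMED_PACKAGES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All assumed packages are importable.")
```

Run it with the Python interpreter from the conda environment or virtualenv you plan to use in class, since a different interpreter may see a different set of installed packages.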
Class material will be posted on our course website, including the assignments, lecture slides, notes, and demos.
We will use CMS for assignment and project submissions and for feedback.
One assignment will be released at the end of lecture each week and is due at the beginning of the next lecture. You may skip up to one assignment during the semester.
Mid-Semester Group Project
There will be one mid-semester project, focused on data cleaning, data manipulation, and data visualization.
Final Group Project
The final group project is meant to be a culmination of all the knowledge and techniques you acquired throughout the semester. This is your chance to showcase how much you’ve learned.
Feedback and Grade Postings
We will be providing you with feedback on the Cornell University Course Management System (CMS). We will grade your work within 8 days of the due date.
This is a 1-credit S/U class. In order to get a Satisfactory (S) grade, you will need at least 70%.
There are three components to grading:
- Weekly Assignments (50%)
- Mid-Semester Group Project (15%)
- Final Group Project (35%)
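As a rough illustration of how these weights combine, the sketch below computes a weighted total and compares it with the 70% threshold. It assumes that each component score is already expressed as a percentage and that the threshold applies to the weighted total; the syllabus does not say how a skipped assignment is counted, so that is not modeled. The scores in `example` are made up.

```python
# Component weights from the syllabus, as integer percentages.
WEIGHTS = {"assignments": 50, "midterm_project": 15, "final_project": 35}

# Minimum weighted total for a Satisfactory (S) grade.
PASS_THRESHOLD = 70


def weighted_total(scores):
    """Combine per-component percentages (0-100) into one weighted percentage."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def su_grade(scores):
    """Return "S" if the weighted total meets the threshold, otherwise "U"."""
    return "S" if weighted_total(scores) >= PASS_THRESHOLD else "U"


# Made-up scores for illustration only.
example = {"assignments": 80, "midterm_project": 60, "final_project": 70}
print(weighted_total(example), su_grade(example))  # 73.5 S
```

Here the total is 0.50 × 80 + 0.15 × 60 + 0.35 × 70 = 40 + 9 + 24.5 = 73.5, which is above the threshold.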
This is a student-run course, so we understand how stressful classes can get. Above all, we want you to enjoy learning and applying the course content. If you are concerned about passing this class, or have a reasonable case for a deadline extension, please reach out to the course manager or post a private note on Piazza immediately. We would love to see you succeed, but we can only help if you notify us in time.
Attendance is required. We discuss the answers to the weekly assignments in class, so coming to lectures should help you earn a fairly high score on them.
All Cornell students are expected to follow the Cornell University Code of Academic Integrity (http://cuinfo.cornell.edu/aic.cfm). Do not refer to notes from previous semesters or to data science projects available online. Our instructors have caught such cases in the past, and the penalty for plagiarism is an Unsatisfactory (U) grade.