Machine Learning I

Time: Tue/Fri 3:25-5:05 ET/Boston

Location: Online (See Canvas for connection info.)

John Rachlin

Associate Teaching Professor, Northeastern University


Teaching Assistants

Name                  Office Hours (See Piazza for Updates)   Zoom
Sravya Burugu         See Piazza                              Zoom
Shruti Kunapuli       See Piazza                              Zoom
Tianhao Qu            See Piazza                              Zoom
Pulkit Saharan        See Piazza                              Zoom
Anurag Sarkar         See Piazza                              Zoom
Arif Waghbakriwala    See Piazza                              Zoom

Course Information

Course Description

Provides an introduction to machine learning. Introduces core machine learning principles and methods, providing a foundation for learning more advanced techniques. Covers traditional supervised learning models for both classification and regression, and corresponding optimization methods. Discusses issues of data use and broader ethical considerations inherent to machine learning. Lectures provide in-depth presentation of such content, to be reinforced with homeworks and/or projects. 4 credit hours.

Recommended Reading

Here are a few books on Machine Learning, Mathematics, and Programming that I have found both useful and insightful and from which I draw material for the course. I made every effort to select texts that are both popular and cost-effective.

Core Concepts
[ISL] James, Witten, Hastie, and Tibshirani (2021). An Introduction to Statistical Learning (2ed). (statlearning.com)
[Hun] Burkov (2019). The Hundred-Page Machine Learning book. (themlbook.com)
[Grok] Serrano (2021). Grokking Machine Learning. (O'Reilly E-Books)

Python Programming
[Py15] Downey (2015). Think Python. (greenteapress.com)
[Py20] Deitel and Deitel (2020). Intro to Python for Computer Science and Data Science. (O'Reilly E-Books)

Mathematics
[Math] Deisenroth, Faisal, Ong (2020). Mathematics for Machine Learning. (mml-book.github.io)

Ethics
[Align] Christian (2020). The Alignment Problem: Machine Learning and Human Values. (Amazon)

Homework

There will be one assignment about every two weeks. Programming exercises will be done in Python. The detailed dates are listed on the schedule below. Homeworks should be completed largely on your own. Informal discussions with fellow students, or seeking general help from them, are OK so long as you cite your sources. Do not simply copy another student's submission! In addition, as part of your homework, there may be occasional take-home quizzes to verify your understanding of the reading material.

Homework Late Policy:
  • Up to 24 hours late: 5% penalty
  • Up to 48 hours late: 10% penalty
  • After 48 hours: Not accepted.
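
The late policy above can be sketched as a small lookup. This is only an illustration: the function name is hypothetical, and it assumes each window includes its upper boundary (exactly 24 hours late still falls in the 5% tier). It returns the penalty tier only, because the policy does not spell out how the penalty is applied to a score.

```python
from typing import Optional


def late_penalty_percent(hours_late: float) -> Optional[int]:
    """Return the late penalty in percent, or None if the work is not accepted.

    Assumption: the boundaries are inclusive, so 24.0 hours late is still
    in the 5% tier and 48.0 hours late is still in the 10% tier.
    """
    if hours_late <= 0:
        return 0       # submitted on time
    if hours_late <= 24:
        return 5       # up to 24 hours late
    if hours_late <= 48:
        return 10      # up to 48 hours late
    return None        # after 48 hours: not accepted


for hours in (0, 3, 24, 30, 48, 49):
    print(hours, late_penalty_percent(hours))
```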

Class Project

There will be a comprehensive group project, presented at the end of the semester, that uses a non-Kaggle integrated dataset of your own choosing. The project will involve exploratory data analysis and visualization and the design and development of an appropriate machine-learning model for classification and prediction.

Academic Misconduct

Programming is a creative process. Individuals must reach their own understanding of problems and discover paths to their solutions. During this time, discussions with friends and colleagues are absolutely encouraged—you will do much better in the course, and at Northeastern, if you find people with whom you regularly discuss problems. But those discussions should take place verbally. If you simply copy code, you're breaking the rules. Each program/application must be entirely your own work. The university's academic integrity policy discusses actions regarded as violations and consequences for students: http://www.northeastern.edu/osccr/academic-integrity

Evaluation

The final grade for this course will be weighted as follows:

  • Homework: 80%
  • Group Project: 20%

Final grades will be assigned based on the following scale. Any curving or changes will be to the benefit of students. Final grades will be rounded to the nearest integer (e.g., 94.4999 rounds to 94, whereas 94.5000 rounds to 95).

Letter   Range
A        95 - 100
A-       90 - 94
B+       87 - 89
B        83 - 86
B-       80 - 82
C+       77 - 79
C        73 - 76
C-       70 - 72
D+       67 - 69
D        63 - 66
D-       60 - 62
F        < 60
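
The weighting, rounding, and letter scale above can be sketched in Python as follows. This is a minimal illustration, not the official grade calculation: the function names are hypothetical, homework and project scores are assumed to be percentages from 0 to 100, and no curving is applied. Note that Python's built-in round() rounds halves to the nearest even integer (round(94.5) gives 94), so the sketch rounds half up explicitly to match the 94.5000 example.

```python
import math

# Weights and cutoffs copied from the syllabus.
HOMEWORK_WEIGHT = 0.80
PROJECT_WEIGHT = 0.20
CUTOFFS = [(95, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
           (77, "C+"), (73, "C"), (70, "C-"), (67, "D+"), (63, "D"),
           (60, "D-")]


def round_half_up(score: float) -> int:
    """Round to the nearest integer, with .5 always rounding up."""
    return math.floor(score + 0.5)


def final_percent(homework: float, project: float) -> int:
    """Weighted final percentage (assumes both inputs are on a 0-100 scale)."""
    return round_half_up(HOMEWORK_WEIGHT * homework + PROJECT_WEIGHT * project)


def letter_grade(percent: int) -> str:
    """Map a rounded percentage to a letter using the scale above."""
    for cutoff, letter in CUTOFFS:
        if percent >= cutoff:
            return letter
    return "F"


print(letter_grade(round_half_up(94.4999)))  # A-
print(letter_grade(round_half_up(94.5)))     # A
print(letter_grade(final_percent(90, 80)))   # B+
```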

Schedule

Note: This schedule is subject to change and will be adjusted as needed throughout the semester.

Week 1 (Jan 10/13): Introduction and Core Concepts
  • About the course
  • What is ML?
  • Python Software Environment
  • History, Applications, Examples
  • Lab01: Numpy Arrays
  Reading: Grok 1, ISL 1,2

Week 2 (Jan 17/20): Concepts
  • Accuracy-Variance Tradeoff
  • Lab02: Visualization
  • Lab03: Pandas Dataframe / Array Processing
  Reading: Grok 2, ISL 3

Week 3 (Jan 24/27): Gradient Descent
  • Calculus Review
  • The Gradient Descent Algorithm
  • Batch and Stochastic Gradient Descent
  • Lasso and Ridge Regularization
  • Lab04: Linear Regression
  • Lab05: Gradients
  • Lab06: Polynomial Regression
  Reading: Grok 3,4
  HW Due: HW1:LinReg (due 1/27)

Week 4 (Jan 31/Feb 3): Binary Classifiers
  • Perceptrons
  • Model Evaluation
  • Lab07: Perceptrons
  • Lab08: Confusion Matrices
  Reading: Grok 5,7

Week 5 (Feb 7/10): Probabilistic Classifier
  • Probability Review
  • Bayes Theorem
  • Naive Bayes Algorithm
  • "Spam, spam, spam, spam!"
  Reading: Grok 8

Week 6 (Feb 14/17): Nearest Neighbor
  • Perspicuity
  • Distance metrics
  • Hyperparameter tuning
  • Cross validation
  • Lab09: Vectorizing word lists
  • Lab10: Cosine Similarity
  Reading: Grok 13
  HW Due: HW2:Perceptron (due 2/15)

Week 7 (Feb 21/24): Decision Trees
  Reading: Grok 9

Week 8 (Feb 28/Mar 3): Ensemble Methods
  • Bagging
  • AdaBoost
  • Gradient boosting
  • XGBoost
  Reading: Grok 12
  HW Due: HW3:Naive+KNN (due 3/2; late due date 3/12)

Week 9 (Mar 7/10): Spring Break - No Class

Week 10 (Mar 14/17): Deep Learning
  Reading: Grok 10, ISL 10
  HW Due: PROJECT PROPOSAL (due 3/17)

Week 11 (Mar 21/24): Clustering
  Reading: ISL 12
  HW Due: HW4:Trees (due 3/22)

Week 12 (Mar 28/31): Genetic Algorithms
  Reading: TBA

Week 13 (Apr 4/7): Ethics of Machine Learning
  Reading: TBA
  HW Due: HW5:NeuralNet (due 4/5)

Week 14 (Apr 11/14): Project Science Fair
  HW Due: PROJECT DUE (Apr 10 @ 11:59pm)

Week 15 (Apr 18): Wrap up

Inclusive Class

Northeastern University values the diversity of our students, staff, and faculty, recognizing the important contribution each makes to our unique community.

Respect is demanded at all times throughout this course. In the classroom, participation is required, and everyone is expected to treat others with dignity and respect. We realize that everyone comes from a different background, with different experiences and abilities. Our knowledge will always be used to better everyone in the class.

We strive to create a learning environment that is welcoming to students of all backgrounds. If you feel unwelcome for any reason, please let us know so we can work to make things better. You can let us know by talking to anyone on the teaching staff. If you feel uncomfortable talking to members of the teaching staff, please consider reaching out to your academic advisor.

Northeastern is committed to providing equal access and support to all qualified students through the provision of reasonable accommodations so that each student may fully participate in the learning experience. If you have a disability that requires accommodations, please contact the Disability Resource Center (http://www.northeastern.edu/drc/, DRC@northeastern.edu, 617-353-2675). Accommodations cannot be made retroactively, and a letter from the DRC or LDP is required to receive an accommodation.