Announcements
- • Prerequisites: CS 5800 or CS 7800 with a minimum grade of C-. I will not enforce these prerequisites this year. However, note that you are taking the course at your own risk: I will assume the knowledge of algorithms, as well as the intellectual capacity and work ethic, of a student who passed such a course. I cannot add you to the course directly. If the system blocks you from enrolling because of the prerequisites, please show this note to a person who can enroll you.
- • First class: Friday, September 10
- • Thanksgiving break: November 24-26
- • Midterm exam: Week 8, Friday, in class
- • Final exam: Friday, December 10, in class
- • Mini project report due: Tuesday, December 14
- • Computing resources at Northeastern: request access to the Discovery cluster.
———————————————————————
Last updated: December 5, 2021
Weekly Schedule
———————————————————————
Week 14, December 6
Final exam on Friday, December 10, in class.
Topics
- • Privacy-preserving data mining
- • Review for final exam
Reading materials
- • A book chapter by Stan Matwin here
Handouts and code
- • ICML 2010 slides by Stan Matwin here
- • Modified slides used in class here
Mini project report instructions: available here
———————————————————————
Week 13, November 29
Topics
- • Model-based clustering
- • Graph-based clustering
Reading materials
- • Tan et al.: Cluster analysis: additional issues and algorithms (Chapter 9)
Handouts and code
———————————————————————
Week 12, November 22
Thanksgiving break!
Topics
- • Sequential pattern mining
- • Frequent subgraph mining
Reading materials
- • Tan et al.: Association rules: advanced concepts and algorithms (Chapter 7)
Handouts and code
- • Advanced association rules analysis slides
———————————————————————
Week 11, November 15
Friday: guest lecture (Tech Talk)
Topics
- • Principal component analysis (PCA)
- • Committee machines
Handouts and code
- • Principal component analysis slides
- • Committee machines slides
Homework assignment #4: available here
———————————————————————
Week 10, November 8
Topics
- • Deep networks
- • Committee machines
Handouts and code
Class evaluation
———————————————————————
Week 9, November 1
Topics
- • Neural networks
- • Kernel machines
Reading materials
- • Tan et al.: Classification: alternative techniques (Chapter 5)
Handouts and code
- • Neural network slides
- • Kernel machines slides (last modified: 11/12/2021)
———————————————————————
Week 8, October 25
Midterm Exam on Friday
Topics
Reading materials
- • Tan et al.: Classification: alternative techniques (Chapter 5)
Handouts and code
———————————————————————
Week 7, October 18
Topics
- • Rule-based classifiers
- • Naive Bayes
Reading materials
- • Tan et al.: Classification: alternative techniques (Chapter 5)
Handouts and code
———————————————————————
Week 6, October 11
Topics
- • Advanced concepts in K-means clustering
- • Association rule mining: assessing interestingness of rules
Reading materials
Handouts and code
Homework assignment #3: available here
———————————————————————
Week 5, October 4
Topics
- • Evaluating clustering
- • Association rule mining
Reading materials
- • Tan et al.: Cluster analysis: basic concepts and algorithms (Chapter 8)
- • Tan et al.: Association analysis (Chapter 6)
- • Wu & Kumar: Apriori (Chapter 4)
Handouts and code
- • Chapter 6 slides (last modified: 10/12/2021)
Homework assignment #2: available here
———————————————————————
Week 4, September 27
Topics
- • K-means clustering
- • Hierarchical clustering
Reading materials
- • Tan et al.: Cluster analysis: basic concepts and algorithms (Chapter 8)
- • Wu & Kumar: K-means (Chapter 2)
Handouts and code
- • Chapter 8 slides (last modified: 10/05/2021)
———————————————————————
Week 3, September 20
Topics
- • Evaluation of classification models
- • Data and data preprocessing
Reading materials
- • Tan et al.: Data (Chapter 2)
Handouts and code
- • Chapter 2 slides (last modified: 09/21/2021)
———————————————————————
Week 2, September 13
Topics
Reading materials
- • Tan et al.: Basic classification (Chapter 4)
- • Wu & Kumar: C4.5 (Chapter 1)
Handouts and code
- • Chapter 4 slides (last modified: 09/17/2021)
Homework assignment #1: available here
———————————————————————
Week 1, September 6
Topics
- • Class overview and logistics
- • Introduction to data mining
Reading materials
- • Tan et al.: Introduction (Chapter 1)
Handouts and code
———————————————————————