CS 6220: Data Mining Techniques

Class Meets

When: Tuesdays and Fridays, 3:25pm-5:05pm

Where: Science Engineering Complex 138 (see campus map)

Course Summary

Data mining is a practical discipline that aims to identify interesting patterns and relationships hidden in data. It has emerged as one of the most dynamic fields of computer science because the relationships discovered in data can be used to predict future outcomes. This course introduces the fundamental concepts of data mining and provides hands-on experience with several techniques. Students are expected to develop a broad background in the field and the skills to solve practical problems. Problems will be drawn from various fields, such as fraud detection, e-commerce, the stock market, medicine, and the life sciences. The class will have multiple programming assignments. Students are expected to have good programming skills in a language such as Matlab, Python, or R; no particular programming language is mandatory.

Prerequisites

CS 5800 or CS 7800 with a minimum grade of C-. 

Class Materials

Textbook: 

Introduction to Data Mining by Tan et al., Pearson 2019.

Recommended books: 

The Top Ten Algorithms in Data Mining by Wu and Kumar, CRC Press 2009.

Data Mining: Concepts and Techniques by Han et al., Morgan Kaufmann 2006.

Supplementary materials: to be provided in class.

Topics

Grading

Late Policy and Academic Honesty

All assignments and exams are individual, except when collaboration is explicitly allowed. All sources used in solving a problem must be acknowledged, e.g., websites, books, research papers, and personal communication with other people. Academic honesty is taken seriously; for detailed information, see the Office of Student Conduct and Conflict Resolution.

Last updated: July 28, 2021