CS6140 12F: Homework 02

Assigned: Wednesday, September 18, 2013
Due: Wednesday, October 02, 2013


General Instructions

  1. Feel free to discuss this assignment with others. However, you must acknowledge with whom you discussed the assignment, you must write your own code, and you must create your own report.


Assignment

In this assignment, you will create a Naive Bayes classifier for detecting e-mail spam, and you will test your classifier on a publicly available spam dataset. You will experiment with three methods for modeling the distribution of features, and you will test your classifier using 10-fold cross-validation.

Extra Credit: Implement (A)ODE, i.e., (Averaged) One-Dependence Estimators, and compare and contrast its results (ROC, AUC, etc.) and the probabilities it assigns to the instances with those of Naive Bayes. Consider, for example, a scatter plot of Naive Bayes probabilities vs. (A)ODE probabilities.
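The core workflow described above, a Naive Bayes classifier evaluated with k-fold cross-validation, might be sketched roughly as follows. This is only an illustration under stated assumptions, not a reference solution: it models each feature with a per-class Gaussian (one possible choice of feature distribution, picked here for illustration), assumes numeric features and at least k instances, and uses hypothetical function names (fit_gaussian_nb, predict, k_fold_indices, cross_validate) that do not come from the assignment.

```python
import math
import random


def fit_gaussian_nb(X, y, var_floor=1e-9):
    """Estimate a log-prior and per-feature (mean, variance) for each class."""
    model = {}
    n = len(y)
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column)
            # Assumption: floor the variance so constant features do not
            # cause a division by zero.
            stats.append((mean, max(var, var_floor)))
        model[label] = (math.log(len(rows) / n), stats)
    return model


def predict(model, x):
    """Return the class with the largest log-posterior (up to a constant)."""
    best_label, best_score = None, -math.inf
    for label, (log_prior, stats) in model.items():
        score = log_prior
        for value, (mean, var) in zip(x, stats):
            # Log of the Gaussian density; features are treated as
            # conditionally independent given the class (the "naive" part).
            score += -0.5 * math.log(2 * math.pi * var)
            score -= (value - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(X, y, k=10, seed=0):
    """Return the test error rate of each of the k folds (assumes n >= k)."""
    folds = k_fold_indices(len(y), k, seed)
    errors = []
    for i, test_idx in enumerate(folds):
        # Train on every fold except the i-th, then test on the i-th.
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        model = fit_gaussian_nb([X[j] for j in train_idx],
                                [y[j] for j in train_idx])
        wrong = sum(predict(model, X[j]) != y[j] for j in test_idx)
        errors.append(wrong / len(test_idx))
    return errors
```

Calling cross_validate(X, y) with X as a list of numeric feature vectors and y as the matching list of labels returns ten per-fold error rates, which can then be averaged. Working in log space avoids numerical underflow when many per-feature likelihoods are multiplied together.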