Description
In this project you will learn the basics of naive Bayes classification and investigate how it is applied to spam filtering. One feature of spam filtering is that classification results improve as the user trains the filter. During the initial phase, however, traditional filters often misclassify messages because they have seen too little training data.
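As a rough illustration of the kind of classifier meant here, the following is a minimal sketch of a word-count naive Bayes spam filter. It is not the algorithm of any particular real filter: the function names `train` and `spam_probability`, the whitespace tokenisation, the add-`alpha` smoothing, and the toy messages are all assumptions made for this example.

```python
import math
from collections import Counter


def train(messages):
    """Count words per class from (text, label) pairs; label is 'spam' or 'ham'."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter()
    for text, label in messages:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, class_counts


def spam_probability(text, word_counts, class_counts, alpha=1.0):
    """Posterior probability of 'spam' under naive Bayes with add-alpha smoothing."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total_docs = sum(class_counts.values())
    log_score = {}
    for label in ("spam", "ham"):
        # Smoothed class prior, so an untrained filter starts at 1/2.
        log_p = math.log((class_counts[label] + alpha) / (total_docs + 2 * alpha))
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # words never seen in training carry no evidence
            # "Naive" step: words are treated as independent given the class.
            log_p += math.log(
                (word_counts[label][word] + alpha)
                / (total_words + alpha * len(vocab))
            )
        log_score[label] = log_p
    # Normalise in log space to avoid underflow on long messages.
    top = max(log_score.values())
    weights = {label: math.exp(s - top) for label, s in log_score.items()}
    return weights["spam"] / (weights["spam"] + weights["ham"])


# Toy training set (made up for illustration).
counts = train([("win money now", "spam"), ("meeting at noon", "ham")])
p_spam = spam_probability("win money", *counts)
p_ham = spam_probability("meeting noon", *counts)
p_untrained = spam_probability("win money", *train([]))
```

With only two training messages, `p_spam` rests on a handful of word counts, and the untrained filter simply returns one half. This is the initial phase in which a single point estimate can be misleading.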
You will study techniques from Bayesian decision making that use a set of probability distributions rather than a single one, allowing the filter to return an "unsure" result when training data is lacking. We may also study various statistical methods used in real spam filters. The original algorithm, based on a naive Bayes classifier, has been improved in many ways and can probably be improved further. Other applications of Bayesian classification and decision making could also be a subject of study.

Prerequisites
Statistical Concepts II.
Either Decision Theory III or Bayesian Statistics III/IV is recommended, although not strictly required.

Resources