The naive Bayes classifier greatly simplifies learning by assuming that features are independent given the class. Although independence is generally a poor assumption, in practice naive Bayes often competes well with more sophisticated classifiers. Our broad goal is to understand the data characteristics which affect the performance of naive Bayes. Our approach uses Monte Carlo simulations that allow a systematic study of classification accuracy for several classes of randomly generated problems. We analyze the impact of the distribution entropy on the classification error, showing that low-entropy feature distributions yield good performance of naive Bayes. We also demonstrate that naive Bayes works well for certain nearly functional feature dependencies, thus reaching its best performance in two opposite cases: completely independent features (as expected) and functionally dependent features (which is surprising). Another surprising result is that the accuracy of naive Bayes is not directly correlated with the degree of feature dependencies measured as the class-conditional mutual information between the features. Instead, a better predictor of naive Bayes accuracy is the amount of information about the class that is lost because of the independence assumption.
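The two extreme cases above can be illustrated with a minimal sketch. The following is hypothetical illustration code, not the paper's actual simulation setup: a categorical naive Bayes classifier evaluated on randomly generated binary problems where the second feature is either class-conditionally independent of the first or an exact (deterministic) copy of it. All function names and the 80% feature-class agreement rate are assumptions chosen for the example.

```python
import math
import random
from collections import defaultdict

def train_nb(data):
    """data: list of ((x1, x2, ...), label) pairs with discrete features."""
    priors = defaultdict(int)
    cond = defaultdict(lambda: defaultdict(int))  # (label, feat_idx) -> value -> count
    for x, y in data:
        priors[y] += 1
        for i, v in enumerate(x):
            cond[(y, i)][v] += 1
    return priors, cond

def predict_nb(priors, cond, x):
    total = sum(priors.values())
    best, best_score = None, float("-inf")
    for y, c in priors.items():
        # log P(y) + sum_i log P(x_i | y), with add-one smoothing (binary features)
        score = math.log(c / total)
        for i, v in enumerate(x):
            score += math.log((cond[(y, i)].get(v, 0) + 1) / (c + 2))
        if score > best_score:
            best, best_score = y, score
    return best

def make_problem(n, dependent, rng):
    """Binary class; x1 agrees with the class 80% of the time.
    If dependent, x2 is an exact copy of x1 (functional dependence);
    otherwise x2 is an independent noisy observation of the class."""
    data = []
    for _ in range(n):
        y = rng.randrange(2)
        x1 = y if rng.random() < 0.8 else 1 - y
        x2 = x1 if dependent else (y if rng.random() < 0.8 else 1 - y)
        data.append(((x1, x2), y))
    return data

def accuracy(dependent, seed=0):
    rng = random.Random(seed)
    train = make_problem(2000, dependent, rng)
    test = make_problem(2000, dependent, rng)
    priors, cond = train_nb(train)
    hits = sum(predict_nb(priors, cond, x) == y for x, y in test)
    return hits / len(test)
```

In this toy setting naive Bayes performs comparably in both regimes: with a functionally dependent second feature, the double-counted evidence shifts the scores but rarely flips the decision, consistent with the observation that functional dependence need not hurt accuracy.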