An Improved Learning Algorithm for Augmented Naive Bayes

Data mining applications require learning algorithms to have high predictive accuracy, scale up to large datasets, and produce comprehensible outcomes. The Naive Bayes classifier has received extensive attention due to its efficiency, reasonable predictive accuracy, and simplicity. However, the Naive Bayes assumption that attributes are independent given the class is often violated, producing incorrect probability estimates that can affect the success of data mining applications. We extend the Naive Bayes classifier to allow certain dependency relations among attributes. Compared with previous extensions of Naive Bayes, our algorithm is more efficient (especially on problems with a large number of attributes) and produces simpler dependency relations for better comprehensibility, while maintaining very similar predictive accuracy.