Attribute weighting: How and when does it work for Bayesian Network Classification

A Bayesian Network (BN) is a graphical model that represents conditional dependencies among random variables, such as diseases and symptoms. A Bayesian Network Classifier (BNC) uses a BN to characterize the relationships between the attributes and the class label; the simplest such model, Naive Bayes (NB), assumes that all attributes are conditionally independent of one another given the class label. One major approach to mitigating NB's primary weakness, the conditional independence assumption, is attribute weighting, which has proved effective for the simple NB structure. However, for BNCs with more complex structures, in which attribute weighting is embedded into the model, there is no existing study of whether weighting works at all, or of how much it affects learning on a given task. In this paper, we first survey several complex-structure BNC models and then carry out experimental studies to investigate the effectiveness of attribute weighting strategies for complex BNCs, focusing on Hidden Naive Bayes (HNB) and Averaged One-Dependence Estimators (AODE). Our studies use classification accuracy (ACC), area under the ROC curve (AUC), and conditional log likelihood (CLL) as performance metrics. Experiments and comparisons on 36 benchmark data sets demonstrate that attribute weighting only slightly outperforms the unweighted complex BNCs with respect to ACC and AUC, but yields significant improvement in terms of CLL.
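
To make the weighting idea concrete, the following is a minimal Python sketch of an attribute-weighted Naive Bayes scorer, assuming discretized attributes encoded as integers. The function names, the Laplace smoothing, and the way weights enter the model (as exponents on the conditional probabilities, i.e. multipliers of the log-likelihood terms) are illustrative assumptions for this sketch, not the specific estimators or weight-learning methods evaluated in the paper. Setting every weight to 1 recovers standard NB, and the conditional log likelihood (CLL) routine corresponds to the third evaluation metric mentioned above.

    import numpy as np

    def train_wnb(X, y, n_values, n_classes, alpha=1.0):
        """Estimate Laplace-smoothed P(c) and P(x_i | c) from discrete data.

        X: (n, d) integer-coded attribute matrix, y: (n,) integer class labels,
        n_values[i]: number of distinct values of attribute i.
        """
        n, d = X.shape
        class_counts = np.bincount(y, minlength=n_classes).astype(float)
        priors = (class_counts + alpha) / (n + alpha * n_classes)
        cond = []  # cond[i][c, v] = P(X_i = v | C = c)
        for i in range(d):
            counts = np.zeros((n_classes, n_values[i]))
            for xi, c in zip(X[:, i], y):
                counts[c, xi] += 1
            cond.append((counts + alpha) /
                        (class_counts[:, None] + alpha * n_values[i]))
        return priors, cond

    def predict_log_proba_wnb(x, priors, cond, weights):
        """Weighted NB: log P(c | x) proportional to
        log P(c) + sum_i w_i * log P(x_i | c)."""
        log_post = np.log(priors).copy()
        for i, (xi, w) in enumerate(zip(x, weights)):
            log_post += w * np.log(cond[i][:, xi])
        log_post -= np.logaddexp.reduce(log_post)  # normalise over classes
        return log_post

    def conditional_log_likelihood(X, y, priors, cond, weights):
        """CLL: sum over test instances of log P(true class | x)."""
        return sum(predict_log_proba_wnb(x, priors, cond, weights)[c]
                   for x, c in zip(X, y))

In practice the weights would be supplied by a separate learner, for example gain-ratio or correlation-based filter scores; the complex models studied in this paper (HNB, AODE) embed analogous weights into richer one-dependence structures rather than into the plain NB factorization shown here.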
