Selecting One Dependency Estimators in Bayesian Network Using Different MDL Scores and Overfitting Criterion

The Averaged One Dependency Estimator (AODE) integrates all possible Super-Parent One-Dependency Estimators (SPODEs) and estimates class-conditional probabilities by averaging them. In an AODE network, redundant SPODEs may bias the classifier and, as a consequence, can substantially reduce classification accuracy. In this paper, MDL score metrics are used to select SPODEs, either as a whole or partially, which yields three different classifiers. Performance comparisons between these classifiers and AODE show that the theoretical analysis is reasonable and that the approach is efficient and effective. Mean Square Error (MSE) is used to test for overfitting. Experimental results indicate that the classifier using the MDL score metrics performs better than the original AODE and, at the same time, overfits less. At the end of the paper, further discussion and verification of some properties of overfitting are also presented through the experiments.
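
As a rough illustration of the averaging described above, the following Python sketch builds the frequency tables for discrete attributes, evaluates each SPODE, and averages over either all super-parents (plain AODE) or a chosen subset. It is not the paper's implementation. The function names (`fit_counts`, `spode_joint`, `predict_proba`, `mean_square_error`), the Laplace smoothing, the `parents` argument, and the exact MSE definition (squared error against one-hot labels, averaged over examples and classes) are all assumptions made for illustration. The MDL scoring itself is not shown; `parents` simply stands in for whichever subset such a selection step would return.

```python
from collections import Counter


def fit_counts(X, y):
    """Collect the frequency tables that every SPODE needs (discrete data)."""
    n_attrs = len(X[0])
    pair = Counter()    # (parent, parent_value, label) -> count
    triple = Counter()  # (parent, parent_value, child, child_value, label) -> count
    for row, label in zip(X, y):
        for p in range(n_attrs):
            pair[(p, row[p], label)] += 1
            for c in range(n_attrs):
                if c != p:
                    triple[(p, row[p], c, row[c], label)] += 1
    return {
        "pair": pair,
        "triple": triple,
        "values": [sorted({row[i] for row in X}) for i in range(n_attrs)],
        "classes": sorted(set(y)),
        "n": len(X),
    }


def spode_joint(model, parent, x, label):
    """Estimate P(label, x) with attribute `parent` as the super-parent.

    Laplace smoothing is an assumption of this sketch.
    """
    values = model["values"]
    count_py = model["pair"][(parent, x[parent], label)]
    # P(label, x_parent)
    prob = (count_py + 1) / (model["n"] + len(model["classes"]) * len(values[parent]))
    # Product of P(x_child | label, x_parent) over the remaining attributes
    for child, value in enumerate(x):
        if child != parent:
            count = model["triple"][(parent, x[parent], child, value, label)]
            prob *= (count + 1) / (count_py + len(values[child]))
    return prob


def predict_proba(model, x, parents=None):
    """Average the selected SPODEs; parents=None uses all of them (AODE)."""
    parents = list(range(len(x))) if parents is None else list(parents)
    scores = {
        label: sum(spode_joint(model, p, x, label) for p in parents) / len(parents)
        for label in model["classes"]
    }
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


def mean_square_error(model, X, y, parents=None):
    """Squared error against one-hot labels, averaged over examples and classes."""
    total = 0.0
    for row, label in zip(X, y):
        proba = predict_proba(model, row, parents)
        total += sum((proba[c] - (1.0 if c == label else 0.0)) ** 2 for c in proba)
    return total / (len(X) * len(model["classes"]))
```

Under these assumptions, comparing `mean_square_error` on training data and on held-out data for different `parents` subsets gives one simple way to look at overfitting; a large gap between the two suggests that the selected ensemble fits the training sample too closely.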
