Missing values prediction with K2

Dealing with missing values is an important task in data mining. There are many ways to handle such data, but the literature does not identify a single best approach for all kinds of data sets. The aim of this work is to show the application of a Bayesian algorithm (K2) to data mining problems as a data preparation and classification tool. In this paper, the algorithm generates a Bayesian network that is used to substitute the missing values: for each object in the database, the most probable value of each missing feature is predicted. The prediction uses a heuristic Bayesian conditioning algorithm, producing a preprocessed sample on which classification is then performed. The classification results with and without this data preparation are compared.
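The imputation idea described above (fill each missing feature with its most probable value, conditioned on observed features via a learned network) can be illustrated with a minimal sketch. This is not the paper's K2 procedure: it assumes a single, fixed parent attribute per column instead of a K2-learned structure, and the function name `impute_most_probable` is hypothetical.

```python
from collections import Counter, defaultdict

def impute_most_probable(rows, target_col):
    """Fill missing values (None) in target_col with the most probable
    value given one observed parent column, estimated from rows where
    the target is observed. Single-parent sketch, not the full K2 method."""
    parent_col = 0 if target_col != 0 else 1  # assumed fixed parent column
    # Estimate P(target | parent) by counting co-occurrences.
    joint = defaultdict(Counter)
    for r in rows:
        if r[target_col] is not None:
            joint[r[parent_col]][r[target_col]] += 1
    filled = []
    for r in rows:
        r = list(r)
        if r[target_col] is None:
            counts = joint.get(r[parent_col])
            if counts:  # pick the most probable value given the parent's state
                r[target_col] = counts.most_common(1)[0][0]
        filled.append(r)
    return filled
```

A usage example: given rows `[("sunny", "hot"), ("sunny", "hot"), ("sunny", None), ("rain", "cold")]`, imputing column 1 fills the missing entry with `"hot"`, the most frequent value co-occurring with `"sunny"`.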
