An Improved IAMB Algorithm for Markov Blanket Discovery

Finding an efficient way to discover the Markov blanket of a variable is one of the core issues in data mining. This paper first discusses the problems in IAMB, a typical algorithm for discovering the Markov blanket of a target variable from training data. It then proposes an improved algorithm, λ-IAMB, whose improvements cover two aspects: code optimization and an improved strategy for conditional independence testing. Experimental results show that λ-IAMB performs better than IAMB, both in finding the Markov blankets of variables in typical Bayesian networks and as a feature selection method on several well-known real-world datasets.
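
For orientation, baseline IAMB is usually described as a growing phase, which greedily admits the variable most strongly associated with the target given the current blanket estimate, followed by a shrinking phase, which removes false positives. The sketch below follows that standard formulation only; it does not implement λ-IAMB, whose modifications are not detailed in this abstract. The names `iamb`, `assoc`, `threshold`, and `toy_assoc` are illustrative assumptions, and `assoc` stands in for whatever conditional independence statistic (for example, conditional mutual information) an actual implementation would estimate from data.

```python
from typing import Callable, FrozenSet, Hashable, Iterable, Set

# assoc(x, target, cond) returns a non-negative association strength between
# x and target given the conditioning set cond. This is a hypothetical
# interface: a real implementation would estimate the value from data.
Assoc = Callable[[Hashable, Hashable, FrozenSet], float]


def iamb(target: Hashable, variables: Iterable[Hashable],
         assoc: Assoc, threshold: float = 0.0) -> Set[Hashable]:
    """Sketch of baseline IAMB: growing phase, then shrinking phase."""
    variables = list(variables)
    mb: Set[Hashable] = set()

    # Growing phase: repeatedly admit the candidate with the strongest
    # association to the target given the current blanket estimate.
    while True:
        candidates = [v for v in variables if v != target and v not in mb]
        if not candidates:
            break
        cond = frozenset(mb)
        best = max(candidates, key=lambda v: assoc(v, target, cond))
        if assoc(best, target, cond) <= threshold:
            break  # every remaining candidate is judged independent
        mb.add(best)

    # Shrinking phase: drop any member that is judged independent of the
    # target given the rest of the current blanket estimate.
    for v in list(mb):
        if assoc(v, target, frozenset(mb - {v})) <= threshold:
            mb.discard(v)
    return mb


def toy_assoc(x, target, cond):
    """Hand-written oracle for the toy network B -> A -> T -> C <- D."""
    if x == "A":
        return 0.9                           # parent of T
    if x == "C":
        return 0.8                           # child of T
    if x == "B":
        return 0.0 if "A" in cond else 0.95  # blocked once A is known
    if x == "D":
        return 0.6 if "C" in cond else 0.0   # spouse, visible only given C
    return 0.0


blanket = iamb("T", ["A", "B", "C", "D", "T"], toy_assoc)
print(sorted(blanket))
```

In this toy oracle, B enters during the growing phase because its marginal association with T is high, and the shrinking phase removes it once A is in the conditioning set. Each call to `assoc` corresponds to one conditional independence test, which is the step the abstract names as a target for improvement.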
