Feature Subset Selection by Bayesian Networks Based Optimization

Abstract: A new method for Feature Subset Selection in Machine Learning, FSS-EBNA (Feature Subset Selection by Estimation of Bayesian Network Algorithm), is presented. FSS-EBNA is an evolutionary, population-based, randomized search algorithm, and it can be executed when domain knowledge is not available. A wrapper approach, over the Naive-Bayes and ID3 learning algorithms, is used to evaluate the goodness of each visited solution. FSS-EBNA, based on the EDA (Estimation of Distribution Algorithm) paradigm, avoids the use of crossover and mutation operators to evolve the populations, in contrast to Genetic Algorithms. In the absence of these operators, the evolution is guaranteed by the factorization of the probability distribution of the best solutions found in a generation of the search. Promising results are achieved in a variety of tasks where domain knowledge is not available. The paper explains the main ideas of Feature Subset Selection, Estimation of Distribution Algorithms and Bayesian networks, presenting related work on each concept. A study of the 'overfitting' problem in the Feature Subset Selection process is carried out, providing a basis for defining the stopping criteria of the new algorithm.
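The EDA loop summarized in the abstract (sample a population, keep the best solutions, re-estimate a probability model from them, and sample again, with no crossover or mutation) can be illustrated with a minimal sketch. This is not the FSS-EBNA implementation: for brevity it assumes independent per-feature probabilities (a univariate simplification) where FSS-EBNA learns a Bayesian network, and it uses a hypothetical `toy_score` function where FSS-EBNA would use a wrapper estimate of Naive-Bayes or ID3 accuracy. All names, parameter defaults, and the set of "relevant" features are assumptions made for illustration.

```python
import random


def toy_score(mask):
    """Hypothetical stand-in for a wrapper evaluation.

    Assumes features 0, 2 and 5 are relevant: each one included adds 10,
    and each irrelevant feature included subtracts 1.
    """
    relevant = {0, 2, 5}
    gain = sum(10 for i, bit in enumerate(mask) if bit and i in relevant)
    cost = sum(1 for i, bit in enumerate(mask) if bit and i not in relevant)
    return gain - cost


def eda_feature_selection(evaluate, n_features, pop_size=40, n_best=20,
                          generations=15, seed=0):
    """Univariate EDA over binary feature masks (simplified sketch)."""
    rng = random.Random(seed)
    # Start with every feature equally likely to be included or excluded.
    probs = [0.5] * n_features
    best_mask, best_score = None, float("-inf")

    for _ in range(generations):
        # Sample a new population from the current probability model.
        population = [[1 if rng.random() < p else 0 for p in probs]
                      for _ in range(pop_size)]
        # Evaluate and rank the candidate feature subsets.
        scored = sorted(((evaluate(m), m) for m in population),
                        key=lambda pair: pair[0], reverse=True)
        if scored[0][0] > best_score:
            best_score, best_mask = scored[0]
        # Keep the best individuals and re-estimate the model from them.
        selected = [m for _, m in scored[:n_best]]
        # Laplace smoothing keeps each probability strictly inside (0, 1).
        probs = [(sum(m[i] for m in selected) + 1) / (n_best + 2)
                 for i in range(n_features)]

    return best_mask, best_score
```

Calling `eda_feature_selection(toy_score, n_features=8)` returns the best mask seen and its score. Replacing the per-feature probabilities with a model that captures dependencies between features, as FSS-EBNA does with a Bayesian network, changes only the estimation and sampling steps of the loop.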
