Correlation-Based and Causal Feature Selection Analysis for Ensemble Classifiers

High-dimensional feature spaces with relatively few samples usually lead to poor classifier performance in machine learning, neural network, and data mining systems. This paper presents a comparative analysis of correlation-based and causal feature selection for ensemble classifiers. MLP and SVM are used as base classifiers and compared with Naive Bayes and Decision Tree. According to the results, the correlation-based feature selection algorithm eliminates more redundant and irrelevant features, and provides slightly better accuracy and lower complexity than causal feature selection. Ensembles built with the Bagging algorithm improve accuracy under both correlation-based and causal feature selection.
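The correlation-based approach discussed here is typically Hall's CFS, which scores a feature subset S of size k by the merit Merit(S) = k·r̄_cf / sqrt(k + k(k−1)·r̄_ff), where r̄_cf is the mean feature–class correlation and r̄_ff the mean feature–feature correlation. A minimal NumPy sketch of greedy forward selection with this merit follows; the function names and synthetic setup are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def cfs_merit(X, y, subset):
    """Hall's CFS merit: k * r_cf / sqrt(k + k(k-1) * r_ff)."""
    k = len(subset)
    # mean absolute feature-class correlation over the subset
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    if k == 1:
        return r_cf
    # mean absolute pairwise feature-feature correlation
    r_ff = np.mean([abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
                    for i, a in enumerate(subset) for b in subset[i + 1:]])
    return (k * r_cf) / np.sqrt(k + k * (k - 1) * r_ff)

def cfs_forward(X, y, max_features=5):
    """Greedy forward search: add the feature that most improves the merit."""
    remaining = list(range(X.shape[1]))
    selected, best = [], -np.inf
    while remaining and len(selected) < max_features:
        score, j = max((cfs_merit(X, y, selected + [j]), j) for j in remaining)
        if score <= best:          # stop when the merit no longer improves
            break
        best = score
        selected.append(j)
        remaining.remove(j)
    return selected
```

Because the merit penalizes inter-feature correlation, the search favors small subsets of features that are individually predictive but mutually uncorrelated, which matches the paper's observation that CFS removes more redundant features.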
