New prior-knowledge-based extensions for stable feature selection

In many data sets there are only hundreds of samples, or fewer, but thousands of features. This small number of samples relative to the dimensionality leads to modest classification performance and to instability in feature selection. To cope with the curse of dimensionality, we investigate the effect of integrating background knowledge about dimensions known to be more relevant, as a means of guiding the feature selection process. We propose extensions of three feature selection techniques, two filters and one wrapper, that incorporate prior knowledge into the search for the best features. We study the effect of these extensions on classification performance and on the stability of feature selection, and we experimentally compare the proposed approaches with their original versions, which do not integrate prior knowledge, on three high-dimensional datasets. The results show that the proposed techniques outperform their original counterparts in feature selection stability and, in most cases, in classification performance as well.
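The abstract gives no implementation details, but the two ingredients it combines can be sketched. Assuming prior knowledge arrives as a per-feature relevance weight in [0, 1], one minimal way to bias a univariate filter is a convex combination of the filter score and the prior; the blending rule, the `alpha` parameter, the choice of a mutual-information filter, and the function names below are illustrative assumptions, not the authors' formulation. The second helper is Kuncheva's consistency index, a standard measure of feature selection stability across repeated runs.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def prior_weighted_scores(X, y, prior, alpha=0.5, random_state=0):
    """Blend a univariate filter score with prior relevance weights.

    Illustrative sketch only: the convex combination controlled by
    `alpha` is an assumed blending rule, not the paper's exact method.
    `prior` is a length-n_features array of weights in [0, 1].
    """
    mi = mutual_info_classif(X, y, random_state=random_state)
    if mi.max() > 0:
        mi = mi / mi.max()                     # rescale scores to [0, 1]
    return alpha * mi + (1.0 - alpha) * prior

def kuncheva_stability(subsets, n_features):
    """Average pairwise Kuncheva consistency index over equal-size
    feature subsets produced by repeated selection runs (k < n_features)."""
    subsets = [set(s) for s in subsets]
    k = len(subsets[0])
    expected = k * k / n_features              # overlap expected by chance
    scores = []
    for i in range(len(subsets)):
        for j in range(i + 1, len(subsets)):
            r = len(subsets[i] & subsets[j])   # observed overlap
            scores.append((r - expected) / (k - expected))
    return float(np.mean(scores))
```

With these helpers, one selection run reduces to taking the indices of the top-k blended scores; repeating that over bootstrap resamples of the data and passing the resulting subsets to `kuncheva_stability` reproduces the kind of stability evaluation the abstract describes.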
