On selecting interacting features from high-dimensional data

For high-dimensional data, most feature-selection methods, such as sure independence screening (SIS) and the lasso, rank and select features individually. These methods are computationally inexpensive, but they ignore interactions among features. A simple recursive approach is investigated that, at little additional computational cost, also permits identification of interactions. The approach can lead to substantial improvements in classifier performance, can provide insight into the way in which features work together in a given population, and enjoys attractive statistical properties.
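
As an illustration only, the sketch below shows one way such a recursive screening might be organized: features are first ranked marginally, and candidates are then re-ranked jointly with the features already selected, with pairwise products included so that interactions can contribute to the score. This is not the paper's algorithm; the function names (`marginal_scores`, `joint_score`, `recursive_screen`), the least-squares scoring rule, and the simulated data are all illustrative assumptions.

```python
# Hedged sketch of recursive interaction screening (illustrative, not the
# authors' method): marginal ranking followed by greedy joint re-ranking.
import numpy as np


def marginal_scores(X, y):
    """Absolute correlation of each feature with the response."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc) + 1e-12
    return np.abs(Xc.T @ yc) / denom


def joint_score(X, y, idx):
    """Score a candidate subset by an in-sample least-squares fit that
    includes all pairwise products, so interactions can contribute."""
    cols = [X[:, j] for j in idx]
    cols += [X[:, a] * X[:, b] for i, a in enumerate(idx) for b in idx[i + 1:]]
    Z = np.column_stack([np.ones(len(y))] + cols)
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return -np.sum((y - Z @ beta) ** 2)


def recursive_screen(X, y, n_select=4, n_candidates=50):
    """Greedy recursion: at each step keep the candidate that, together with
    the features already chosen, gives the best joint score."""
    order = np.argsort(marginal_scores(X, y))[::-1]
    candidates = list(order[:n_candidates])
    selected = []
    while candidates and len(selected) < n_select:
        best = max(candidates, key=lambda j: joint_score(X, y, selected + [j]))
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 200, 500
    X = rng.standard_normal((n, p))
    # The class label depends on features 3 and 7 both marginally and
    # through their product, so the pair rewards being selected together.
    signal = X[:, 3] + X[:, 7] + 2.0 * X[:, 3] * X[:, 7]
    y = (signal + rng.standard_normal(n) > 0).astype(float)
    print(recursive_screen(X, y))
```

Because the joint score is recomputed conditional on the current selection, a feature with a weak marginal signal can still be retained once its interacting partner has entered the set, which is the behaviour that purely individual ranking cannot reproduce.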
