The mRMR variable selection method: a comparative study for functional data

The use of variable selection methods is particularly appealing in statistical problems with functional data. The obvious general criterion for variable selection is to choose the ‘most representative’ or ‘most relevant’ variables. However, it is also clear that a purely relevance-oriented criterion could lead to the selection of many redundant variables. The minimum Redundancy Maximum Relevance (mRMR) procedure, proposed by Ding and Peng [Minimum redundancy feature selection from microarray gene expression data. J Bioinform Comput Biol. 2005;3:185–205] and Peng et al. [Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans Pattern Anal Mach Intell. 2005;27:1226–1238], is an algorithm for systematically performing variable selection that achieves a reasonable trade-off between relevance and redundancy. In its original form, this procedure is based on the so-called mutual information criterion to assess both relevance and redundancy. Keeping the focus on functional data problems, we propose here a modified version of the mRMR method, obtained by replacing the mutual information with the association measure known as distance correlation, suggested by Székely et al. [Measuring and testing dependence by correlation of distances. Ann Statist. 2007;35:2769–2794]. We have also performed an extensive simulation study, including 1600 functional experiments (100 functional models × 4 sample sizes × 4 classifiers), and three real-data examples aimed at comparing the different versions of the mRMR methodology. The results are quite conclusive in favour of the newly proposed alternative.
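
To make the selection rule concrete, here is a minimal sketch of the distance-correlation variant of mRMR in the simplest setting, where each functional observation is discretized on a common grid and every grid point is a candidate variable. The names dist_corr and mrmr_select are illustrative, not taken from the paper, and the greedy ‘relevance minus mean redundancy’ score shown is the standard difference form of mRMR; this is a sketch under those assumptions, not the authors' implementation.

```python
import numpy as np

def dist_corr(x, y):
    """Empirical distance correlation (Székely et al., 2007) between two
    univariate samples x and y of equal length n."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    y = np.asarray(y, dtype=float).reshape(-1, 1)

    def doubly_centered(z):
        # Pairwise distance matrix, then double centering.
        d = np.abs(z - z.T)
        return d - d.mean(axis=0) - d.mean(axis=1)[:, None] + d.mean()

    A, B = doubly_centered(x), doubly_centered(y)
    dcov2_xy = (A * B).mean()  # squared sample distance covariance
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    return np.sqrt(dcov2_xy / denom) if denom > 0 else 0.0

def mrmr_select(X, y, k):
    """Greedy mRMR: X has shape (n_samples, n_gridpoints); y holds class
    labels coded numerically (e.g. 0/1). At each step, pick the unselected
    variable maximizing relevance(X_j, y) - mean redundancy(X_j, selected)."""
    p = X.shape[1]
    relevance = np.array([dist_corr(X[:, j], y) for j in range(p)])
    selected = [int(np.argmax(relevance))]  # start from the most relevant
    while len(selected) < k:
        best_j, best_score = None, -np.inf
        for j in range(p):
            if j in selected:
                continue
            redundancy = np.mean([dist_corr(X[:, j], X[:, s]) for s in selected])
            score = relevance[j] - redundancy
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected
```

Given curves sampled in a matrix X and numeric labels y, mrmr_select(X, y, 10) would return the indices of ten grid points; the same skeleton accommodates the quotient form of mRMR (relevance divided by redundancy), or mutual information in place of dist_corr, by changing a single line.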

[1] Chong-Ho Choi, et al. Input Feature Selection by Mutual Information Based on Parzen Window. IEEE Trans. Pattern Anal. Mach. Intell., 2002.

[2] Chih-Jen Lin, et al. LIBLINEAR: A Library for Large Linear Classification. J. Mach. Learn. Res., 2008.

[3] Masoud Nikravesh, et al. Feature Extraction: Foundations and Applications. 2006.

[4] A. Cuevas, et al. A comparative study of several smoothing methods in density estimation. 1994.

[5] Juan Antonio Cuesta-Albertos, et al. Supervised Classification for a Family of Gaussian Functional Models. 2010. arXiv:1004.5031.

[6] Matthew P. Wand, et al. Kernel Smoothing. 1995.

[7] Maria L. Rizzo, et al. Brownian distance covariance. 2009. arXiv:1010.0297.

[8] Fuhui Long, et al. Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005.

[9] David J. Hand, et al. Classifier Technology and the Illusion of Progress. 2006. arXiv:math/0606441.

[10] Maria L. Rizzo, et al. Energy statistics: A class of statistics based on distances. 2013.

[11] Martin A. Lindquist, et al. Logistic Regression With Brownian-Like Predictors. 2009.

[12] J. Mesirov, et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science, 1999.

[13] Maria L. Rizzo, et al. On the uniqueness of distance covariance. 2012.

[14] Anirban Mukhopadhyay, et al. A novel PSO-based graph-theoretic approach for identifying most relevant and non-redundant gene markers from gene expression data. Int. J. Parallel Emergent Distributed Syst., 2015.

[15] Z. Q. John Lu, et al. Nonparametric Functional Data Analysis: Theory and Practice. Technometrics, 2007.

[16] Jacek M. Zurada, et al. Normalized Mutual Information Feature Selection. IEEE Transactions on Neural Networks, 2009.

[17] Ashutosh Kumar Singh, et al. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2010.

[18] Huan Liu, et al. Efficient Feature Selection via Analysis of Relevance and Redundancy. J. Mach. Learn. Res., 2004.

[19] Maria L. Rizzo, et al. Measuring and testing dependence by correlation of distances. Ann. Statist., 2007. arXiv:0803.4101.

[20] Antonio Cuevas, et al. Variable selection in functional data classification: a maxima-hunting proposal. 2013. arXiv:1309.6697.

[21] James Bailey, et al. Effective global approaches for mutual information based feature selection. KDD, 2014.

[22] P. Hall, et al. Determining and Depicting Relationships Among Components in High-Dimensional Variable Selection. 2011.

[23] Michael Mitzenmacher, et al. Detecting Novel Associations in Large Data Sets. Science, 2011.

[24] Hans-Georg Müller, et al. Functional Data Analysis. 2016.

[25] Olga V. Demler, et al. Impact of correlation on predictive ability of biomarkers. Statistics in Medicine, 2013.

[26] Roberto Battiti, et al. Using mutual information for selecting features in supervised neural net learning. IEEE Trans. Neural Networks, 1994.

[27] Chris H. Q. Ding, et al. Minimum redundancy feature selection from microarray gene expression data. Proceedings of the 2003 IEEE Bioinformatics Conference (CSB2003), 2003.