MEG Mind Reading: Strategies for Feature Selection

The regularized logistic regression classifier has performed well in problems where feature selection is critical, including our recent winning submissions to the ICANN2011 MEG mind reading challenge [Huttunen et al. 2011; Huttunen et al. 2012] and to the DREAM 6 AML classification challenge [Manninen et al. 2011]. The benefit of the method is its embedded feature selection step, which automatically selects a good subset of input features, thereby simplifying the classifier and improving generalization. However, explicit wrapper methods, such as forward and backward feature selection, are also widely used in similar problems. In this paper, we compare the efficiency of the elastic net regularized logistic regression classifier with that of the support vector machine classifier combined with various sequential feature selection methods.
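To illustrate what "embedded feature selection" means here, the following is a minimal sketch in plain Python, not the implementation used in the paper. It fits an elastic net regularized logistic regression by proximal gradient descent (a technique swapped in for brevity; the cited glmnet software uses coordinate descent) on synthetic data in which only the first two of ten features carry signal. The function names, the data, and the values of `lam`, `alpha`, `lr`, and `n_iter` are all assumptions chosen for illustration.

```python
import math
import random


def soft_threshold(value, threshold):
    """Proximal operator of threshold * |.|: shrink value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net_logreg(X, y, lam=0.1, alpha=0.9, lr=0.5, n_iter=300):
    """Minimize mean log-loss + lam * (alpha * ||w||_1 + (1 - alpha) / 2 * ||w||_2^2).

    Uses proximal gradient descent; the intercept b is not penalized.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            z = max(-30.0, min(30.0, z))  # guard against overflow in exp
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b
        for j in range(p):
            # Gradient step on the smooth part (log-loss + ridge term),
            # then soft-thresholding to handle the L1 term.
            step = w[j] - lr * (grad_w[j] + lam * (1.0 - alpha) * w[j])
            w[j] = soft_threshold(step, lr * lam * alpha)
    return w, b


# Synthetic data (an assumption): only features 0 and 1 influence the label.
random.seed(0)
n_samples, n_features = 200, 10
X = [[random.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
y = [1 if 2 * row[0] - 2 * row[1] + random.gauss(0.0, 0.5) > 0 else 0 for row in X]

w, b = fit_elastic_net_logreg(X, y)

# Features whose weight was not driven to exactly zero count as "selected".
selected = [j for j, wj in enumerate(w) if wj != 0.0]
print("selected features:", selected)
```

The L1 part of the penalty is what sets some weights to exactly zero, so the subset of features is chosen during training itself. A wrapper method would instead retrain a separate classifier, such as a support vector machine, on each candidate subset.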

[1] Trevor Hastie, et al. Regularization Paths for Generalized Linear Models via Coordinate Descent, 2010, Journal of Statistical Software.

[2] Zne-Jung Lee, et al. Parameter determination of support vector machine and feature selection using simulated annealing approach, 2008, Appl. Soft Comput.

[3] Mikko Sams, et al. Face Prediction from fMRI Data during Movie Stimulus: Strategies for Feature Selection, 2011, ICANN.

[4] Pedro Larrañaga, et al. A review of feature selection techniques in bioinformatics, 2007, Bioinform.

[5] H. Zou, et al. Regularization and variable selection via the elastic net, 2005.

[6] Tong Zhang, et al. Adaptive Forward-Backward Greedy Algorithm for Learning Sparse Representations, 2011, IEEE Transactions on Information Theory.

[7] Thomas G. Dietterich. Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms, 1998, Neural Computation.

[8] Geoffrey J McLachlan, et al. Selection bias in gene extraction on the basis of microarray gene-expression data, 2002, Proceedings of the National Academy of Sciences of the United States of America.

[9] Josef Kittler, et al. Floating search methods in feature selection, 1994, Pattern Recognit. Lett.

[10] Stephen José Hanson, et al. Decoding the Large-Scale Structure of Brain Function by Classifying Mental States Across Individuals, 2009, Psychological Science.

[11] Huan Liu, et al. Toward integrating feature selection algorithms for classification and clustering, 2005, IEEE Transactions on Knowledge and Data Engineering.

[12] Justin C. W. Debuse, et al. Feature Subset Selection within a Simulated Annealing Data Mining Algorithm, 1997, Journal of Intelligent Information Systems.

[13] Klaus Nordhausen, et al. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition by Trevor Hastie, Robert Tibshirani, Jerome Friedman, 2009.

[14] Heikki Huttunen, et al. Mind reading with regularized multinomial logistic regression, 2012, Machine Vision and Applications.

[15] Martin Hilbert, et al. The World’s Technological Capacity to Store, Communicate, and Compute Information, 2011, Science.

[16] J. Anderson, et al. Penalized maximum likelihood estimation in logistic regression and discrimination, 1982.

[17] Samuel Kaski, et al. ICANN/PASCAL2 Challenge: MEG Mind Reading — Overview and Results, 2012.

[18] Tom M. Mitchell, et al. Machine learning classifiers and fMRI: A tutorial overview, 2009, NeuroImage.

[19] Jack Sklansky, et al. A note on genetic algorithms for large-scale feature selection, 1989, Pattern Recognit. Lett.

[20] A. Atiya, et al. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, 2005, IEEE Transactions on Neural Networks.

[21] Chih-Jen Lin, et al. LIBSVM: A library for support vector machines, 2011, TIST.

[22] D. Ruppert. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2004.

[23] Mark W. Woolrich, et al. Advances in functional and structural MR image analysis and implementation as FSL, 2004, NeuroImage.

[24] Masa-aki Sato, et al. Sparse estimation automatically selects voxels relevant for the decoding of fMRI activity patterns, 2008, NeuroImage.

[25] Stefan Haufe, et al. The Berlin Brain–Computer Interface: Non-Medical Uses of BCI Technology, 2010, Front. Neurosci.

[26] Lawrence Carin, et al. Sparse multinomial logistic regression: fast algorithms and generalization bounds, 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[27] Keinosuke Fukunaga, et al. A Branch and Bound Algorithm for Feature Subset Selection, 1977, IEEE Transactions on Computers.

[28] D. Rueckert, et al. Multi-Method Analysis of MRI Images in Early Diagnostics of Alzheimer's Disease, 2011, PLoS ONE.