Sparse Discriminant Analysis

We consider the problem of performing interpretable classification in the high-dimensional setting, in which the number of features is very large and the number of observations is limited. This setting has been studied extensively in the chemometrics literature, and more recently has become commonplace in biological and medical applications. In this setting, a traditional approach involves performing feature selection before classification. We propose sparse discriminant analysis, a method for performing linear discriminant analysis with a sparseness criterion imposed such that classification and feature selection are performed simultaneously. Sparse discriminant analysis is based on the optimal scoring interpretation of linear discriminant analysis, and can be extended to perform sparse discrimination via mixtures of Gaussians if boundaries between classes are nonlinear or if subgroups are present within each class. Our proposal also provides low-dimensional views of the discriminative directions.
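
As an illustration of how a sparseness criterion can enter the optimal scoring formulation, one possible form of the criterion for the k-th discriminant direction is sketched below. The notation is an assumption made for this sketch and does not appear in the abstract: X is the n x p centered data matrix, Y is the n x K matrix of class indicators, theta_k is a K-vector of class scores, beta_k is the p-vector of discriminant coefficients, Omega is a positive definite penalty matrix, and lambda and gamma are nonnegative tuning parameters. The specific choice of an l1 penalty combined with a quadratic penalty is likewise assumed here for illustration.

```latex
% Sketch of a penalized optimal scoring criterion (notation assumed, see lead-in).
\begin{aligned}
\min_{\beta_k,\,\theta_k}\quad
  & \lVert Y\theta_k - X\beta_k \rVert_2^2
    + \gamma\,\beta_k^{\top}\Omega\,\beta_k
    + \lambda\,\lVert \beta_k \rVert_1 \\
\text{subject to}\quad
  & \tfrac{1}{n}\,\theta_k^{\top} Y^{\top} Y\,\theta_k = 1,
  \qquad
  \theta_k^{\top} Y^{\top} Y\,\theta_l = 0 \quad \text{for all } l < k .
\end{aligned}
```

With lambda = 0 and gamma = 0 this reduces to the unpenalized optimal scoring problem, whose solutions are equivalent to the linear discriminant analysis directions when those are well defined. A positive lambda sets some entries of beta_k exactly to zero, which is how feature selection and classification can happen in a single fit.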
