Sparse Covariance Matrix Estimation by DCA-Based Algorithms

This letter proposes a novel approach that uses ℓ0-norm regularization for the sparse covariance matrix estimation (SCME) problem. The objective function of the SCME problem is composed of a nonconvex part and the ℓ0 term, which is discontinuous and difficult to handle. Appropriate DC (difference of convex functions) approximations of the ℓ0-norm are used; the resulting approximate SCME problems are still nonconvex. DC programming and DCA (DC algorithm), powerful tools in the nonconvex programming framework, are investigated for solving them. Two DC formulations are proposed, and the corresponding DCA schemes are developed. Two applications of the SCME problem are considered: classification via sparse quadratic discriminant analysis and portfolio optimization. Careful empirical experiments on simulated and real data sets are performed to study the performance of the proposed algorithms. Numerical results show their efficiency and their superiority over seven state-of-the-art methods.
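To make the DC idea concrete, the following is a minimal sketch of one DCA scheme on a simplified problem; it is not the formulation developed in the letter. It assumes a least-squares data-fit term in place of the nonconvex part of the SCME objective, and it assumes the capped-ℓ1 function min(1, θ|t|) as the DC approximation of the ℓ0 term, written as θ|t| minus max(θ|t| − 1, 0). The function names and the parameters `lam`, `theta`, `n_iter`, and `tol` are hypothetical choices made for illustration.

```python
import numpy as np


def soft_threshold(x, tau):
    """Elementwise soft-thresholding, the proximal map of tau * |x|."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def dca_capped_l1_cov(S, lam=0.1, theta=5.0, n_iter=50, tol=1e-8):
    """DCA sketch for an illustrative surrogate (an assumption, not the letter's model):

        minimize 0.5 * ||Sigma - S||_F^2 + lam * sum_{i != j} min(1, theta * |Sigma_ij|)

    DC decomposition of the objective as g - h, with both parts convex:
        g(Sigma) = 0.5 * ||Sigma - S||_F^2 + lam * theta * sum_{i != j} |Sigma_ij|
        h(Sigma) = lam * sum_{i != j} max(theta * |Sigma_ij| - 1, 0)
    """
    S = np.asarray(S, dtype=float)
    off = ~np.eye(S.shape[0], dtype=bool)  # mask of off-diagonal entries
    sigma = S.copy()
    for _ in range(n_iter):
        # Step 1: pick a subgradient y of h at the current iterate.
        active = off & (theta * np.abs(sigma) > 1.0)
        y = np.where(active, lam * theta * np.sign(sigma), 0.0)
        # Step 2: minimize the convex function g(Sigma) - <y, Sigma>.
        # Off the diagonal this is a soft-threshold of S + y; the diagonal
        # is unpenalized, so it stays equal to the diagonal of S.
        new = S.copy()
        new[off] = soft_threshold(S[off] + y[off], lam * theta)
        converged = np.max(np.abs(new - sigma)) < tol
        sigma = new
        if converged:
            break
    return sigma
```

Each iteration linearizes the concave part (minus h) at the current iterate and solves the remaining convex subproblem in closed form, which is the general pattern of DCA. Calling `dca_capped_l1_cov(S)` on a sample covariance matrix `S` returns a symmetric estimate with small off-diagonal entries set to zero. The sketch does not enforce positive definiteness, which a practical covariance estimator would need to address.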
