Robust sparse estimation of multiresponse regression and inverse covariance matrix via the L2 distance

We propose a robust framework for jointly performing two key modeling tasks on high-dimensional data: (i) learning a sparse functional mapping from multiple predictors to multiple responses while taking advantage of the coupling among responses, and (ii) estimating the conditional dependency structure among the responses while adjusting for their predictors. Traditional likelihood-based estimators lack resilience to outliers and model misspecification, and this issue is exacerbated for noisy high-dimensional data. We propose instead to minimize a regularized distance criterion, motivated by the minimum distance functionals used in nonparametric methods for their excellent robustness properties. The proposed estimates can be obtained efficiently with a sequential quadratic programming algorithm. We provide theoretical justification for the proposed estimator, including estimation consistency. We also shed light on its robustness through a linearization, which yields a combination of weighted lasso and graphical lasso; the sample weights provide an intuitive explanation of the robustness. We demonstrate the merits of our framework through a simulation study and analyses of real financial and genetics data.
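To make the idea of a regularized distance criterion concrete, the following is a minimal sketch, not the authors' implementation. It assumes Gaussian errors with precision matrix Omega, uses the standard closed form of the integrated squared error (L2E) criterion for a Gaussian density, and adds L1 penalties on the coefficient matrix B and on the off-diagonal entries of Omega. The function names, the penalty parameters `lam_B` and `lam_Omega`, and the form of the sample weights are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np


def l2e_criterion(Y, X, B, Omega, lam_B=0.0, lam_Omega=0.0):
    """Penalized L2E criterion for Y = X B + E with E ~ N(0, Omega^{-1}).

    Assumed form (hypothetical, for illustration only):
        int f^2 - (2/n) * sum_i f(r_i) + lam_B*|B|_1 + lam_Omega*|Omega|_{1,off}
    where f is the N(0, Omega^{-1}) density and r_i is the i-th residual.
    """
    n, q = Y.shape
    R = Y - X @ B
    sign, logdet = np.linalg.slogdet(Omega)
    if sign <= 0:
        raise ValueError("Omega must be positive definite")
    sqrt_det = np.exp(0.5 * logdet)
    # Squared Mahalanobis distance r_i^T Omega r_i for every sample.
    maha = np.einsum("ij,jk,ik->i", R, Omega, R)
    # Gaussian density evaluated at each residual.
    dens = (2.0 * np.pi) ** (-q / 2.0) * sqrt_det * np.exp(-0.5 * maha)
    # Closed form of the integral of the squared Gaussian density.
    integral_sq = (4.0 * np.pi) ** (-q / 2.0) * sqrt_det
    fit = integral_sq - 2.0 * dens.mean()
    off_diag = np.abs(Omega).sum() - np.abs(np.diag(Omega)).sum()
    return fit + lam_B * np.abs(B).sum() + lam_Omega * off_diag


def sample_weights(Y, X, B, Omega):
    """Normalized weights proportional to exp(-0.5 * r_i^T Omega r_i).

    Assumed weight form: samples with large residuals get tiny weights.
    """
    R = Y - X @ B
    maha = np.einsum("ij,jk,ik->i", R, Omega, R)
    w = np.exp(-0.5 * maha)
    return w / w.sum()


# Small synthetic demonstration with one gross outlier in the first row.
rng = np.random.default_rng(0)
n, p, q = 50, 3, 2
X = rng.normal(size=(n, p))
B_true = np.array([[1.0, 0.0], [0.0, -1.0], [0.5, 0.0]])
Y = X @ B_true + rng.normal(size=(n, q))
Y[0] += 10.0  # contaminate one observation

Omega = np.eye(q)
weights = sample_weights(Y, X, B_true, Omega)
value = l2e_criterion(Y, X, B_true, Omega, lam_B=0.1, lam_Omega=0.1)
```

Under these assumptions the contaminated observation receives a weight that is negligible compared with the clean observations, which is the intuition behind the weighted lasso and graphical lasso interpretation mentioned above. The criterion is nonconvex in (B, Omega), so in practice it would be handed to a suitable solver rather than evaluated once as here.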
