Joint Estimation of Structured Sparsity and Output Structure in Multiple-Output Regression via Inverse-Covariance Regularization

We consider the problem of learning a sparse regression model for predicting multiple related outputs from high-dimensional inputs, where related outputs are likely to share common relevant inputs. Most previous methods for learning structured sparsity assume that the structure over the outputs is known a priori, and focus on designing regularization functions that encourage sparsity patterns reflecting the given output structure. In this paper, we propose a new approach for sparse multiple-output regression that jointly learns both the output structure and the regression coefficients with structured sparsity. Our approach reformulates the standard regression model into an alternative parameterization that leads to a conditional Gaussian graphical model, and employs an inverse-covariance regularization. We show that the orthant-wise quasi-Newton algorithm developed for L1-regularized log-linear models can be adopted for fast optimization of our method. We demonstrate our method on simulated datasets and on real datasets from genetics and finance applications.
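To make the reparameterization concrete, the following is a minimal sketch in standard conditional Gaussian graphical model notation; the symbols (B, Sigma, Theta, Lambda, lambda_1, lambda_2) are our own naming for illustration and may differ from the paper's. The standard model with outputs y in R^K, inputs x in R^J, and coefficients B in R^{J x K} is

\[
  \mathbf{y} = B^{\top}\mathbf{x} + \boldsymbol{\epsilon}, \qquad \boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \Sigma),
\]

which can be rewritten as the conditional Gaussian graphical model

\[
  p(\mathbf{y} \mid \mathbf{x}) \propto \exp\!\left( -\tfrac{1}{2}\,\mathbf{y}^{\top} \Lambda\, \mathbf{y} - \mathbf{x}^{\top} \Theta\, \mathbf{y} \right), \qquad \Lambda = \Sigma^{-1}, \quad \Theta = -B\,\Sigma^{-1},
\]

so that B = -\Theta\Lambda^{-1}. Placing L1 penalties on both new parameters (the penalty on \Lambda possibly restricted to off-diagonal entries) gives an objective of the form

\[
  \min_{\Theta,\; \Lambda \succ 0} \; -\sum_{i=1}^{N} \log p(\mathbf{y}_i \mid \mathbf{x}_i;\, \Theta, \Lambda) \;+\; \lambda_1 \|\Theta\|_1 \;+\; \lambda_2 \|\Lambda\|_1.
\]

Under this parameterization, zeros in \Lambda encode the conditional-independence (output) structure, while zeros in \Theta give direct input-to-output sparsity; since B = -\Theta\Lambda^{-1}, a sparse \Theta combined with the learned output network in \Lambda induces structured sparsity in B, so an input relevant to one output can propagate to the outputs connected to it.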
