Paper Information - Tree-Guided Group Lasso for Multi-Task Regression with Structured Sparsity

Eric P. Xing

Seyoung Kim

Abstract: We consider the problem of learning a sparse multi-task regression in which the structure of the outputs can be represented as a tree, with leaf nodes as outputs and internal nodes as clusters of outputs at multiple granularities. Our goal is to recover the common set of relevant inputs for each output cluster. Assuming that the tree structure is available as prior knowledge, we formulate this problem as a new multi-task regularized regression called tree-guided group lasso. Our structured regularization is based on a group-lasso penalty in which the groups are defined with respect to the tree structure. We describe a systematic weighting scheme for the groups in the penalty such that each output variable is penalized in a balanced manner even if the groups overlap. We present an efficient optimization method that can handle large-scale problems. Using simulated and yeast datasets, we demonstrate that our method outperforms other multi-task learning methods in both prediction error and recovery of the true sparsity patterns.
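The penalty described in the abstract can be illustrated with a small sketch. The abstract does not state the formula, so everything specific here is an assumption made for illustration: the three-output tree, the per-node value `g` (how strongly a node's outputs are selected jointly), the rule that a node's weight is its `g` times the product of `1 - g` over its ancestors, and the function names `node_weights`, `tree_penalty`, and `weight_per_output`. With that rule the weights of all groups containing a given output sum to one, which is one way to read "penalized in a balanced manner".

```python
import numpy as np

# Hypothetical tree over 3 outputs: root -> {A -> {leaf0, leaf1}, leaf2}.
# "group" lists the output indices under the node; "g" is an assumed
# per-node value in [0, 1]; leaves use g = 1.
TREE = {
    "root": {"group": [0, 1, 2], "children": ["A", "leaf2"], "g": 0.2},
    "A": {"group": [0, 1], "children": ["leaf0", "leaf1"], "g": 0.6},
    "leaf0": {"group": [0], "children": [], "g": 1.0},
    "leaf1": {"group": [1], "children": [], "g": 1.0},
    "leaf2": {"group": [2], "children": [], "g": 1.0},
}


def node_weights(tree, root="root"):
    """Assumed rule: w_v = g_v * product of (1 - g_m) over ancestors m of v."""
    weights = {}
    stack = [(root, 1.0)]
    while stack:
        name, inherited = stack.pop()
        node = tree[name]
        weights[name] = node["g"] * inherited
        for child in node["children"]:
            stack.append((child, inherited * (1.0 - node["g"])))
    return weights


def tree_penalty(B, tree, weights):
    """Sum over inputs j and nodes v of w_v * ||B[j, G_v]||_2.

    B has shape (n_inputs, n_outputs); G_v is the node's output group.
    """
    total = 0.0
    for name, node in tree.items():
        row_norms = np.linalg.norm(B[:, node["group"]], axis=1)
        total += weights[name] * row_norms.sum()
    return total


def weight_per_output(tree, weights, n_outputs):
    """Total weight of all groups that contain each output."""
    totals = np.zeros(n_outputs)
    for name, node in tree.items():
        totals[node["group"]] += weights[name]
    return totals


weights = node_weights(TREE)
B = np.array([[3.0, 4.0, 0.0]])  # one input, three outputs
print(tree_penalty(B, TREE, weights))
print(weight_per_output(TREE, weights, 3))
```

In this toy setup the overlapping groups (root, A, and the leaves) all contribute to the penalty, yet each output carries the same total weight, so outputs deeper in the tree are not penalized more heavily just because they belong to more groups.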

References

[1] R. Tibshirani. Regression Shrinkage and Selection via the Lasso, 1996.

[2] J. Mesirov, et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring, 1999, Science.

[3] R. Tibshirani, et al. Supervised harvesting of expression trees, 2001, Genome Biology.

[4] R. Tibshirani, et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications, 2001, Proceedings of the National Academy of Sciences of the United States of America.

[5] D. Pe’er, et al. Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data, 2003, Nature Genetics.

[6] A. Ng. Feature selection, L1 vs. L2 regularization, and rotational invariance, 2004, Twenty-first international conference on Machine learning - ICML '04.

[7] S. Horvath, et al. Statistical Applications in Genetics and Molecular Biology, 2011.

[8] Joshua T. Burdick, et al. Mapping determinants of human gene expression by regional and genome-wide association, 2005, Nature.

[9] S. Hunt, et al. Genome-Wide Associations of Gene Expression Variation in Humans, 2005, PLoS Genetics.

[10] H. Zou, et al. Regularization and variable selection via the elastic net, 2005.

[11] Stephen P. Boyd, et al. Convex Optimization, 2004, Algorithms and Theory of Computation Handbook.

[12] D. Pe’er, et al. Identifying regulatory mechanisms using individual variation reveals key role for chromatin modification, 2006, Proceedings of the National Academy of Sciences.

[13] Martin J. Wainwright, et al. High-Dimensional Graphical Model Selection Using ℓ1-Regularized Logistic Regression, 2006, NIPS.

[14] M. Yuan, et al. Model selection and estimation in regression with grouped variables, 2006.

[15] K. Gunsalus, et al. Network modeling links breast cancer susceptibility and centrosome dysfunction, 2007, Nature Genetics.

[16] Massimiliano Pontil, et al. Convex multi-task feature learning, 2008, Machine Learning.

[17] P. Zhao, et al. Grouped and Hierarchical Model Selection through Composite Absolute Penalties, 2007.

[18] R. Tibshirani, et al. Pathwise Coordinate Optimization, 2007, 0708.1485.

[19] Martin J. Wainwright, et al. High-dimensional support union recovery in multivariate regression, 2008, NIPS.

[20] Francis R. Bach, et al. Consistency of the group Lasso and multiple kernel learning, 2007, J. Mach. Learn. Res.

[21] Rachel B. Brem, et al. Integrating large-scale functional genomic data to dissect the complexity of yeast regulatory networks, 2008, Nature Genetics.

[22] Yves Grandvalet, et al. SimpleMKL, 2008.

[23] H. Stefánsson, et al. Genetics of gene expression and its effect on disease, 2008, Nature.

[24] S. Horvath, et al. Variations in DNA elucidate molecular networks that cause disease, 2008, Nature.

[25] Michael I. Jordan, et al. High-dimensional union support recovery in multivariate regression, 2008, NIPS 2008.

[26] Jean-Philippe Vert, et al. Group lasso with overlap and graph lasso, 2009, ICML '09.

[27] E. Xing, et al. Statistical Estimation of Correlated Genome Associations to a Quantitative Trait Network, 2009, PLoS Genetics.

[28] P. Zhao, et al. The composite absolute penalties family for grouped and hierarchical variable selection, 2009, 0909.0411.

[29] Marc Teboulle, et al. A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems, 2009, SIAM J. Imaging Sci.

[30] Trevor J. Hastie, et al. Genome-wide association analysis by lasso penalized logistic regression, 2009, Bioinform.

[31] Ben Taskar, et al. Joint covariate selection and joint subspace selection for multiple classification problems, 2010, Stat. Comput.

[32] Huan Liu. Feature Selection, 2010, Encyclopedia of Machine Learning.

[33] Shuicheng Yan, et al. Visual classification with multi-task joint sparse representation, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[34] Yi Zhang, et al. Multi-Task Active Learning with Output Constraints, 2010, AAAI.

[35] R. Tibshirani, et al. A note on the group lasso and a sparse group lasso, 2010, 1001.0736.

[36] Rong Jin, et al. Exclusive Lasso for Multi-task Feature Selection, 2010, AISTATS.

[37] Francis R. Bach, et al. Structured Variable Selection with Sparsity-Inducing Norms, 2009, J. Mach. Learn. Res.

[38] Xi Chen, et al. Smoothing Proximal Gradient Method for General Structured Sparse Learning, 2011, UAI.

[40] Tree-guided group lasso for multi-response regression with structured sparsity, with an application to eQTL mapping, 2009, 0909.1373.

Cited by

Towards Scalable Analysis of Images and Videos

2014

Metadata-Based Clustered Multi-task Learning for Thread Mining in Web Communities

MLDM

2016

Learning Sparse Convolutional Neural Network via Quantization With Low Rank Regularization

IEEE Access

2019

GWAS in a Box: Statistical and Visual Analytics of Structured Associations via GenAMap

PLoS ONE

2014

When Low Rank Representation Based Hyperspectral Imagery Classification Meets Segmented Stacked Denoising Auto-Encoder Based Spatial-Spectral Feature

Remote Sens.

2018

An Efficient Proximal Gradient Method for General Structured Sparse Learning

2010

Smoothing Proximal Gradient Method for General Structured Sparse Learning

UAI

2011

A Family of Penalty Functions for Structured Sparsity

NIPS

2010

Regularizers for structured sparsity

Advances in Computational Mathematics

2010

Structured sparsity with convex penalty functions

2012

Smoothing proximal gradient method for general structured sparse regression

The Annals of Applied Statistics

2010

Distributed Multi-Task Learning with Shared Representation

ArXiv

2016

Structured sparsity via optimal interpolation norms

2017

Sparse coding for machine learning, image processing and computer vision

2010

Advanced Data Mining and Applications

Heterogeneous Contrastive Learning

Generalized Dictionary for Multitask Learning with Boosting

Neural Granger Causality for Nonlinear Time Series

Semantic Kernel Forests from Multiple Taxonomies