Variational Inference of Penalized Regression with Submodular Functions

Various regularizers inducing structured sparsity are constructed as Lovász extensions of submodular functions. In this paper, we consider a hierarchical probabilistic model of linear regression and its kernel extension with this type of regularization, and develop a variational inference scheme for posterior estimation in this model. We derive an upper bound on the partition function with an approximation guarantee, and then show that minimizing this bound is equivalent to minimizing a quadratic function over the polyhedron determined by the corresponding submodular function, which can be solved efficiently by the proximal gradient algorithm. Our scheme naturally extends the Bayesian Lasso model for maximum a posteriori (MAP) estimation to a variety of regularizers inducing structured sparsity, and thus provides a principled way to transfer the advantages of the Bayesian formulation to those models. Finally, we investigate the empirical performance of our scheme on several Bayesian variants of widely known models such as Lasso, generalized fused Lasso, and non-overlapping group Lasso.
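To make the first sentence concrete, the following is a minimal sketch of how a Lovász extension can be evaluated with the standard greedy ordering, and how familiar penalties arise from simple submodular functions. It is not the paper's implementation. The function names (`lovasz_extension`, `cardinality`, `nonempty`, `chain_cut`) and the example vector are illustrative assumptions.

```python
def lovasz_extension(F, w):
    """Evaluate the Lovász extension of a set function F at the vector w.

    Uses the greedy ordering: sort coordinates by decreasing value and
    accumulate w[j] times the marginal gain of adding j to the prefix set.
    Assumes F(empty set) == 0.
    """
    order = sorted(range(len(w)), key=lambda i: -w[i])
    value, prefix, prev = 0.0, set(), F(frozenset())
    for j in order:
        prefix.add(j)
        cur = F(frozenset(prefix))
        value += w[j] * (cur - prev)
        prev = cur
    return value


def cardinality(A):
    # F(A) = |A|; its Lovász extension at |w| is the l1 norm (Lasso penalty).
    return float(len(A))


def nonempty(A):
    # F(A) = min(|A|, 1); its Lovász extension at |w| is the l-infinity norm.
    return float(min(len(A), 1))


def chain_cut(n):
    # Cut function of the chain graph 0-1-...-(n-1); its Lovász extension
    # at w is the total variation sum_i |w[i] - w[i+1]| (fused Lasso term).
    def F(A):
        return float(sum((i in A) != (i + 1 in A) for i in range(n - 1)))
    return F


w = [0.5, -2.0, 1.5]  # illustrative vector, not data from the paper
abs_w = [abs(x) for x in w]

l1 = lovasz_extension(cardinality, abs_w)
linf = lovasz_extension(nonempty, abs_w)
tv = lovasz_extension(chain_cut(len(w)), w)
```

Here `l1`, `linf`, and `tv` agree with the l1 norm, the l-infinity norm, and the total variation of `w`, respectively. The greedy evaluation costs one sort plus one call to `F` per coordinate, which is why this family of regularizers is convenient inside proximal methods.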
