Reconciling "priors" & "priors" without prejudice?

There are two major routes for addressing linear inverse problems. Whereas regularization-based approaches build estimators as solutions of penalized regression optimization problems, Bayesian estimators rely on the posterior distribution of the unknown given some assumed family of priors. While these may seem like radically different approaches, recent results have shown that, in the context of additive white Gaussian denoising, the Bayesian conditional mean estimator is always the solution of a penalized regression problem. The contribution of this paper is twofold. First, we extend the additive white Gaussian denoising results to general linear inverse problems with colored Gaussian noise. Second, we characterize conditions under which the penalty function associated with the conditional mean estimator can satisfy popular properties such as convexity, separability, and smoothness. This sheds light on a tradeoff between computational efficiency and estimation accuracy in sparse regularization, and draws connections between Bayesian estimation and proximal optimization.
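
To make the central claim concrete, the two families of estimators can be written as follows; this is a minimal sketch in notation introduced here (y the observation, A the measurement operator, \phi the penalty), not taken verbatim from the paper. The regularization route solves

\[
\hat{x}_{\mathrm{reg}}(y) \in \operatorname*{arg\,min}_{x} \; \frac{1}{2}\,\|y - A x\|_2^2 + \phi(x),
\]

while the Bayesian route uses the conditional mean (MMSE) estimator

\[
\hat{x}_{\mathrm{MMSE}}(y) = \mathbb{E}\,[x \mid y].
\]

The result being extended states that, in additive white Gaussian denoising (A = I), there always exists a penalty \phi such that \hat{x}_{\mathrm{MMSE}} = \hat{x}_{\mathrm{reg}}; equivalently, the conditional mean estimator is the proximity operator \mathrm{prox}_{\phi} of some penalty, which is what connects Bayesian estimation to proximal optimization.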
