Sparse estimation using Bayesian hierarchical prior modeling for real and complex linear models

In sparse Bayesian learning (SBL), Gaussian scale mixtures (GSMs) have been used to model sparsity-inducing priors that realize a class of concave penalty functions for the regression task in real-valued signal models. Motivated by the relative scarcity of formal tools for SBL in complex-valued models, this paper proposes a GSM model, the Bessel K model, that induces concave penalty functions for the estimation of complex sparse signals. The properties of the Bessel K model are analyzed when it is applied to Type I and Type II estimation. This analysis reveals that, by tuning the parameters of the mixing pdf, different penalty functions are invoked depending on the estimation type used, the value of the noise variance, and whether real or complex signals are estimated. Using the Bessel K model, we derive sparse estimators based on a modification of the expectation-maximization algorithm formulated for Type II estimation. The estimators include as special instances the algorithms proposed by Tipping and Faul [33] and Babacan et al. [47]. Numerical results show the superiority of the proposed estimators over these state-of-the-art algorithms in terms of convergence speed, sparseness, reconstruction error, and robustness in low and medium signal-to-noise ratio regimes.

Highlights
- A GSM is proposed to model sparsity-inducing priors for real and complex signal models.
- By using the GSM in combination with a novel modification of the EM algorithm, sparse estimators are devised.
- The sparsity-inducing property of the GSM depends on whether the signal model is real or complex.
- The proposed sparse estimators encompass other existing estimators.
- The proposed estimators outperform these sparse estimators in low and moderate SNR regimes.
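For context, the following is a minimal sketch of the GSM construction underlying such a model, assuming a gamma mixing pdf with shape \epsilon and rate \eta (our notation for the hyperparameters, which need not match the paper's). Marginalizing the scale of a zero-mean Gaussian kernel over the gamma mixing pdf yields a Bessel K-type marginal prior for each weight w_i:

p(w_i) = \int_0^\infty \mathrm{N}(w_i; 0, \gamma_i)\, \mathrm{Ga}(\gamma_i; \epsilon, \eta)\, \mathrm{d}\gamma_i \propto |w_i|^{\epsilon - 1/2}\, K_{\epsilon - 1/2}\!\left(\sqrt{2\eta}\, |w_i|\right),

where K_\nu denotes the modified Bessel function of the second kind. If the kernel is instead a circularly symmetric complex Gaussian, \mathrm{CN}(w_i; 0, \gamma_i) = (\pi\gamma_i)^{-1}\exp(-|w_i|^2/\gamma_i), the same integral gives

p(w_i) \propto |w_i|^{\epsilon - 1}\, K_{\epsilon - 1}\!\left(2\sqrt{\eta}\, |w_i|\right),

so the order of the Bessel function shifts by 1/2 between the two cases. Since the induced penalty is -\log p(w_i), this shift is one way to see why the sparsity-inducing property of the model depends on whether the signal model is real or complex.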

[1] Emmanuel J. Candès, Terence Tao, The Dantzig selector: Statistical estimation when p is much larger than n, 2005, math/0506081.

[2] J. Griffin, et al., Inference with normal-gamma prior distributions in regression problems, 2010.

[3] Sundeep Rangan, et al., Necessary and Sufficient Conditions for Sparsity Pattern Recovery, 2008, IEEE Transactions on Information Theory.

[4] Dmitriy Shutin, et al., Application of Bayesian hierarchical prior modeling to sparse channel estimation, 2012, IEEE International Conference on Communications (ICC).

[5] Mário A. T. Figueiredo, Adaptive Sparseness for Supervised Learning, 2003, IEEE Trans. Pattern Anal. Mach. Intell.

[6] Dmitriy Shutin, et al., Sparse Variational Bayesian SAGE Algorithm With Application to the Estimation of Multipath Wireless Channels, 2011, IEEE Transactions on Signal Processing.

[8] J. A. Tropp, Greed is good, 2006.

[9] D. F. Andrews, et al., Scale Mixtures of Normal Distributions, 1974.

[10] John Winn, Christopher M. Bishop, Variational Message Passing, 2005, J. Mach. Learn. Res.

[11] Kenneth Kreutz-Delgado, et al., Probabilistic Formulation of Independent Vector Analysis Using Complex Gaussian Scale Mixtures, 2009, ICA.

[12] Te-Won Lee, et al., Multivariate Scale Mixture of Gaussians Modeling, 2006, ICA.

[13] Stephen J. Wright, et al., Sparse Reconstruction by Separable Approximation, 2008, IEEE Transactions on Signal Processing.

[14] Anuj Srivastava, et al., Universal Analytical Forms for Modeling Image Probabilities, 2002, IEEE Trans. Pattern Anal. Mach. Intell.

[15] R. G. Baraniuk, Compressive Sensing [Lecture Notes], 2007, IEEE Signal Processing Magazine.

[16] Arnaud Doucet, et al., Sparse Bayesian nonparametric regression, 2008, ICML '08.

[17] N. L. Johnson, et al., Multivariate Analysis, 1958, Nature.

[18] B. Jørgensen, Statistical Properties of the Generalized Inverse Gaussian Distribution, 1981.

[19] G. Casella, et al., Penalized regression, standard errors, and Bayesian lassos, 2010.

[20] E. J. Candès, et al., An Introduction To Compressive Sampling, 2008, IEEE Signal Processing Magazine.

[21] Tilmann Gneiting, Normal scale mixtures and dual probability densities, 1997.

[22] Michael E. Tipping, Sparse Bayesian Learning and the Relevance Vector Machine, 2001, J. Mach. Learn. Res.

[23] Richard G. Baraniuk, Compressive Sensing, 2008, Computer Vision, A Reference Guide.

[24] Milton Abramowitz, et al., Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 1964.

[25] J. Griffin, et al., Bayesian adaptive lassos with non-convex penalization, 2007.

[26] David P. Wipf, et al., A New View of Automatic Relevance Determination, 2007, NIPS.

[27] A. Doucet, et al., A Hierarchical Bayesian Framework for Constructing Sparsity-inducing Priors, 2010, 1009.1914.

[28] Lawrence Carin, et al., Bayesian Compressive Sensing, 2008, IEEE Transactions on Signal Processing.

[29] E. Stacy, A Generalization of the Gamma Distribution, 1962.

[30] Deanna Needell, et al., CoSaMP: Iterative signal recovery from incomplete and inaccurate samples, 2008, arXiv.

[31] David P. Wipf, et al., Iterative Reweighted ℓ1 and ℓ2 Methods for Finding Sparse Solutions, 2010, IEEE J. Sel. Top. Signal Process.

[32] Larry Wasserman, Introduction to Probability and Mathematical Statistics, 2017.

[33] Michael E. Tipping, et al., Fast Marginal Likelihood Maximisation for Sparse Bayesian Models, 2003.

[34] David J. C. MacKay, Bayesian Interpolation, 1992, Neural Computation.

[35] D. Owen, Handbook of Mathematical Functions with Formulas, 1965.

[36] Joel A. Tropp, Greed is good: algorithmic results for sparse approximation, 2004, IEEE Transactions on Information Theory.

[37] Robert D. Nowak, et al., Compressed Channel Sensing: A New Approach to Estimating Sparse Multipath Channels, 2010, Proceedings of the IEEE.

[38] Stephen P. Boyd, et al., Enhancing Sparsity by Reweighted ℓ1 Minimization, 2007, 0711.1612.

[39] Brendan J. Frey, et al., Factor graphs and the sum-product algorithm, 2001, IEEE Trans. Inf. Theory.

[40] Bhaskar D. Rao, et al., Sparse signal reconstruction from limited data using FOCUSS: a re-weighted minimum norm algorithm, 1997, IEEE Trans. Signal Process.

[41] Bhaskar D. Rao, et al., Variational EM Algorithms for Non-Gaussian Latent Variable Models, 2005, NIPS.

[42] H. Vincent Poor, et al., Fast Variational Sparse Bayesian Learning With Automatic Relevance Determination for Superimposed Signals, 2011, IEEE Transactions on Signal Processing.

[43] Soontorn Oraintara, et al., Complex Gaussian Scale Mixtures of Complex Wavelet Coefficients, 2010, IEEE Transactions on Signal Processing.

[44] Michael A. Saunders, et al., Atomic Decomposition by Basis Pursuit, 1998, SIAM J. Sci. Comput.

[45] H. Vincent Poor, et al., Fast adaptive variational sparse Bayesian learning with automatic relevance determination, 2011, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[46] O. Barndorff-Nielsen, et al., Normal Variance-Mean Mixtures and z Distributions, 1982.

[47] Aggelos K. Katsaggelos, et al., Bayesian Compressive Sensing Using Laplace Priors, 2010, IEEE Transactions on Image Processing.

[48] Bhaskar D. Rao, et al., Latent Variable Bayesian Models for Promoting Sparsity, 2011, IEEE Transactions on Information Theory.

[49] D. G. Tzikas, et al., The variational approximation for Bayesian inference, 2008, IEEE Signal Processing Magazine.

[50] Pierre Moulin, et al., Analysis of Multiresolution Image Denoising Schemes Using Generalized Gaussian and Complexity Priors, 1999, IEEE Trans. Inf. Theory.

[51] R. Tibshirani, Regression Shrinkage and Selection via the Lasso, 1996.

[52] Bhaskar D. Rao, et al., Sparse Bayesian learning for basis selection, 2004, IEEE Transactions on Signal Processing.

[53] Bernard H. Fleury, et al., A fast iterative Bayesian inference algorithm for sparse channel estimation, 2013, IEEE International Conference on Communications (ICC).

[54] David P. Wipf, et al., Variational Bayesian Inference Techniques, 2010, IEEE Signal Processing Magazine.

[55] Anuj Srivastava, et al., Probability Models for Clutter in Natural Images, 2001, IEEE Trans. Pattern Anal. Mach. Intell.

[56] Christopher M. Bishop, et al., Variational Relevance Vector Machines, 2000, UAI.

[57] Stephen P. Boyd, et al., An Interior-Point Method for Large-Scale ℓ1-Regularized Least Squares, 2007, IEEE Journal of Selected Topics in Signal Processing.