Optimal Approximation of Signal Priors

In signal restoration by Bayesian inference, one typically uses a parametric model of the prior distribution of the signal. Here, we consider how the parameters of a prior model should be estimated from observations of uncorrupted signals. Much recent work has implicitly assumed that maximum likelihood estimation is the optimal estimation method. Our results imply that this is not the case. We first obtain an objective function that approximates the error incurred in signal restoration due to an imperfect prior model. Next, we show that in an important special case (small Gaussian noise), the error is the same as the score-matching objective function, which was previously proposed as an alternative to likelihood based on purely computational considerations. Our analysis thus shows that score matching combines computational simplicity with statistical optimality in signal restoration, providing a viable alternative to maximum likelihood methods. We also show how the method leads to a new intuitive and geometric interpretation of structure inherent in probability distributions.
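
As a rough illustration of the score-matching objective mentioned above, the following sketch fits the precision of a zero-mean one-dimensional Gaussian prior from clean samples. It uses the standard form of the objective, the sample mean of psi'(x) + 0.5 * psi(x)^2, where psi is the derivative of the model log-density. The model, the function names, and the synthetic data are assumptions chosen for illustration only and are not taken from the paper.

```python
import numpy as np

def score_matching_objective(precision, samples):
    """Empirical score-matching objective for a zero-mean 1-D Gaussian model.

    Model score:  psi(x) = d/dx log p(x) = -precision * x
    Objective:    J = mean( psi'(x) + 0.5 * psi(x)**2 )
    """
    psi = -precision * samples                 # model score at each sample
    dpsi = -precision * np.ones_like(samples)  # derivative of the score
    return np.mean(dpsi + 0.5 * psi ** 2)

def fit_precision(samples):
    # Setting dJ/dprecision = -1 + precision * mean(x^2) to zero
    # gives the closed-form minimizer below.
    return 1.0 / np.mean(samples ** 2)

# Hypothetical "uncorrupted signal" samples with standard deviation 2.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 2.0, size=100_000)
lam_hat = fit_precision(x)
```

Note that the objective never requires the normalization constant of the model density, which is the computational advantage the abstract refers to. In this Gaussian toy case the minimizer happens to coincide with the maximum likelihood estimate; the two generally differ for other models.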
