Iterative Gaussianization: From ICA to Random Rotations

Most signal processing problems involve the challenging task of multidimensional probability density function (PDF) estimation. In this paper, we propose a solution to this problem based on a family of rotation-based iterative Gaussianization (RBIG) transforms. The general framework consists of the sequential application of a univariate marginal Gaussianization transform followed by an orthonormal transform. The proposed procedure looks for differentiable transforms to a known PDF, so that the unknown PDF can be estimated at any point of the original domain. In particular, we target a zero-mean, unit-covariance Gaussian for convenience. RBIG is formally similar to classical iterative projection pursuit (PP) algorithms. However, we show that, unlike in PP methods, the particular class of rotations used has no special qualitative relevance in this context, since looking for interestingness is not a critical issue for PDF estimation. The key difference is that our approach focuses on the univariate part of the problem (marginal Gaussianization) rather than on the multivariate part (rotation). This difference implies that one may select the rotation most convenient for each practical application. The differentiability, invertibility, and convergence of RBIG are analyzed theoretically and experimentally. Relations to other methods, such as radial Gaussianization, one-class support vector domain description, and deep neural networks, are also pointed out. The practical performance of RBIG is illustrated in a number of multidimensional problems, such as image synthesis, classification, denoising, and multi-information estimation.
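The alternating structure described in the abstract (marginal Gaussianization of each dimension, then an orthonormal transform, repeated) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`marginal_gaussianize`, `random_orthonormal`, `rbig_sketch`) are hypothetical, random orthonormal matrices are used as one admissible choice of rotation, and the marginal step uses a rank-based empirical CDF. That rank-based map is a simplifying assumption: it is not differentiable, so this sketch shows only the forward iteration and does not support the Jacobian-based PDF estimate.

```python
import math
import random
from statistics import NormalDist


def marginal_gaussianize(column):
    """Map one dimension to approximately N(0, 1) using ranks (assumed estimator)."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    std_normal = NormalDist()
    out = [0.0] * n
    for rank, i in enumerate(order):
        # (rank + 0.5) / n keeps the empirical CDF strictly inside (0, 1).
        out[i] = std_normal.inv_cdf((rank + 0.5) / n)
    return out


def random_orthonormal(dim, rng):
    """Draw a random orthonormal matrix by Gram-Schmidt on Gaussian vectors."""
    rows = []
    while len(rows) < dim:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for r in rows:
            proj = sum(a * b for a, b in zip(v, r))
            v = [a - proj * b for a, b in zip(v, r)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-12:  # discard a (very unlikely) degenerate draw
            rows.append([a / norm for a in v])
    return rows


def rbig_sketch(samples, n_iter=20, seed=0):
    """Alternate marginal Gaussianization and a random orthonormal transform."""
    rng = random.Random(seed)
    dim = len(samples[0])
    data = [list(s) for s in samples]
    for _ in range(n_iter):
        # Univariate step: Gaussianize every dimension independently.
        columns = [marginal_gaussianize([row[d] for row in data])
                   for d in range(dim)]
        gaussianized = [list(point) for point in zip(*columns)]
        # Multivariate step: apply the orthonormal matrix to every sample.
        rot = random_orthonormal(dim, rng)
        data = [[sum(r[k] * p[k] for k in range(dim)) for r in rot]
                for p in gaussianized]
    return data
```

Calling `rbig_sketch(samples)` on a list of equal-length tuples returns the transformed samples. Replacing `random_orthonormal` with a PCA or ICA rotation would change only the multivariate step, which is the flexibility the abstract attributes to the framework.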
