Gaussianization

High dimensional data modeling is difficult mainly because of the so-called "curse of dimensionality". We propose a technique called "Gaussianization" for high dimensional density estimation, which alleviates the curse of dimensionality by exploiting the independence structures in the data. Gaussianization is motivated by recent developments in the statistics literature: projection pursuit, independent component analysis, and Gaussian mixture models with semi-tied covariances. We propose an iterative Gaussianization procedure which converges weakly: at each iteration, the data are first transformed to the least dependent coordinates, and then each coordinate is marginally Gaussianized by univariate techniques. Gaussianization offers density estimation sharper than traditional kernel methods and radial basis function methods. Gaussianization can be viewed as an efficient solution of nonlinear independent component analysis and high dimensional projection pursuit.
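The two-step iteration described above (a linear change of coordinates followed by marginal Gaussianization of each coordinate) can be illustrated with a minimal sketch. It rests on two assumptions that are not taken from the paper: a random rotation stands in for the search for the least dependent coordinates, and a rank-based empirical CDF composed with the inverse standard normal CDF stands in for the univariate Gaussianization technique. The function names `marginal_gaussianize`, `random_rotation`, and `gaussianize` are hypothetical, and the sketch only transforms the sample; it does not produce a density estimate.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for its inverse CDF


def marginal_gaussianize(column):
    """Map one coordinate to approximately N(0, 1) using its ranks.

    Assumption: a rank-based empirical CDF replaces the univariate
    technique used in the paper.
    """
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    out = [0.0] * n
    for rank, i in enumerate(order):
        # (rank + 0.5) / n keeps the CDF value strictly inside (0, 1).
        out[i] = _STD_NORMAL.inv_cdf((rank + 0.5) / n)
    return out


def random_rotation(dim, rng):
    """Return a random orthogonal matrix (list of rows) via Gram-Schmidt.

    Assumption: this is a stand-in for finding the least dependent
    coordinates, which the paper obtains by an ICA-like step.
    """
    basis = []
    while len(basis) < dim:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for b in basis:
            dot = sum(x * y for x, y in zip(v, b))
            v = [x - dot * y for x, y in zip(v, b)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-12:  # skip the rare degenerate draw
            basis.append([x / norm for x in v])
    return basis


def gaussianize(data, n_iter=20, seed=0):
    """Alternate a linear transform with marginal Gaussianization."""
    rng = random.Random(seed)
    dim = len(data[0])
    x = [list(row) for row in data]
    for _ in range(n_iter):
        # Step 1: linear change of coordinates.
        rot = random_rotation(dim, rng)
        x = [[sum(r[j] * row[j] for j in range(dim)) for r in rot] for row in x]
        # Step 2: Gaussianize each coordinate on its own.
        cols = [marginal_gaussianize([row[k] for row in x]) for k in range(dim)]
        x = [list(t) for t in zip(*cols)]
    return x
```

A call such as `gaussianize(samples, n_iter=10)` takes a list of equal-length rows and returns a transformed sample of the same shape. Because each iteration ends with the marginal step, every output coordinate has an approximately standard normal marginal by construction; how close the joint distribution is to Gaussian depends on the number of iterations and on the choice of linear transform.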
