Sparse Regression: Utilizing the Higher-order Structure of Data for Prediction

Independent component analysis (ICA) and the closely related method of sparse coding model multidimensional data as linear combinations of independent components that have nongaussian, typically sparse, distributions. Such a modelling approach is especially suitable in high dimensions, as it avoids the curse of dimensionality, and it also seems to capture important properties of sensory data. In this paper we show how to use these models for regression. If the joint density of two random vectors is modelled by independent component analysis, simple algorithms can be derived to compute the maximum likelihood predictor of one vector given an observation of the other. The resulting predictors are nonlinear, but in contrast to nonparametric methods such as the multilayer perceptron (MLP), the nonlinearities are not chosen ad hoc: they are determined directly by the density approximation.
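
The following is a minimal sketch of the approach described above, not the paper's exact algorithm: the joint vector z = (x, y) is modelled by ICA, and y is then predicted from a new x by maximizing the resulting log-likelihood over y. The sparse log-density G(u) = -log cosh(u), the synthetic data, and the use of scikit-learn's FastICA are illustrative assumptions.

import numpy as np
from scipy.optimize import minimize
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

# Synthetic training data: x and y are dependent through shared sparse sources.
n, dx, dy = 2000, 3, 2
S = rng.laplace(size=(n, dx + dy))            # sparse independent components
A = rng.standard_normal((dx + dy, dx + dy))   # mixing matrix
Z = S @ A.T                                   # joint observations z = (x, y)

# Fit ICA on the joint vector z = (x, y); W is the estimated unmixing matrix.
ica = FastICA(n_components=dx + dy, random_state=0)
ica.fit(Z)
W, mean = ica.components_, ica.mean_

def predict(x, y0=None):
    """Maximum likelihood prediction of y given x under the ICA model:
    maximize sum_i G(w_i^T (z - mean)) over y, with G(u) = -log cosh(u).
    The log|det W| term of the likelihood is constant in y and is dropped."""
    y0 = np.zeros(dy) if y0 is None else y0
    def neg_loglik(y):
        u = W @ (np.concatenate([x, y]) - mean)
        return np.sum(np.log(np.cosh(u)))
    def grad(y):
        u = W @ (np.concatenate([x, y]) - mean)
        return W[:, dx:].T @ np.tanh(u)       # gradient w.r.t. y only
    return minimize(neg_loglik, y0, jac=grad, method="L-BFGS-B").x

x_new = Z[0, :dx]
print("predicted y:", predict(x_new), " true y:", Z[0, dx:])

Note that the tanh appearing in the gradient is exactly the nonlinearity induced by the assumed sparse density, illustrating the point made above: the nonlinearity of the predictor is determined by the density model rather than chosen ad hoc.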
