Topographic Product Models Applied to Natural Scene Statistics

We present an energy-based model that uses a product of generalized Student-t distributions to capture the statistical structure in data sets. This model is inspired by, and particularly applicable to, natural data sets such as images. We begin by providing the mathematical framework, where we discuss complete and overcomplete models and provide algorithms for training these models from data. Using patches of natural scenes, we demonstrate that our approach represents a viable alternative to independent component analysis as an interpretive model of biological visual systems. Although the two approaches are similar in flavor, there are also important differences, particularly when the representations are overcomplete. By constraining the interactions within our model, we are also able to study the topographic organization of Gabor-like receptive fields that our model learns. Finally, we discuss the relation of our new approach to previous work, in particular Gaussian scale mixture models and variants of independent component analysis.
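
To make the model concrete, the following is a minimal sketch of the unnormalized energy of a product-of-Student-t (PoT) model under the standard parameterization, where each expert places a heavy-tailed penalty on one filter response; the names `J` and `alpha` are illustrative, and the normalizing constant is omitted since it is intractable in the energy-based setting (training typically proceeds by approximate methods such as contrastive divergence).

```python
import numpy as np

def pot_energy(x, J, alpha):
    """Unnormalized energy of a product-of-Student-t (PoT) model.

    E(x) = sum_i alpha_i * log(1 + 0.5 * (j_i^T x)^2),
    so p(x) is proportional to exp(-E(x)).

    x     : (d,)  data vector, e.g. a whitened image patch
    J     : (m, d) filter matrix; m > d gives an overcomplete model
    alpha : (m,)  positive shape parameters of the Student-t experts
    """
    s = J @ x                              # filter responses
    return np.sum(alpha * np.log1p(0.5 * s ** 2))
```

Because the energy decomposes as a sum over experts, each row of `J` acts like a learned constraint that is cheap to violate slightly but expensive to violate strongly, which is what produces sparse, Gabor-like filters on natural scenes; a topographic variant would couple the squared responses of neighboring filters before applying the logarithm.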
