Temporal and spatiotemporal coherence in simple-cell responses: a generative model of natural image sequences

We present a two-layer dynamic generative model of the statistical structure of natural image sequences. The second layer of the model is a linear mapping from simple-cell outputs to pixel values, as in most work on natural image statistics. The first layer uses a multivariate autoregressive model to capture the dependencies between the activity levels (amplitudes or variances) of the simple cells. In the second layer, basis vectors emerge that are localized, oriented and of different scales, as in previous work. The first layer, which is new in this model, learns connections between simple cells that resemble complex-cell pooling: connections are strong among cells with similar preferred location, frequency and orientation. In contrast to previous work, in which one of the layers had to be fixed in advance, the dynamic model allows both layers to be estimated simultaneously from natural data.
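To make the two layers described above concrete, the following is a minimal sampling sketch in Python. It is not the estimation procedure of the paper and makes several assumptions that the abstract does not state: the amplitudes follow a first-order autoregression with a hypothetical connection matrix `M`, the driving noise is the absolute value of a Gaussian (chosen only to keep amplitudes non-negative), signs are drawn independently at random, and `A` is a hypothetical basis matrix whose columns play the role of the basis vectors. The function name `simulate_sequence` and its parameters are illustrative.

```python
import random


def simulate_sequence(A, M, n_steps, noise_scale=0.1, seed=0):
    """Sample frames from a toy two-layer dynamic model.

    A: n_pixels x n_cells list of lists (hypothetical basis matrix).
    M: n_cells x n_cells list of lists (hypothetical amplitude connections).
    Returns (frames, amplitudes), each a list with one entry per time step.
    """
    rng = random.Random(seed)
    n_pixels = len(A)
    n_cells = len(M)
    amp = [1.0] * n_cells  # assumed initial amplitudes
    frames, amplitudes = [], []
    for _ in range(n_steps):
        # First layer: multivariate autoregression on the amplitudes.
        # Assumption: non-negative noise, plus clipping at zero as a guard
        # in case M contains negative entries.
        amp = [
            max(
                0.0,
                sum(M[i][j] * amp[j] for j in range(n_cells))
                + noise_scale * abs(rng.gauss(0.0, 1.0)),
            )
            for i in range(n_cells)
        ]
        # Assumption: each simple-cell output is its amplitude times a random sign.
        s = [a * rng.choice((-1.0, 1.0)) for a in amp]
        # Second layer: linear mapping from simple-cell outputs to pixel values.
        x = [sum(A[p][k] * s[k] for k in range(n_cells)) for p in range(n_pixels)]
        frames.append(x)
        amplitudes.append(amp)
    return frames, amplitudes


if __name__ == "__main__":
    A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 pixels, 2 cells
    M = [[0.5, 0.4], [0.4, 0.5]]              # strong mutual connection
    frames, amplitudes = simulate_sequence(A, M, n_steps=5)
    for x, a in zip(frames, amplitudes):
        print(x, a)
```

In this sketch, large off-diagonal entries of `M` make the amplitudes of the connected cells rise and fall together over time, which is the kind of dependency the first layer is meant to capture. Estimating `A` and `M` from natural image sequences, as the paper does, is a separate problem that this sampler does not address.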