A Maximum-Likelihood Interpretation for Slow Feature Analysis

The brain extracts useful features from a maelstrom of sensory information, and a fundamental goal of theoretical neuroscience is to work out how it does so. One proposed feature extraction strategy is motivated by the observation that the meaning of sensory data, such as the identity of a moving visual object, is often more persistent than the activation of any single sensory receptor. This notion is embodied in the slow feature analysis (SFA) algorithm, which uses slowness as a heuristic by which to extract semantic information from multidimensional time series. Here, we develop a probabilistic interpretation of this algorithm, showing that inference and learning in the limiting case of a suitable probabilistic model yield exactly the results of SFA. Similar equivalences have proved useful in interpreting and extending comparable algorithms such as independent component analysis. For SFA, we use the equivalent probabilistic model as a conceptual springboard with which to motivate several novel extensions to the algorithm.
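The abstract does not spell out the algorithm it refers to, but the linear SFA procedure in its standard formulation is compact: whiten the input time series, then project onto the directions along which the whitened signal changes most slowly, i.e. the minor components of the time-differenced signal. The sketch below is a minimal NumPy illustration of that standard recipe, not code from the paper; the function name `sfa` and its interface are assumptions made for the example.

```python
import numpy as np

def sfa(Y, n_features):
    """Linear slow feature analysis sketch.

    Y: (T, D) multidimensional time series.
    Returns the (T, n_features) slowest-varying projections.
    """
    Y = Y - Y.mean(axis=0)                # center the data
    cov = Y.T @ Y / len(Y)
    evals, evecs = np.linalg.eigh(cov)
    W = evecs / np.sqrt(evals)            # whitening matrix (assumes full-rank cov)
    Z = Y @ W                             # whitened signal, unit covariance
    dZ = np.diff(Z, axis=0)               # discrete time derivative
    dcov = dZ.T @ dZ / len(dZ)
    dvals, dvecs = np.linalg.eigh(dcov)   # eigh returns ascending eigenvalues
    return Z @ dvecs[:, :n_features]      # smallest eigenvalues = slowest features

# Example: a slow sinusoid mixed with fast noise is recovered as the slowest feature.
T = 1000
t = np.linspace(0, 4 * np.pi, T)
Y = np.column_stack([np.sin(t) + 0.1 * np.random.randn(T),
                     np.random.randn(T),
                     np.random.randn(T)])
features = sfa(Y, n_features=1)
```

In the probabilistic reading developed in the paper, the slow directions recovered this way coincide, in a limiting case, with the maximum-likelihood solution of a suitable latent time-series model, and it is that equivalence that licenses the extensions the authors propose.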
