Learning Perceptually Salient Visual Parameters Using Spatiotemporal Smoothness Constraints

A model is presented for unsupervised learning of low-level vision tasks, such as the extraction of surface depth. A key assumption is that perceptually salient visual parameters (e.g., surface depth) vary smoothly over time. This assumption is used to derive a learning rule that maximizes the long-term variance of each unit's outputs while simultaneously minimizing its short-term variance. The exact half-life associated with each of these variances is not critical to the success of the algorithm. The learning rule is a linear combination of anti-Hebbian weight changes over a short time scale and Hebbian weight changes over a long time scale. This maximizes the information throughput with respect to low-frequency parameters implicit in the input sequence. The model is used to learn stereo disparity from temporal sequences of random-dot and gray-level stereograms containing synthetically generated subpixel disparities. The presence of temporal discontinuities in disparity does not prevent learning or generalization to previously unseen image sequences. The implications of this class of unsupervised methods for learning in perceptual systems are discussed.
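The trade-off between long-term and short-term variance can be illustrated with a minimal sketch. The abstract does not give the exact objective, so several choices here are assumptions made for illustration: a single linear unit, exponentially weighted running means parameterized by a half-life, and an objective F = log(V/U), where V is the long-term variance and U the short-term variance of the output. The names `running_mean` and `objective_and_gradient` are hypothetical. Under these assumptions the gradient splits into a Hebbian term on the long time scale and an anti-Hebbian term on the short time scale.

```python
import numpy as np

def running_mean(x, half_life):
    """Exponentially weighted mean of past samples along axis 0."""
    lam = 2.0 ** (-1.0 / half_life)   # past weights halve every `half_life` steps
    out = np.empty_like(x, dtype=float)
    m = np.array(x[0], dtype=float)
    for t in range(len(x)):
        out[t] = m                     # mean of the past, excluding x[t]
        m = lam * m + (1.0 - lam) * x[t]
    return out

def objective_and_gradient(w, X, short_half_life=2.0, long_half_life=200.0):
    """Return F = log(V / U) and dF/dw for a linear unit z = X @ w.

    Assumed form (not taken from the abstract):
      U = 0.5 * sum((z - short-term mean of z)^2)
      V = 0.5 * sum((z - long-term mean of z)^2)
    """
    # The running mean is linear, so deviations of z are deviations of X times w.
    Xs = X - running_mean(X, short_half_life)
    Xl = X - running_mean(X, long_half_life)
    zs, zl = Xs @ w, Xl @ w
    U = 0.5 * np.sum(zs ** 2)
    V = 0.5 * np.sum(zl ** 2)
    F = np.log(V / U)
    # Hebbian term (long time scale) minus anti-Hebbian term (short time scale).
    grad = (Xl.T @ zl) / V - (Xs.T @ zs) / U
    return F, grad

# Toy input: a slowly varying parameter mixed with fast noise in two channels.
rng = np.random.default_rng(0)
t = np.arange(2000)
slow = np.sin(2 * np.pi * t / 500.0)
noise = rng.standard_normal(t.size)
X = np.column_stack([slow + noise, noise])

F_slow, _ = objective_and_gradient(np.array([1.0, -1.0]), X)   # output = slow signal
F_noise, _ = objective_and_gradient(np.array([0.0, 1.0]), X)   # output = noise only
print(F_slow > F_noise)
```

In this toy setting, weights that isolate the slowly varying signal score higher than weights that pass only the noise, which is the behavior the smoothness assumption is meant to reward. Because F is unchanged when `w` is rescaled, the gradient is orthogonal to `w`; a practical learner would still need a step-size schedule, which this sketch leaves out.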
