Implicit Learning in 3D Object Recognition: The Importance of Temporal Context

We propose a novel architecture and set of learning rules for cortical self-organization. The model is based on the idea that multiple information channels can modulate one another's plasticity. Features learned from bottom-up information sources can thus be influenced by those learned from contextual pathways, and vice versa. A maximum likelihood cost function allows this scheme to be implemented in a biologically feasible, hierarchical neural circuit. In a first set of simulations, we demonstrate the utility of temporal context in modulating plasticity: by taking advantage of the temporal continuity in image sequences, the model learns a representation that categorizes people's faces according to identity, independent of viewpoint. In a second set of simulations, we add plasticity to the contextual stream and explore variations in the architecture. In this case, the model learns a two-tiered representation, starting with a coarse view-based clustering and proceeding to a finer clustering of more specific stimulus features. The model thus provides a tenable account of how people may perform 3D object recognition in a hierarchical, bottom-up fashion.
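The idea of a contextual stream modulating plasticity under a maximum likelihood objective can be illustrated with a small sketch. The code below is not the model from the paper: it assumes a generic soft competitive learner with Gaussian units, in which a temporal trace of recent unit activity serves as a context-derived prior (gate) on which unit claims the current input. All names and parameters (`train_step`, `trace`, `decay`, `gain`, `sigma`) are hypothetical and chosen only for illustration.

```python
import math

def softmax(values, gain=1.0):
    """Turn a list of scores into positive weights that sum to one."""
    top = max(values)
    exps = [math.exp(gain * (v - top)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def responsibilities(x, means, gate, sigma=1.0):
    """Posterior over units: gate[k] times the Gaussian likelihood of x, normalized."""
    log_p = [
        math.log(g) - sum((xi - mi) ** 2 for xi, mi in zip(x, m)) / (2 * sigma ** 2)
        for g, m in zip(gate, means)
    ]
    top = max(log_p)
    weights = [math.exp(lp - top) for lp in log_p]
    total = sum(weights)
    return [w / total for w in weights]

def train_step(x, means, trace, lr=0.1, decay=0.5, gain=4.0, sigma=1.0):
    """One update: the temporal trace gates the competition, and each unit's
    mean moves toward x in proportion to its gated responsibility."""
    gate = softmax(trace, gain)  # assumed form of the contextual prior
    resp = responsibilities(x, means, gate, sigma)
    new_means = [
        [mi + lr * r * (xi - mi) for xi, mi in zip(x, m)]
        for r, m in zip(resp, means)
    ]
    # Exponential moving average of activity carries context to the next frame.
    new_trace = [decay * t + (1 - decay) * r for t, r in zip(trace, resp)]
    return new_means, new_trace, resp
```

In this sketch, an input that is ambiguous on bottom-up evidence alone is assigned mostly to the unit that was active on preceding frames, so successive views in a sequence tend to be grouped under the same unit. That is the intuition behind using temporal continuity to learn viewpoint-independent categories, under the assumptions stated above.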
