Online learning for attention, recognition, and tracking by a single developmental framework

It is likely that human-level online learning for vision will require a brain-like developmental model. We present a general-purpose model, called the Self-Aware and Self-Effecting (SASE) model, characterized by internal sensation and action. Rooted in the biological principle of genomic equivalence, this model is a general-purpose, cell-centered, in-place learning scheme that handles different levels of development and operation, from the cell level all the way to the brain level. It is unknown how the brain self-organizes its internal wiring without a holistically aware central controller. How does the brain develop internal object representations? How do such representations enable tightly intertwined attention and recognition in the presence of complex backgrounds? Internally in SASE, local neural learning uses only the co-firing between pre-synaptic and post-synaptic activities. Such a two-way representation automatically boosts action-relevant components in the sensory inputs (e.g., foreground vs. background) by increasing the chance that only action-related feature detectors win the competition. It also enables the network to develop in a “skull-closed” fashion. We discuss SASE networks called Where-What Networks (WWN), which address the open problem of general-purpose online attention and recognition with complex backgrounds. In WWN, the desired invariance and specificity emerge at each of the "what" and "where" motor ends without an internal master map. WWN allows both type-based and location-based top-down attention, so that it can attend to and recognize individual objects in complex backgrounds (which may include other objects). We propose that WWN can deal with any real-world foreground objects and any complex backgrounds.
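The co-firing rule and the competition among feature detectors mentioned above can be illustrated with a generic Hebbian winner-take-all layer. This is only a minimal sketch under stated assumptions, not the SASE or WWN implementation: the class name `HebbianLayer`, the concatenation of sensory and motor vectors into one pre-synaptic input, the single-winner competition, and the age-based learning rate are all hypothetical choices made for illustration.

```python
import math
import random


def normalize(v):
    """Return v scaled to unit length (returned unchanged if all zeros)."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v] if norm > 0 else list(v)


class HebbianLayer:
    """Hypothetical layer: each neuron learns only from local co-firing."""

    def __init__(self, n_neurons, n_inputs, seed=0):
        rng = random.Random(seed)
        # Random non-negative initial weights, one unit-length vector per neuron.
        self.weights = [
            normalize([rng.random() for _ in range(n_inputs)])
            for _ in range(n_neurons)
        ]
        # Firing age of each neuron; sets its learning rate (an assumption).
        self.ages = [1] * n_neurons

    def respond(self, pre):
        """Response of every neuron: inner product with the normalized input."""
        pre = normalize(pre)
        return [sum(w * p for w, p in zip(ws, pre)) for ws in self.weights]

    def learn(self, sensory, motor):
        """One online step; returns the index of the winning neuron."""
        # Assumption: the pre-synaptic input joins the sensory (bottom-up)
        # and motor (top-down) vectors, so action shapes the features.
        pre = normalize(list(sensory) + list(motor))
        responses = self.respond(pre)
        winner = max(range(len(responses)), key=responses.__getitem__)
        # Winner-take-all: the winner's post-synaptic activity is 1, all
        # others are 0, so only the winner's weights change (co-firing).
        self.ages[winner] += 1
        rate = 1.0 / self.ages[winner]
        self.weights[winner] = normalize(
            [(1.0 - rate) * w + rate * p
             for w, p in zip(self.weights[winner], pre)]
        )
        return winner


if __name__ == "__main__":
    layer = HebbianLayer(n_neurons=4, n_inputs=6, seed=1)
    sensory, motor = [1.0, 0.0, 0.2, 0.1], [1.0, 0.0]
    for _ in range(5):
        print("winner:", layer.learn(sensory, motor))
```

Because the update uses only the winner's own input and output, no neuron needs global information about the rest of the layer, which is the property the abstract attributes to in-place learning.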
