From simple innate biases to complex visual concepts

Early in development, infants learn to solve visual problems that are highly challenging for current computational methods. We present a model that addresses two fundamental problems in which the gap between computational difficulty and infant learning is particularly striking: learning to recognize hands and learning to recognize gaze direction. The model is shown a stream of natural videos and learns, without any supervision, to detect human hands by appearance and by context, as well as direction of gaze, in complex natural scenes. The algorithm is guided by an empirically motivated innate mechanism: the detection of “mover” events in dynamic images, that is, events in which a moving image region causes a stationary region to move or change after contact. Mover events provide an internal teaching signal, which is shown to be more effective than alternative cues and sufficient for the efficient acquisition of hand and gaze representations. The implications go beyond the specific tasks: the results show how domain-specific “proto concepts” can guide the system to acquire meaningful concepts that are significant to the observer but statistically inconspicuous in the sensory input.
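The definition of a mover event given above (a moving region makes a stationary region move or change after contact) can be illustrated with a small sketch. This is not the authors' implementation. It assumes a hypothetical input format in which each frame is a coarse grid with one intensity value per cell, uses plain frame differencing as a stand-in for motion detection, and treats "contact" as motion arriving from an adjacent cell. The names `motion_maps`, `neighbor_moving`, `detect_mover_events`, and the threshold `thresh` are all illustrative.

```python
from typing import List, Tuple

Frame = List[List[int]]  # assumed format: one intensity value per grid cell


def motion_maps(frames: List[Frame], thresh: int) -> List[List[List[bool]]]:
    """motion[t][r][c] is True when cell (r, c) changed between frames t-1 and t."""
    rows, cols = len(frames[0]), len(frames[0][0])
    maps = [[[False] * cols for _ in range(rows)]]  # no motion is defined at t = 0
    for prev, cur in zip(frames, frames[1:]):
        maps.append([[abs(cur[r][c] - prev[r][c]) > thresh for c in range(cols)]
                     for r in range(rows)])
    return maps


def neighbor_moving(motion_t: List[List[bool]], r: int, c: int) -> bool:
    """True when any of the (up to 8) cells adjacent to (r, c) is moving."""
    rows, cols = len(motion_t), len(motion_t[0])
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols and motion_t[rr][cc]:
                return True
    return False


def detect_mover_events(frames: List[Frame], thresh: int = 10) -> List[Tuple[int, int, int]]:
    """Return (frame index, row, col) for cells that were stationary, were entered
    by motion from a neighbor, were left again, and look different afterwards."""
    motion = motion_maps(frames, thresh)
    rows, cols = len(frames[0]), len(frames[0][0])
    events = []
    for r in range(rows):
        for c in range(cols):
            t = 1
            while t < len(frames):
                entered = (motion[t][r][c] and not motion[t - 1][r][c]
                           and neighbor_moving(motion[t - 1], r, c))
                if not entered:
                    t += 1
                    continue
                before = frames[t - 1][r][c]  # appearance before contact
                t_out = t + 1
                while t_out < len(frames) and motion[t_out][r][c]:
                    t_out += 1
                if t_out < len(frames):
                    left = neighbor_moving(motion[t_out], r, c)  # motion moved on
                    altered = abs(frames[t_out][r][c] - before) > thresh
                    if left and altered:
                        events.append((t_out, r, c))
                t = t_out
    return events


# Toy 1x5 scene: a "hand" (100) approaches an "object" (50), covers it, and
# retreats with it, leaving the object's cell empty.
frames = [
    [[100, 0, 0, 50, 0]],
    [[0, 100, 0, 50, 0]],
    [[0, 0, 100, 50, 0]],
    [[0, 0, 0, 100, 0]],
    [[0, 0, 100, 0, 0]],
    [[0, 100, 0, 0, 0]],
    [[0, 100, 0, 0, 0]],
]
print(detect_mover_events(frames))  # [(5, 0, 3)]
```

In this toy scene only the cell that held the object is reported: the cells the hand merely passes through look the same afterwards, and the cell where the hand comes to rest is rejected because motion never leaves it. A model of the kind described could treat the moving region adjacent to such an event as a candidate hand example, though how the teaching signal is actually used is not specified here.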
