Foundations for a theory of mind for a humanoid robot

Human social dynamics rely upon the ability to correctly attribute beliefs, goals, and percepts to other people. The set of abilities that allow an individual to infer these hidden mental states based on observed actions and behavior has been called a “theory of mind” (Premack & Woodruff, 1978). Existing models of theory of mind have sought to identify a developmental progression of social skills that serve as the basis for more complex cognitive abilities. These skills include detecting eye contact, identifying self-propelled stimuli, and attributing intent to moving objects. If we are to build machines that interact naturally with people, our machines must both interpret the behavior of others according to these social rules and display the social cues that will allow people to naturally interpret the machine's behavior. Drawing from the models of Baron-Cohen (1995) and Leslie (1994), a novel architecture called embodied theory of mind was developed to link high-level cognitive skills to the low-level perceptual abilities of a humanoid robot. The implemented system determines visual saliency based on inherent object attributes, high-level task constraints, and the attentional states of others. Objects of interest are tracked in real time to produce motion trajectories, which are analyzed by a set of naive physical laws designed to discriminate animate from inanimate movement. Animate objects can be the source of attentional states (detected by finding faces and head orientation) as well as intentional states (determined by motion trajectories between objects). Individual components are evaluated by comparisons to human performance on similar tasks, and the complete system is evaluated in the context of a basic social learning mechanism that allows the robot to mimic observed movements. (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)
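
To make the trajectory analysis described in the abstract more concrete, the following is a minimal sketch of one possible naive physical law: an object that changes its velocity sharply with no visible external cause is treated as self-propelled, and hence animate. This is an illustration only, not the rule set used in the work itself. The function names (`max_acceleration`, `looks_animate`), the 2D point representation, and the `threshold` value are all assumptions introduced here.

```python
from math import hypot


def max_acceleration(points, dt=1.0):
    """Return the largest acceleration magnitude along a 2D trajectory.

    `points` is a list of (x, y) positions sampled every `dt` seconds.
    Velocities and accelerations are estimated by finite differences.
    """
    velocities = [
        ((x1 - x0) / dt, (y1 - y0) / dt)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]
    accelerations = [
        hypot((vx1 - vx0) / dt, (vy1 - vy0) / dt)
        for (vx0, vy0), (vx1, vy1) in zip(velocities, velocities[1:])
    ]
    # Trajectories with fewer than three points yield no acceleration estimate.
    return max(accelerations, default=0.0)


def looks_animate(points, dt=1.0, threshold=0.5):
    """Hypothetical inertia rule (an assumption, not the original system).

    An inanimate object left alone should keep a roughly constant velocity,
    so any acceleration above `threshold` is read as a sign of self-propulsion.
    """
    return max_acceleration(points, dt) > threshold


if __name__ == "__main__":
    straight = [(0, 0), (1, 0), (2, 0), (3, 0)]          # constant velocity
    turning = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]   # abrupt 90-degree turn
    print(looks_animate(straight), looks_animate(turning))
```

A single rule of this kind would misclassify an object that is pushed or that bounces off a wall; the abstract refers to a set of laws, so a fuller version would presumably combine several such tests and account for contact with other objects.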

[1]  J. A. Fodor A theory of the child's theory of mind , 1992, Cognition.

[2]  D. Sperber Are folk taxonomies “memes”? , 1998, Behavioral and Brain Sciences.

[3]  Joseph A. Driscoll,et al.  A visual attention network for a humanoid robot , 1998, Proceedings. 1998 IEEE/RSJ International Conference on Intelligent Robots and Systems. Innovations in Theory, Practice and Applications (Cat. No.98CH36190).

[4]  H. Nothdurft The role of features in preattentive vision: Comparison of orientation, motion and color cues , 1993, Vision Research.

[5]  Jerald D. Kralik,et al.  Self-recognition in primates: phylogeny and the salience of species-typical features. , 1995, Proceedings of the National Academy of Sciences of the United States of America.

[6]  A. Leslie,et al.  Exploration of the autistic child's theory of mind: knowledge, belief, and communication. , 1989, Child development.

[7]  A. Michotte The perception of causality , 1963 .

[8]  Kerstin Dautenhahn,et al.  Getting to know each other - Artificial social intelligence for autonomous robots , 1995, Robotics Auton. Syst..

[9]  A. Whiten,et al.  On the Nature and Evolution of Imitation in the Animal Kingdom: Reappraisal of a Century of Research , 1992 .

[10]  R. A. Brooks,et al.  Intelligence without Representation , 1991, Artif. Intell..

[11]  C. Breazeal Sociable Machines: Expressive Social Exchange Between Humans and Robots , 2000 .

[12]  Takeo Kanade,et al.  Human Face Detection in Visual Scenes , 1995, NIPS.

[13]  M. Wertheimer Psychomotor Coordination of Auditory and Visual Space at Birth , 1961, Science.

[14]  G. Burghardt,et al.  Predator simulation and duration of death feigning in neonate hognose snakes , 1988, Animal Behaviour.

[15]  Luc Steels,et al.  Emergent adaptive lexicons , 1996 .

[16]  Matthew M. Williamson,et al.  Robot arm control exploiting natural dynamics , 1999 .

[17]  Ronald A. Rensink,et al.  TO SEE OR NOT TO SEE: The Need for Attention to Perceive Changes in Scenes , 1997 .

[18]  Yasuo Kuniyoshi,et al.  Velocity and disparity cues for robust real-time binocular tracking , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[19]  A. Leslie The Perception of Causality in Infants , 1982, Perception.

[20]  T. Sejnowski,et al.  A critique of pure vision , 1993 .

[21]  L. Cohen,et al.  Precursors to infants' perception of the causality of a simple event , 1998 .

[22]  Rodney A. Brooks,et al.  A Robust Layered Control System For A Mobile Robot , 2022 .

[23]  Stefan Schaal,et al.  Robot Learning From Demonstration , 1997, ICML.

[24]  S. Cannon,et al.  The mechanical behavior of active human skeletal muscle in small oscillations. , 1982, Journal of biomechanics.

[25]  R. Byrne Imitation without intentionality. Using string parsing to copy the organization of behaviour , 1999, Animal Cognition.

[26]  A. Meltzoff Understanding the Intentions of Others: Re-Enactment of Intended Acts by 18-Month-Old Children. , 1995, Developmental psychology.

[27]  S. Brison The Intentional Stance , 1989 .

[28]  K. Dautenhahn,et al.  Imitation in Animals and Artifacts , 2002 .

[29]  Colin Potts,et al.  Design of Everyday Things , 1988 .

[30]  Y. Bar-Shalom Tracking and data association , 1988 .

[32]  B. Webb,et al.  Can robots make good models of biological behaviour? , 2001, Behavioral and Brain Sciences.

[33]  Giulio Sandini,et al.  Oculo-motor stabilization reflexes: integration of inertial and visual information , 1998, Neural Networks.

[34]  R. Hobson Autism and the Development of Mind , 1995 .

[35]  Joseph E LeDoux,et al.  The Integrated Mind , 1978, Springer US.

[36]  Anne Treisman,et al.  Preattentive processing in vision , 1985, Computer Vision Graphics and Image Processing.

[37]  K. Dautenhahn Ants don't have Friends- Thoughts on Socially Intelligent Agents , 1997 .

[38]  B. Troost,et al.  The ocular motor system , 1981, Annals of neurology.

[39]  F. Zajac Muscle and tendon: properties, models, scaling, and application to biomechanics and motor control. , 1989, Critical reviews in biomedical engineering.

[40]  Brian Scassellati,et al.  Alternative Essences of Intelligence , 1998, AAAI/IAAI.

[41]  U.-M. O'Reilly,et al.  Personality through faces for humanoid robots , 2000, Proceedings 9th IEEE International Workshop on Robot and Human Interactive Communication. IEEE RO-MAN 2000 (Cat. No.00TH8499).

[42]  Herman Schwendinger,et al.  The First Edition , 1999 .

[43]  D. Premack,et al.  Intentional communication in the chimpanzee: The development of deception , 1979, Cognition.

[44]  Roberto Cipolla,et al.  Determining the gaze of faces in images , 1994, Image Vis. Comput..

[45]  Clifford Nass,et al.  The media equation - how people treat computers, television, and new media like real people and places , 1996 .

[46]  Maja J. Matarić,et al.  Behavior-based primitives for articulated control , 1998 .

[47]  E. Feigenbaum,et al.  Computers and Thought , 1963 .

[48]  Tomaso A. Poggio,et al.  Example-Based Learning for View-Based Human Face Detection , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[49]  G P Bingham,et al.  Dynamics and the orientation of kinematic forms in visual event recognition. , 1995, Journal of experimental psychology. Human perception and performance.

[50]  Pawan Sinha,et al.  Perceiving and recognizing three-dimensional forms , 1996 .

[51]  M. Minsky The Society of Mind , 1986 .

[52]  E I Knudsen,et al.  Vision guides the adjustment of auditory localization in young barn owls. , 1985, Science.

[53]  Jo Liska Cognitive ethology: The minds of other animals , 1997 .

[54]  R. Seyfarth,et al.  How Monkeys See the World , 1990 .

[55]  A. Leslie Spatiotemporal Continuity and the Perception of Causality in Infants , 1984, Perception.

[56]  Stefan Schaal,et al.  Is imitation learning the route to humanoid robots? , 1999, Trends in Cognitive Sciences.

[57]  A. Mack Inattentional Blindness , 2003 .

[58]  B. Scassellati Imitation and mechanisms of joint attention: a developmental structure for building social skills on a humanoid robot , 1999 .

[59]  Ellin Kofsky Scholnick,et al.  Conceptual Development : Piaget's Legacy , 1999 .

[60]  M. Turk,et al.  Eigenfaces for Recognition , 1991, Journal of Cognitive Neuroscience.

[61]  G. Lakoff,et al.  Women, Fire, and Dangerous Things: What Categories Reveal about the Mind , 1988 .

[62]  A. Premack,et al.  Causal cognition : a multidisciplinary debate , 1996 .

[63]  Brian Scassellati,et al.  Humanoid Robots: A New Kind of Tool , 2000, IEEE Intell. Syst..

[64]  A. Leslie,et al.  Do six-month-old infants perceive causality? , 1987, Cognition.

[65]  A. M. Turing,et al.  Computing Machinery and Intelligence , 1950, The Philosophy of Artificial Intelligence.

[66]  Robert E Irie,et al.  Robust Sound Localization: An Application of an Auditory Perception System for a Humanoid Robot , 1995 .

[67]  C. Frith,et al.  Interacting minds--a biological basis. , 1999, Science.

[68]  Yehezkel Yeshurun,et al.  Cepstral Filtering on a Columnar Image Architecture: A Fast Algorithm for Binocular Stereo Segmentation , 2011, IEEE Trans. Pattern Anal. Mach. Intell..

[69]  Brian Scassellati,et al.  Infant-like Social Interactions between a Robot and a Human Caregiver , 2000, Adapt. Behav..

[70]  Rodney A. Brooks,et al.  Building brains for bodies , 1995, Auton. Robots.

[71]  J. Wolfe,et al.  Guided Search 2.0 A revised model of visual search , 1994, Psychonomic bulletin & review.

[72]  P. Todd,et al.  How motion reveals intention: Categorizing social interactions , 1999 .

[73]  D. Povinelli,et al.  Young children's understanding of briefly versus extremely delayed images of the self: emergence of the autobiographical stance. , 1998, Developmental psychology.

[74]  C. Moore,et al.  Joint attention : its origins and role in development , 1995 .

[75]  D. Ballard,et al.  Memory Representations in Natural Tasks , 1995, Journal of Cognitive Neuroscience.

[76]  Rómer Rosales,et al.  Inferring body pose without tracking body parts , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[77]  Judith Felson Duchan,et al.  Assessing children's language in naturalistic contexts , 1983 .

[78]  J. Perner,et al.  Development of theory of mind and executive control , 1999, Trends in Cognitive Sciences.

[79]  J. Gomez Visual behaviour as a window for reading the mind of others in primates. , 1991 .

[80]  Shumeet Baluja,et al.  Non-Intrusive Gaze Tracking Using Artificial Neural Networks , 1993, NIPS.

[81]  A. Diamond Developmental Time Course in Human Infants and Infant Monkeys, and the Neural Bases of, Inhibitory Control in Reaching a , 1990, Annals of the New York Academy of Sciences.

[82]  M. Scaife,et al.  The response to eye-like shapes by birds II. The importance of staring, pairedness and shape , 1976, Animal Behaviour.

[83]  Matthew M. Williamson,et al.  Series elastic actuators , 1995, Proceedings 1995 IEEE/RSJ International Conference on Intelligent Robots and Systems. Human Robot Interaction and Cooperative Robots.

[84]  Christoph von der Malsburg,et al.  Tracking and learning graphs and pose on image sequences of faces , 1996, Proceedings of the Second International Conference on Automatic Face and Gesture Recognition.

[85]  Harold H. Chaput, Leslie B. Cohen  A Model of Infant Causal Perception and its Development , 2001 .

[86]  F. Volkmar,et al.  Handbook of Autism and Pervasive Developmental Disorders , 1987 .

[87]  Rodney A. Brooks,et al.  Intelligence Without Reason , 1991, IJCAI.

[88]  Anton Berns,et al.  Separating the wheat from the chaff , 1991, Current Biology.

[89]  S. Baron-Cohen,et al.  Does the autistic child have a “theory of mind” ? , 1985, Cognition.

[90]  Gillian M. Hayes,et al.  A Robot Controller Using Learning by Imitation , 1994 .

[91]  J. Ridley Stroop Studies of Interference in Serial Verbal Reactions , 2001 .

[92]  Brian Scassellati A Binocular, Foveated Active Vision System , 1998 .

[93]  Bryan Adams,et al.  Meso : a virtual musculature for humanoid motor control , 2000 .

[94]  Dominic W. Massaro,et al.  Synthesis of visible speech , 1990 .

[95]  J. Osofsky Handbook of infant development , 1979 .

[96]  D. Reiss,et al.  Mirror self-recognition in the bottlenose dolphin: A case of cognitive convergence , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[97]  Dare A. Baldwin,et al.  Infants' contribution to the achievement of joint reference. , 1991, Child development.

[98]  R. Byrne,et al.  Machiavellian intelligence : social expertise and the evolution of intellect in monkeys, apes, and humans , 1990 .

[99]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems , 1988 .

[100]  Ingemar J. Cox,et al.  An Efficient Implementation of Reid's Multiple Hypothesis Tracking Algorithm and Its Evaluation for the Purpose of Visual Tracking , 1996, IEEE Trans. Pattern Anal. Mach. Intell..

[101]  P. Mundy,et al.  The theoretical implications of joint-attention deficits in autism , 1989, Development and Psychopathology.

[102]  W. Ashby Design for a Brain , 1954 .

[103]  Matthew J. Marjanović Teach a Robot to Fish... A Thesis Proposal , 2000 .

[104]  Brian Scassellati,et al.  A Context-Dependent Attention System for a Social Robot , 1999, IJCAI.

[105]  D. Povinelli,et al.  Theory of mind: evolutionary history of a cognitive specialization , 1995, Trends in Neurosciences.

[106]  R. Brooks,et al.  The cog project: building a humanoid robot , 1999 .

[107]  J. T. Murphy,et al.  Measurements of human forearm posture viscoelasticity , 1986 .

[108]  Carlos Hitoshi Morimoto,et al.  Pupil detection and tracking using multiple light sources , 2000, Image Vis. Comput..

[109]  J. Fagan Infants' recognition of invariant features of faces , 1976 .

[110]  S. G. Lisberger,et al.  Motor learning in a recurrent network model based on the vestibulo–ocular reflex , 1992, Nature.

[111]  D. Premack The infant's theory of self-propelled objects , 1990, Cognition.

[112]  D. Parmelee,et al.  Handbook of Autism and Pervasive Developmental Disorders, Second Edition , 1998 .

[113]  J. Bruner,et al.  The capacity for joint visual attention in the infant , 1975, Nature.

[114]  Rochel Gelman,et al.  WHAT PRESCHOOLERS KNOW ABOUT ANIMATE AND INANIMATE OBJECTS , 1983 .

[115]  Masayuki Inaba,et al.  Learning by watching: extracting reusable task knowledge from visual observation of human performance , 1994, IEEE Trans. Robotics Autom..

[116]  H. Wimmer,et al.  Beliefs about beliefs: Representation and constraining function of wrong beliefs in young children's understanding of deception , 1983, Cognition.

[117]  Janette Atkinson,et al.  Brain development and cognition: a reader edited by Mark H. Johnson, Blackwell Publishers, 1993. £19.99 (xi + 734 pages) ISBN 0 631 18223 3 , 1993, Trends in Neurosciences.

[118]  Aaron Ladd Edsinger A gestural language for a humanoid robot , 2001 .

[119]  E. Bizzi,et al.  Neural, mechanical, and geometric factors subserving arm posture in humans , 1985, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[120]  E. Sue Savage-Rumbaugh,et al.  Deception and social manipulation in symbol-using apes. , 1988 .

[121]  Marvin Minsky,et al.  Proposal to ARPA for Research on Artificial Intelligence at MIT, 1970-1971 , 1970 .

[122]  Richard W. Byrne,et al.  Computation and mindreading in primate tactical deception. , 1991 .

[123]  S. Thayer Children's Detection of On-Face and Off-Face Gazes. , 1977 .

[124]  Keiichiro Hoashi,et al.  Humanoid Robots in Waseda University—Hadaly-2 and WABIAN , 2002, Auton. Robots.

[125]  B. Julesz,et al.  Human factors and behavioral science: Textons, the fundamental elements in preattentive vision and perception of textures , 1983, The Bell System Technical Journal.

[126]  Brian Scassellati,et al.  IEEE Intelligent Systems , 2018, Computer.

[127]  Paul A. Viola,et al.  Robust Real-time Object Detection , 2001 .

[128]  C. Breazeal,et al.  An Ontogenetic Perspective to Scaling Sensorimotor Intelligence , 1996 .

[129]  J. K. Aggarwal,et al.  3D structure reconstruction from an ego motion sequence using statistical estimation and detection theory , 1991, Proceedings of the IEEE Workshop on Visual Motion.

[130]  G. G. Gallup Chimpanzees: self-recognition. , 1970, Science.

[131]  L. Weiskrantz Blindsight : a case study and implications , 1986 .

[132]  Sigmund Freud,et al.  The Ego and the Id , 1923 .

[133]  Giulio Sandini,et al.  A Foveated Retina-Like Sensor Using CCD Technology , 1989, Analog VLSI Implementation of Neural Systems.

[134]  Michael I. Jordan,et al.  Forward Models: Supervised Learning with a Distal Teacher , 1992, Cogn. Sci..

[135]  T. Odlin Women, Fire, and Dangerous Things: What Categories Reveal about the Mind , 1988 .

[136]  R. Byrne,et al.  Machiavellian intelligence II : extensions and evaluations , 1997 .

[137]  Rochel Gelman,et al.  First Principles Organize Attention to and Learning About Relevant Data: Number and the Animate-Inanimate Distinction as Examples , 1990, Cogn. Sci..

[138]  Ben Kröse,et al.  Features and spatial filters , 1988, Nature.

[139]  T. Nummenmaa The language of the face , 1964 .

[140]  Joel L. Davis,et al.  Large-Scale Neuronal Theories of the Brain , 1994 .

[141]  A. Karmiloff-Smith,et al.  Is There a Social Module? Language, Face Processing, and Theory of Mind in Individuals with Williams Syndrome , 1995, Journal of Cognitive Neuroscience.

[142]  Sorin Moga,et al.  From Perception-Action Loops to Imitation Processes: A Bottom-Up Approach of Learning by Imitation , 1998, Appl. Artif. Intell..

[143]  Yasuo Kuniyoshi,et al.  A foveated wide angle lens for active vision , 1995, Proceedings of 1995 IEEE International Conference on Robotics and Automation.

[144]  C. D. Gelatt,et al.  Optimization by Simulated Annealing , 1983, Science.

[145]  Demetri Terzopoulos,et al.  Techniques for Realistic Facial Modeling and Animation , 1991 .

[146]  L. Herman Vocal, social, and self-imitation by bottlenosed dolphins , 2002 .

[147]  U. Frith Autism: explaining the enigma , 1989 .

[148]  Frank C. Keil,et al.  The growth of causal understandings of natural kinds. , 1995 .

[149]  Aude Billard,et al.  Grounding communication in autonomous robots: An experimental study , 1998, Robotics Auton. Syst..

[150]  D. Lewkowicz,et al.  A dynamic systems approach to the development of cognition and action. , 2007, Journal of cognitive neuroscience.

[151]  J. Bruner,et al.  The role of tutoring in problem solving. , 1976, Journal of child psychology and psychiatry, and allied disciplines.

[152]  Berthold K. P. Horn Robot vision , 1986, MIT electrical engineering and computer science series.

[153]  J. Murphy,et al.  Measurements of human forearm viscoelasticity. , 1986, Journal of biomechanics.

[154]  F. Heider,et al.  An experimental study of apparent behavior , 1944 .

[155]  Alexander Zelinsky,et al.  Robust Real-Time Face Tracking and Gesture Recognition , 1997, IJCAI.

[156]  John K. Tsotsos Behaviorist Intelligence and the Scaling Problem , 1995, Artif. Intell..

[157]  Patrice D. Tremoulet,et al.  Perceptual causality and animacy , 2000, Trends in Cognitive Sciences.

[158]  Eun-Jung Holden,et al.  A 3D Head Tracker for an Automatic Lipreading System , 2000 .

[159]  G. Butterworth The ontogeny and phylogeny of joint visual attention. , 1991 .

[160]  J. Eiser,et al.  Young children's understanding of smoking. , 1986, Addictive behaviors.

[161]  P. H. Greene,et al.  Why is it easy to control your arms ? , 1982, Journal of motor behavior.

[162]  Richard A. Griggs,et al.  The elusive thematic‐materials effect in Wason's selection task , 1982 .

[163]  A. Bernardino,et al.  Binocular Visual Tracking : Integration of Perception and Control , 1999 .

[164]  A. Leslie Mapping the mind: ToMM, ToBY, and Agency: Core architecture and domain specificity , 1994 .

[165]  Ken Nakayama,et al.  Serial and parallel processing of visual feature conjunctions , 1986, Nature.

[166]  Kerstin Dautenhahn,et al.  Of hummingbirds and helicopters: An algebraic framework for interdisciplinary studies of imitation a , 2000 .

[167]  A. Meltzoff,et al.  Imitation, Memory, and the Representation of Persons. , 1994, Infant behavior & development.

[168]  Susan Carey,et al.  On the origin of causal understanding. , 1995 .

[169]  M. Hauser Costs of deception: cheaters are punished in rhesus monkeys (Macaca mulatta). , 1992, Proceedings of the National Academy of Sciences of the United States of America.