The Embodied Crossmodal Self Forms Language and Interaction: A Computational Cognitive Review

Human language is inherently embodied and grounded in sensorimotor representations of the self and the world around it. This suggests that the body schema and ideomotor action-effect associations play an important role in language understanding, language generation, and verbal and physical interaction with others. There are computational models that focus purely on non-verbal interaction between humans and robots, and there are computational models for dialog systems that focus only on verbal interaction. However, there is a lack of research that integrates these approaches. We hypothesize that computational models of the self are well suited to modeling joint verbal and physical interaction. Such models therefore have substantial potential to foster the psychological and cognitive understanding of language grounding and to improve human-robot interaction methods and applications. This review is a first step toward developing models of the self that integrate verbal and non-verbal communication. To this end, we first analyze the relevant findings and mechanisms for language grounding in the psychological and cognitive literature on ideomotor theory. Second, we identify the existing computational methods that implement physical decision-making and verbal interaction. Based on this analysis, we outline how current computational methods can be used to create advanced computational interaction models that integrate language grounding with body schemas and self-representations.
