A constructivist approach to robot language learning via simulated babbling and holophrase extraction

It is thought that meaning may be grounded in early childhood language learning via the physical and social interaction of the infant with those around him or her, and that the capacity to use words, phrases and their meanings is acquired through shared referential ‘inference’ in pragmatic interactions. To create appropriate conditions for language learning by a humanoid robot, it would therefore be necessary to expose the robot to similar physical and social contexts. However, it is estimated that in the early stages of language learning a 2-year-old child can be exposed to as many as 7,000 utterances per day in varied contextual situations. In this paper we report on the issues behind, and the design of, our ongoing and forthcoming experiments, which aim to allow a robot to carry out language learning in a manner that is analogous to early child development and that effectively ‘short cuts’ holophrase learning. Two approaches are used: (1) simulated babbling through mechanisms that yield basic word or holophrase structures, and (2) a scenario for interaction between a human and the humanoid robot in which the robot can experience shared ‘intentional’ referencing and the associations between physical, visual and speech modalities. The output of these experiments, combined to yield word or holophrase structures grounded in the robot's own actions and modalities, would provide scaffolding for further proto-grammatical usage-based learning. Such learning requires interaction with the physical and social environment, including human feedback, to bootstrap developing linguistic competencies. These structures would then form the basis for further studies on language acquisition, including the emergence of negation and more complex grammar.
