Developmental Word Acquisition and Grammar Learning by Humanoid Robots Through a Self-Organizing Incremental Neural Network

We present a new approach to online incremental word acquisition and grammar learning by humanoid robots. Using no data set provided in advance, the proposed system grounds language in a physical context, as mediated by its perceptual capacities. Learning is carried out through show-and-tell procedures in interaction with a human partner, and the procedure is open-ended with respect to new words and multiword utterances. These facilities are supported by a self-organizing incremental neural network, which performs online unsupervised classification and topology learning. Endowed with mental imagery, the system also learns, through both top-down and bottom-up processes, the syntactic structures contained in utterances, and thereby performs simple grammar learning. Under this multimodal scheme, the robot can describe a given physical context (both static and dynamic) online through natural language expressions. It can also perform actions through verbal interaction with its human partner.
