Language acquisition through a human-robot interface by combining speech, visual, and behavioral information

This paper describes new language-processing methods suitable for human-robot interfaces. These methods enable a robot to learn linguistic knowledge from scratch in an unsupervised way. Learning proceeds by statistical optimization during human-robot communication, combining speech, visual, and behavioral information in a probabilistic framework. The linguistic knowledge acquired includes speech units such as phonemes, a lexicon, and grammar, and is represented by a graphical model that includes hidden Markov models. In experiments, the robot eventually became able to understand utterances according to the given situation and to act appropriately.
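The cross-situational idea behind this kind of probabilistic word learning can be illustrated with a minimal sketch: a learner observes utterances paired with visible objects and estimates word meanings from co-occurrence statistics. This is a hypothetical toy example with invented data, not the paper's actual model (which uses hidden Markov models over speech, visual, and behavioral channels).

```python
from collections import defaultdict

# Hypothetical learning episodes: each pairs recognized words with the
# set of objects visible in the scene at the time of the utterance.
episodes = [
    (["red", "box"], {"box"}),
    (["red", "ball"], {"ball"}),
    (["blue", "box"], {"box"}),
    (["ball"], {"ball"}),
]

# Accumulate word-object co-occurrence counts across situations.
cooc = defaultdict(lambda: defaultdict(int))
word_count = defaultdict(int)
for words, objects in episodes:
    for w in words:
        word_count[w] += 1
        for o in objects:
            cooc[w][o] += 1

def meaning(word):
    """Return the object maximizing the empirical P(object | word)."""
    objs = cooc[word]
    if not objs:
        return None
    return max(objs, key=lambda o: objs[o] / word_count[word])

print(meaning("box"))   # -> box
print(meaning("ball"))  # -> ball
```

Because "red" co-occurs with both objects while "box" and "ball" each co-occur with only one, ambiguity resolves as evidence accumulates across situations; the paper's framework generalizes this intuition to continuous speech and sensory features.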
