Conversational Robots: Building Blocks for Grounding Word Meaning

How can we build robots that engage in fluid spoken conversation with people, moving beyond canned responses to words and toward genuine understanding? As a step toward answering this question, we introduce a robotic architecture that provides a basis for grounding word meanings. The architecture supplies perceptual, procedural, and affordance representations for grounding words, and a perceptually coupled on-line simulator enables sensory-motor representations that can shift points of view. Taken together, these components form a rich set of data structures and procedures that lay the foundations for grounding the meaning of certain classes of words.
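
To make the three kinds of grounded representations named above more concrete, here is a minimal sketch, in Python, of how perceptual, procedural, and affordance models might attach to a lexical entry. This is an illustration under assumed names (PerceptualModel, ProceduralModel, AffordanceModel, LexicalEntry), not the paper's actual implementation.

```python
# Hypothetical sketch of grounded word representations; all class and field
# names are illustrative assumptions, not taken from the paper.
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class PerceptualModel:
    """Grounds a word in sensory features (e.g., a shape or color classifier)."""
    feature_name: str
    classify: Callable[[Dict[str, float]], float]  # sensor features -> match score


@dataclass
class ProceduralModel:
    """Grounds a word in a motor routine (e.g., the primitive sequence for 'touch')."""
    motor_primitives: List[str]


@dataclass
class AffordanceModel:
    """Grounds a word in expected action outcomes (e.g., 'cup' affords grasping)."""
    expected_outcomes: Dict[str, float]  # action name -> predicted success probability


@dataclass
class LexicalEntry:
    """Ties a word form to whichever grounded representations apply."""
    word: str
    perceptual: Optional[PerceptualModel] = None
    procedural: Optional[ProceduralModel] = None
    affordance: Optional[AffordanceModel] = None


# Toy entry for "cup" that a simulator-coupled planner could query.
cup = LexicalEntry(
    word="cup",
    perceptual=PerceptualModel(
        feature_name="shape",
        classify=lambda feats: feats.get("concavity", 0.0),
    ),
    affordance=AffordanceModel(expected_outcomes={"grasp": 0.9, "pour-into": 0.8}),
)

if __name__ == "__main__":
    score = cup.perceptual.classify({"concavity": 0.7})
    print(f"'{cup.word}' perceptual match: {score:.2f}")
```

In this kind of design, the on-line simulator would supply the feature dictionaries and outcome predictions that the models consume, allowing the same lexical entry to be evaluated from different points of view.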
