Trading Spaces: How Humans and Humanoids Use Speech and Gesture to Give Directions

Humans intuitively accompany direction-giving with gestures. These gestures have been shown to have the same underlying conceptual structure as diagrams and direction-giving language, but the puzzle is how they communicate, given that their form is not codified and may in fact differ from one person or situation to the next. Based on results from a study of language and gesture in direction-giving, we describe a framework for analyzing gestural images into semantic units (image description features) and for linking these units to morphological features (hand shape, trajectory, etc.). This feature-based framework makes it possible to implement an integrated microplanner for multimodal directions that derives the form of both natural language and gesture directly from communicative goals. Using this microplanner, we developed an embodied conversational agent that can perform appropriate speech and novel gestures in direction-giving conversation with real humans.
