Noun Phrase Generation for Situated Dialogs

We report on a study of noun phrase generation within a spoken dialog agent for a navigation domain. The task is to provide real-time instructions that direct the user between a series of destinations within a large interior space. One subtask of sentence planning is choosing the form of each noun phrase. This choice is driven by both the discourse history and spatial context features derived from the direction-follower's position, e.g., the follower's view angle, distance from the target referent, and the number of similar items in view. The algorithm was developed as a decision tree, and its output was evaluated by a group of human judges, who rated 62.6% of the expressions generated by the system to be as good as or better than the language originally produced by human dialog partners.
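To make the idea of a decision tree over discourse and spatial features concrete, the following is a minimal illustrative sketch in Python. It is not the study's actual tree: the feature names (`mentioned_recently`, `view_angle_deg`, `distance`, `similar_in_view`), the thresholds, and the set of output forms are all assumptions chosen for illustration only.

```python
from dataclasses import dataclass

# Hypothetical thresholds, not values taken from the study.
HALF_FIELD_OF_VIEW_DEG = 45.0   # referent counts as visible within this angle
NEAR_DISTANCE = 3.0             # at or below this distance, use the proximal "this"


@dataclass
class ReferentContext:
    """Assumed feature set: one discourse feature and three spatial ones."""
    mentioned_recently: bool   # discourse history: was the referent just mentioned?
    view_angle_deg: float      # angle between the follower's gaze and the referent
    distance: float            # distance from the follower to the referent
    similar_in_view: int       # number of same-type distractors currently visible


def choose_np_form(ctx: ReferentContext, noun: str) -> str:
    """Walk a small hand-written decision tree and return a noun phrase."""
    # Recently mentioned and no competing items: a pronoun is unambiguous.
    if ctx.mentioned_recently and ctx.similar_in_view == 0:
        return "it"

    visible = abs(ctx.view_angle_deg) <= HALF_FIELD_OF_VIEW_DEG
    if not visible:
        # Out of view: definite if already introduced, indefinite otherwise.
        return f"the {noun}" if ctx.mentioned_recently else f"a {noun}"

    if ctx.similar_in_view > 0:
        # Distractors are visible: use a demonstrative chosen by distance.
        determiner = "this" if ctx.distance <= NEAR_DISTANCE else "that"
        return f"{determiner} {noun}"

    # Visible and unique in view: a plain definite description suffices.
    return f"the {noun}"


if __name__ == "__main__":
    ctx = ReferentContext(mentioned_recently=False, view_angle_deg=10.0,
                          distance=8.0, similar_in_view=2)
    print(choose_np_form(ctx, "button"))
```

In this sketch each branch corresponds to one test in the tree, so the chosen form can always be traced back to the feature values that produced it.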
