Paper information - Towards Mediating Shared Perceptual Basis in Situated Dialogue

To enable effective referential grounding in situated human-robot dialogue, we conducted an empirical study of how conversation partners collaborate and mediate a shared basis when their visual perceptual capabilities are mismatched. In particular, we developed a graph-based representation that captures both the linguistic discourse and the visual discourse, and applied inexact graph matching to ground references. Our empirical results show that, even when computer vision algorithms produce many errors (e.g., 84.7% of the objects in the environment are mis-recognized), our approach can still achieve 66% accuracy in referential grounding. These results demonstrate that, because it is error-tolerant by nature, inexact graph matching offers a potential solution for mediating a shared perceptual basis for referential grounding in situated interaction.
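To make the idea of inexact graph matching more concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a language graph whose nodes are referring expressions with described attributes and whose edges are described spatial relations, and a vision graph whose nodes are perceived objects (possibly mis-recognized). All names (`node_score`, `ground`, `r1`, `o1`, `left_of`) and the scoring scheme are hypothetical choices made for illustration; the abstract does not specify the actual compatibility functions or search procedure.

```python
from itertools import permutations


def node_score(described, perceived):
    """Fraction of the described attributes that the perceived object agrees with."""
    if not described:
        return 1.0
    hits = sum(1 for key, value in described.items() if perceived.get(key) == value)
    return hits / len(described)


def ground(lang_nodes, lang_edges, vis_nodes, vis_edges):
    """Exhaustively search one-to-one mappings from language nodes to vision nodes.

    A partial attribute mismatch lowers the score instead of ruling a
    mapping out, which is what makes the matching inexact (assumed scoring).
    """
    refs = list(lang_nodes)
    best_map, best_score = None, float("-inf")
    for candidate in permutations(vis_nodes, len(refs)):
        mapping = dict(zip(refs, candidate))
        # Node compatibility: how well each perceived object fits its description.
        score = sum(node_score(lang_nodes[r], vis_nodes[mapping[r]]) for r in refs)
        # Edge compatibility: reward each described relation that vision preserves.
        for (a, b), relation in lang_edges.items():
            if vis_edges.get((mapping[a], mapping[b])) == relation:
                score += 1.0
        if score > best_score:
            best_map, best_score = mapping, score
    return best_map, best_score


# Hypothetical dialogue content: "the red apple to the left of the blue cup".
lang_nodes = {
    "r1": {"type": "apple", "color": "red"},
    "r2": {"type": "cup", "color": "blue"},
}
lang_edges = {("r1", "r2"): "left_of"}

# Hypothetical vision output: the apple (o1) is mis-recognized as a ball.
vis_nodes = {
    "o1": {"type": "ball", "color": "red"},
    "o2": {"type": "cup", "color": "blue"},
    "o3": {"type": "apple", "color": "green"},
}
vis_edges = {
    ("o1", "o2"): "left_of",
    ("o2", "o3"): "left_of",
    ("o1", "o3"): "left_of",
}

mapping, score = ground(lang_nodes, lang_edges, vis_nodes, vis_edges)
print(mapping, score)
```

In this toy scene the mis-recognized object `o1` is still chosen for `r1`, because its color and its spatial relation to the cup outweigh the wrong type label. Exhaustive search is only practical for very small graphs; a realistic system would need a more efficient search strategy.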
