Contribution tracking: participating in task-oriented dialogue under uncertainty

The contribution of this dissertation is to show how interlocutors in dialogue can reason probabilistically about natural language interpretation, dialogue state (context), and natural language generation in a way that is consistent with three fundamental claims made by mainstream theories of pragmatic reasoning in human-human dialogue: (1) interlocutors track and exploit the evolving context to coordinate their individual contributions; (2) the current context depends on what the previous utterances of both interlocutors have meant (contributed); (3) what a speaker can recognizably mean (contribute) by a specific choice of words depends on the current context. Mainstream pragmatic theories depend on these assumptions to explain how a speaker can make linguistic choices that the hearer will interpret as intended, but these theories do not lend themselves to straightforward probabilistic reasoning. Engineering approaches to building dialogue systems implement straightforward probabilistic reasoning, but sacrifice one or more (sometimes all) of these fundamental aspects of pragmatic theory in order to do so. This dissertation shows how we can achieve the robustness and data-driven methodology enjoyed by engineering approaches while keeping our interlocutors on a sound theoretical footing, and thereby points the way toward a new class of dialogue systems that are empirically driven, that are robust pragmatic reasoners, and that exhibit human-like sensitivity to the ins and outs of language use in context.
