Human interaction as a model for spoken dialogue system behaviour

This thesis is a step towards the long-term and high-reaching objective of building dialogue systems whose behaviour is similar to that of a human dialogue partner. The aim is not to build a machine with ...
