Towards Learning to Converse: Structuring Task-Oriented Human-Human Dialogs

Data-driven techniques have influenced many aspects of speech and language processing. Models derived from data are generally more robust than hand-crafted systems since they better reflect the distributions of the phenomena being modeled. With the availability of large spoken dialog corpora, dialog management can now reap the benefits of data-driven techniques. In this paper, we present our view of structuring human-human dialogs in order to learn models for human-machine dialogs. We present the problems of dialog segmentation and dialog act labeling, develop a model for predicting and labeling topic segments and dialog acts, and evaluate the model on customer-agent dialogs from a catalog service domain.
