Learning to Model Domain-Specific Utterance Sequences for Extractive Summarization of Contact Center Dialogues

This paper proposes a novel extractive summarization method for contact center dialogues. We use a particular type of hidden Markov model (HMM) called the Class Speaker HMM (CSHMM), which processes operator/caller utterance sequences from multiple domains simultaneously, modeling both domain-specific utterance sequences and common (domain-wide) sequences. We applied the CSHMM to call summarization of transcripts in six different contact center domains and found that our method significantly outperforms competitive baselines based on the maximum coverage of important words using integer linear programming.
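To make the baseline family concrete, the following is a minimal sketch of maximum-coverage extraction: choose the set of utterances that covers the largest total weight of distinct important words within a length budget. It is an illustration under stated assumptions, not the paper's implementation. The baselines in the paper solve this objective with integer linear programming, whereas this sketch swaps in exhaustive search, which is only practical for a handful of utterances. The function name `max_coverage_summary`, the whitespace tokenization, the word weights, and the example dialogue are all hypothetical.

```python
from itertools import combinations


def max_coverage_summary(utterances, word_weights, max_words):
    """Return (selected utterances, score) maximizing covered word weight.

    Assumption: brute-force search stands in for the ILP solver used by
    the baselines in the paper; the objective is the same in spirit
    (each distinct important word counts once), but this is a sketch.
    """
    tokenized = [u.lower().split() for u in utterances]
    best_score, best_subset = 0.0, ()
    for size in range(1, len(utterances) + 1):
        for subset in combinations(range(len(utterances)), size):
            # Enforce the summary length budget, measured in tokens.
            length = sum(len(tokenized[i]) for i in subset)
            if length > max_words:
                continue
            # Each distinct word contributes its weight only once.
            covered = set().union(*(tokenized[i] for i in subset))
            score = sum(word_weights.get(w, 0.0) for w in covered)
            if score > best_score:
                best_score, best_subset = score, subset
    return [utterances[i] for i in best_subset], best_score


# Hypothetical four-utterance call and hand-picked word weights.
dialogue = [
    "thank you for calling",
    "my router keeps losing the connection",
    "please restart the router",
    "have a nice day",
]
weights = {"router": 2.0, "connection": 1.5, "restart": 1.5}
summary, score = max_coverage_summary(dialogue, weights, max_words=10)
print(summary, score)
```

Because a word's weight is counted once however often it appears, the objective favors utterances that add new important words over utterances that repeat ones already covered. The objective looks only at which words are covered, not at the order or speaker of the utterances, which is the sequence information the CSHMM is designed to model.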
