Using Speech-Specific Characteristics for Automatic Speech Summarization

In this thesis we address the challenge of automatically summarizing spontaneous, multi-party spoken dialogues. The experimental hypothesis is that it is advantageous, when summarizing such meeting speech, to exploit a variety of speech-specific characteristics rather than simply treating the task as text summarization with a noisy transcript. We begin by investigating which term-weighting metrics are effective for summarization of meeting speech, including two novel metrics designed specifically for multi-party dialogues. We then provide an in-depth analysis of useful multi-modal features for summarization, including lexical, prosodic, speaker, and structural features. A particular type of speech-specific information we explore is the presence of meta comments in meeting speech, which can be exploited to make extractive summaries higher-level and more abstractive in quality. We conduct our experiments on the AMI and ICSI meeting corpora, illustrating how informative utterances can be realized in contrasting ways in differing domains of meeting speech. Our central summarization evaluation is a large-scale extrinsic task, a decision audit evaluation. In this evaluation, we explicitly compare the usefulness of extractive summaries to gold-standard abstracts and a baseline keyword condition for navigating through a large amount of meeting data in order to satisfy a complex information need.
