Automatic summarization of voicemail messages using lexical and prosodic features

This article presents trainable methods for extracting principal content words from voicemail messages. The resulting short text summaries are suitable for mobile messaging applications. The system uses a set of classifiers to identify the summary words, with each word described by a vector of lexical and prosodic features. We use an ROC-based algorithm, Parcel, to select input features (and classifiers). We have performed a series of objective and subjective evaluations using unseen data from two different speech recognition systems, as well as human transcriptions of voicemail speech.
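
To make the ROC-based selection step more concrete, the following is a minimal sketch, not the Parcel algorithm itself. It assumes that candidate word classifiers (for example, ones trained on different feature subsets) are compared by their ROC operating points and that only those on the upper convex hull are kept. The function names `roc_points` and `roc_convex_hull` and the toy scores are hypothetical illustrations, not taken from the article.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Point]:
    """ROC operating points from thresholding at each distinct score.

    labels: 1 if the word belongs in the summary, 0 otherwise.
    Assumes both classes are present in `labels`.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points: List[Point] = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points: Sequence[Point]) -> List[Point]:
    """Upper convex hull of ROC points, running from (0, 0) to (1, 1)."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull: List[Point] = []
    for p in pts:
        # Drop the last hull point while it lies on or below the new edge.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


# Toy example: classifier scores for four words, two of them summary words.
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
toy_hull = roc_convex_hull(roc_points(toy_scores, toy_labels))
```

Points from several classifiers can be pooled into one list before calling `roc_convex_hull`. Under the assumption stated above, a classifier whose operating points never reach the hull would be a candidate for removal.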
