Using prior knowledge to assess relevance in speech summarization

We explore the use of automatically acquired, topic-based prior knowledge in speech summarization and assess its influence across several term weighting schemes. All information is combined using latent semantic analysis, which serves as the core procedure for computing the relevance of the sentence-like units of the input source. Evaluation uses the self-information measure, which aims to capture how informative a summary is relative to the summarized input source. We also analyze the similarity of the summaries produced by the different approaches.
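As a rough illustration of the two ingredients named above, the following sketch scores sentence-like units with latent semantic analysis and measures the self-information of a candidate summary. It is a minimal sketch under stated assumptions, not the method evaluated here: the raw term-frequency weighting, the scoring rule (length of each unit's vector in the top-k latent space), the unigram probability estimate, and the function names `lsa_sentence_scores` and `self_information` are all illustrative choices. The topic-based prior knowledge step is not modeled.

```python
import math
from collections import Counter

import numpy as np


def lsa_sentence_scores(sentences, k=2):
    """Score each sentence by the length of its vector in the top-k latent space.

    Assumption: raw term frequency is used as the term weighting scheme.
    """
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}

    # Term-by-sentence matrix: rows are terms, columns are sentence-like units.
    A = np.zeros((len(vocab), len(sentences)))
    for j, s in enumerate(sentences):
        for w in s.lower().split():
            A[index[w], j] += 1.0

    # Singular value decomposition: A = U * diag(sigma) * Vt.
    _, sigma, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(sigma))

    # Scale each sentence's latent coordinates by the singular values,
    # then take the Euclidean norm over the k retained dimensions.
    return np.sqrt(((sigma[:k, None] * Vt[:k, :]) ** 2).sum(axis=0))


def self_information(summary, source):
    """Sum of -log2 p(w) over summary words, with p(w) estimated from the source.

    Assumption: unigram relative frequencies; words unseen in the source are skipped.
    """
    counts = Counter(w for s in source for w in s.lower().split())
    total = sum(counts.values())
    return sum(
        -math.log2(counts[w] / total)
        for s in summary
        for w in s.lower().split()
        if w in counts
    )


source = [
    "the government announced a new budget",
    "the budget was approved by parliament",
    "rain is expected tomorrow",
]
scores = lsa_sentence_scores(source, k=2)
best = source[int(np.argmax(scores))]
info = self_information([best], source)
```

Under these assumptions, a summary would be built by taking the highest-scoring units until a length budget is reached, and summaries from different weighting schemes could then be compared by their self-information.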
