Paper Information - Speech and Spoken Document

Speech and Spoken Document

Progress in both speech and language processing has spurred efforts to support applications that rely on spoken, rather than written, language input. A key challenge in moving from text-based documents to such “spoken documents” is that spoken language lacks explicit punctuation and formatting, which can be crucial for good performance. This article describes different levels of speech segmentation, approaches to automatically recovering segment boundary locations, and experimental results demonstrating the impact on several language processing tasks. The results also show a need for optimizing segmentation for the end task rather than independently.
