Development and Evaluation of Automatic Punctuation for French and English Speech-to-Text

Automatic punctuation of speech is important for making speech-to-text output more readable and for facilitating downstream language processing. This paper describes the development of an automatic punctuation system for French and English. The punctuation model is based on adaptive boosting and uses both textual and acoustic (prosodic) information. The system is evaluated on a challenging speech corpus under real-application conditions, using output from a state-of-the-art speech-to-text system together with automatic audio segmentation and speaker diarization. Unlike previous work, automatic punctuation is scored against two independent manual references. Results are compared across the two languages, and the performance of the automatic system is compared with inter-annotator agreement.
