Speech-to-Text Summarization Using Automatic Phrase Extraction from Recognized Text

This paper describes a system developed to summarize spoken news. The system generates text summaries from input audio using three independent components: an automatic speech recognizer, a syntactic analyzer, and a summarizer. Because the recognized text contains no sentence boundaries, summarization is more difficult; we therefore use the syntactic analyzer to identify continuous segments in the recognized text.
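
The three-stage flow described above can be illustrated with a minimal sketch. It is not the paper's implementation. The recognizer is passed in as a hypothetical callable, the `segment` function stands in for the syntactic analyzer by splitting at a few assumed boundary words, and the summarizer uses plain term-frequency scoring chosen only for illustration.

```python
from collections import Counter
from typing import Callable, List

# Assumption: a tiny set of boundary words stands in for a real syntactic
# analyzer, which would identify segments from parse structure instead.
BOUNDARY_WORDS = {"and", "but", "while"}


def segment(tokens: List[str]) -> List[List[str]]:
    """Split an unpunctuated token stream into continuous segments."""
    segments, current = [], []
    for tok in tokens:
        if tok in BOUNDARY_WORDS:
            if current:
                segments.append(current)
            current = []
            continue
        current.append(tok)
    if current:
        segments.append(current)
    return segments


def summarize(segments: List[List[str]], k: int = 1) -> List[str]:
    """Keep the k segments with the highest average term frequency."""
    freq = Counter(tok for seg in segments for tok in seg)
    ranked = sorted(
        range(len(segments)),
        key=lambda i: -sum(freq[t] for t in segments[i]) / len(segments[i]),
    )
    # Restore the original order so the summary reads chronologically.
    return [" ".join(segments[i]) for i in sorted(ranked[:k])]


def speech_to_summary(audio: bytes, recognize: Callable[[bytes], str], k: int = 1) -> List[str]:
    """Chain the stages: recognizer -> segmenter -> summarizer."""
    tokens = recognize(audio).lower().split()
    return summarize(segment(tokens), k)


if __name__ == "__main__":
    # Hypothetical recognizer output: lowercase words, no punctuation.
    fake_asr = lambda audio: "floods hit the north and floods closed roads but markets rose"
    print(speech_to_summary(b"", fake_asr, k=2))
```

Because each stage only consumes the previous stage's output, any one of them can be swapped out without touching the others, which mirrors the independence of the components noted in the abstract.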
