Taiwanese TV news-to-document index system

This paper describes a system that indexes Taiwanese TV news speech to Chinese text documents on the World Wide Web. The system is based on two main techniques: automatic speech recognition (ASR) and bilingual text alignment. For the former, we used a speech-to-text approach to recognize the anchors' utterances in the TV news as Taiwanese tonal syllable sequences. We then converted the Chinese text documents, obtained from the corresponding news website, into Taiwanese tonal syllables using a bilingual pronunciation lexicon. Finally, a dynamic programming algorithm performs syllable-level alignment to link each TV news story to its document. A speech corpus from about 100 speakers and text data containing 840k Chinese characters were used to train the acoustic and language models for ASR. A bilingual lexicon containing 70k entries serves as the resource for both the pronunciation model in ASR and the statistical translation model in bilingual text alignment. The document index system was evaluated on TV news containing 40 stories, and the average indexing accuracy was over 82%.
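
The linking step can be illustrated with a minimal sketch. It assumes that the alignment cost is a plain unit-cost edit distance between syllable sequences and that text is converted to syllables by greedy longest-match lookup in the lexicon. The abstract does not specify either choice, so both are assumptions rather than the authors' method. The function names and the toy lexicon entries are hypothetical and are not taken from the paper's 70k-entry lexicon.

```python
from typing import Dict, List, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Unit-cost edit distance between two syllable sequences (dynamic programming)."""
    prev = list(range(len(b) + 1))  # cost of turning an empty prefix of a into b[:j]
    for i, syl_a in enumerate(a, start=1):
        curr = [i]  # cost of deleting the first i syllables of a
        for j, syl_b in enumerate(b, start=1):
            substitute = prev[j - 1] + (syl_a != syl_b)
            delete = prev[j] + 1
            insert = curr[j - 1] + 1
            curr.append(min(substitute, delete, insert))
        prev = curr
    return prev[-1]


def to_syllables(text: str, lexicon: Dict[str, List[str]]) -> List[str]:
    """Convert Chinese text to syllables by greedy longest-match lexicon lookup (assumed)."""
    max_len = max(map(len, lexicon), default=1)
    syllables: List[str] = []
    i = 0
    while i < len(text):
        for n in range(min(max_len, len(text) - i), 0, -1):
            word = text[i:i + n]
            if word in lexicon:
                syllables.extend(lexicon[word])
                i += n
                break
        else:
            i += 1  # character not covered by the lexicon: skip it
    return syllables


def best_document(recognized: Sequence[str],
                  documents: Dict[str, str],
                  lexicon: Dict[str, List[str]]) -> str:
    """Return the id of the document whose syllables align best with the ASR output."""
    scores = {}
    for doc_id, text in documents.items():
        doc_syllables = to_syllables(text, lexicon)
        length = max(len(recognized), len(doc_syllables), 1)
        scores[doc_id] = edit_distance(recognized, doc_syllables) / length
    return min(scores, key=scores.get)


# Toy data: illustrative syllable strings only, not an authoritative romanization.
toy_lexicon = {
    "新聞": ["sin1", "bun5"],
    "天氣": ["thinn1", "khi3"],
    "台": ["tai5"],
}
toy_documents = {"d1": "天氣新聞", "d2": "台新聞台"}
# Simulated ASR output with one tone error in the last syllable.
toy_recognized = ["thinn1", "khi3", "sin1", "bun7"]
linked = best_document(toy_recognized, toy_documents, toy_lexicon)
```

Normalizing the distance by the longer sequence length keeps long documents from being penalized merely for their length. A real system would more likely use substitution costs weighted by acoustic confusability or translation probabilities than unit costs.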
