Dynamic language modeling for European Portuguese

This paper reports on daily vocabulary and language model adaptation for a European Portuguese broadcast news transcription system. The proposed adaptation framework takes into account characteristics of European Portuguese, such as its high level of inflection and complex verbal system. A multi-pass speech recognition framework is proposed that uses contemporary written texts available daily on the Web. It uses morpho-syntactic knowledge (part-of-speech information) about an in-domain training corpus to select an optimal vocabulary each day. Using an information retrieval engine with the ASR hypotheses as query material, relevant documents are extracted from a large, dynamic dataset to generate a story-based language model. Applied to a daily, live closed-captioning system for TV broadcasts, the approach proved effective, yielding relative reductions of 69% in out-of-vocabulary (OOV) word rate and 12.0% in word error rate (WER) compared with a baseline system with the same vocabulary size.
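
The retrieval step and the two reported metrics can be illustrated with a minimal sketch. It assumes a plain TF-IDF cosine ranking as a stand-in for the information retrieval engine, which the abstract does not specify in detail; the function names (`retrieve`, `oov_rate`, `relative_reduction`) and the toy documents are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of tokenized documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF so that every term seen in the collection gets a positive weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(hypothesis, docs, top_k=2):
    """Rank documents against a first-pass ASR hypothesis used as the query.

    Assumption: TF-IDF cosine ranking replaces the actual retrieval engine.
    Returns the indices of the top_k best-matching documents.
    """
    vectors, idf = tfidf_vectors(docs)
    query = {t: tf * idf[t] for t, tf in Counter(hypothesis).items() if t in idf}
    ranked = sorted(range(len(docs)),
                    key=lambda i: cosine(query, vectors[i]),
                    reverse=True)
    return ranked[:top_k]


def oov_rate(tokens, vocabulary):
    """Fraction of running words that are not covered by the vocabulary."""
    if not tokens:
        return 0.0
    return sum(t not in vocabulary for t in tokens) / len(tokens)


def relative_reduction(baseline, adapted):
    """Relative reduction of an error-type metric with respect to a baseline."""
    return (baseline - adapted) / baseline


# Toy example (hypothetical data): three web documents and one ASR hypothesis.
docs = [
    ["governo", "aprova", "orçamento"],
    ["benfica", "vence", "jogo"],
    ["orçamento", "do", "estado", "aprovado"],
]
hypothesis = ["governo", "orçamento", "estado"]
selected = retrieve(hypothesis, docs, top_k=2)
```

In a full system, the text of the selected documents would be used to estimate the story-based language model; that estimation step is outside the scope of this sketch.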
