Toward multi-level statistical language modeling, with application to under-resourced languages (original French title: Vers une modélisation statistique multi-niveau du langage, application aux langues peu dotées)

This thesis addresses automatic speech recognition for under-resourced languages whose writing systems do not explicitly separate words. Because of this property, text in these languages must be automatically segmented into words before n-gram language modeling can be applied. While the scarcity of text data already degrades language model performance, the errors introduced by automatic segmentation can make the available data even less usable. To mitigate these problems, our research focuses mainly on language modeling, and in particular on the choice of the lexical and sub-lexical units used by recognition systems. We experiment with the use of multiple units both in the language models and in the outputs of the recognition systems. We validate these multiple-unit modeling approaches on recognition systems for a group of under-resourced languages: Khmer, Vietnamese, Thai, and Lao.
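
As a rough illustration of why segmentation comes before n-gram modeling, and of how segmentation errors affect the units a language model sees, the sketch below segments an unspaced string by greedy longest matching against a toy lexicon, then counts bigrams over the resulting word units and over character units. The lexicon, the Latin-script example string, and the function names `segment_longest_match` and `bigram_counts` are assumptions made for illustration; this is not the segmentation method or the unit inventory used in the thesis.

```python
from collections import Counter


def segment_longest_match(text, lexicon):
    """Greedy left-to-right longest-match segmentation (hypothetical helper).

    Assumes a non-empty lexicon. When no lexicon entry matches at the
    current position, a single character is emitted as its own unit.
    """
    max_len = max(len(word) for word in lexicon)
    units = []
    i = 0
    while i < len(text):
        end = i + 1  # fallback: one character
        # Try the longest candidate first and shrink until a lexicon hit.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in lexicon:
                end = j
                break
        units.append(text[i:end])
        i = end
    return units


def bigram_counts(units):
    """Count adjacent unit pairs, the raw statistics behind a bigram model."""
    return Counter(zip(units, units[1:]))


# Toy data (assumed): an unspaced string and a small lexicon.
lexicon = {"the", "them", "me", "men", "sing"}
text = "themensing"

word_units = segment_longest_match(text, lexicon)
char_units = list(text)  # a simple sub-lexical alternative

print(word_units)
print(bigram_counts(word_units))
print(bigram_counts(char_units))
```

On this input the greedy matcher commits to `them` and leaves `e` and `n` as stray single-character units, instead of recovering `the men sing`. The word-level bigram counts therefore describe sequences that never occur in correctly segmented text, while the character-level counts are unaffected by the segmentation choice. This is one intuition for considering several unit types side by side.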
