Multilinguality in speech and spoken language systems

Building modern speech and language systems currently requires large data resources such as texts, voice recordings, pronunciation lexicons, morphological decomposition information and parsing grammars. Based on a study of the most important differences between language groups, we introduce approaches for dealing efficiently with the enormous task of covering even a small percentage of the world's languages. For speech recognition, we have reduced the resource requirements by applying acoustic model combination, bootstrapping and adaptation techniques. Similar algorithms have been applied to improve the recognition of foreign accents. Segmenting language into appropriate units reduces the amount of data required to robustly estimate statistical models. The underlying morphological principles are also used to automatically adapt the coverage of our speech recognition dictionaries with the Hypothesis-Driven Lexical Adaptation (HDLA) algorithm, which reduces the out-of-vocabulary problems encountered in agglutinative languages. Speech recognition results are reported for the read-speech GlobalPhone database and some broadcast news data. For speech translation, using a task-oriented Interlingua makes it possible to build a system covering N languages with linear rather than quadratic effort. We have introduced a modular grammar design to maximize reusability and portability. End-to-end translation results are reported on a travel-domain task in the framework of C-STAR.
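
The linear-versus-quadratic claim for the Interlingua design can be illustrated with a small counting sketch. It assumes that a direct approach needs one translation system per ordered language pair, and that an Interlingua approach needs one analysis module and one generation module per language. The function names are hypothetical and chosen only for illustration; they are not taken from the system described here.

```python
def direct_pair_systems(n: int) -> int:
    """Translation systems needed if every ordered language pair is built directly."""
    return n * (n - 1)


def interlingua_modules(n: int) -> int:
    """Modules needed with an interlingua: one analyzer and one generator per language."""
    return 2 * n


if __name__ == "__main__":
    # Compare the growth of both counts as languages are added.
    for n in (2, 3, 5, 10):
        print(f"N={n}: direct={direct_pair_systems(n)}, interlingua={interlingua_modules(n)}")
```

Under these assumptions, adding one more language to an existing set of N costs two new modules with an Interlingua, compared with 2N new pairwise systems in the direct design.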
