Paper information - Language identification of individual words with joint sequence models

Language identification of individual words with joint sequence models

Abstract: Within a multilingual automatic speech recognition (ASR) system, knowledge of the language of origin of unknown words can improve pronunciation modelling accuracy. This is of particular importance for ASR systems required to deal with code-switched speech or proper names of foreign origin. For words that occur in the language model, but do not occur in the pronunciation lexicon, text-based language identification (T-LID) of a single word in isolation may be required. This is a challenging task, especially for short words. We motivate the importance of accurate T-LID in speech processing systems and introduce a novel way of applying Joint Sequence Models (JSMs) to the T-LID task. We obtain competitive results on a real-world 4-language task: for our best JSM system, an F-measure of 97.2% is obtained, compared to an F-measure of 95.2% obtained with a state-of-the-art Support Vector Machine (SVM).

Index Terms: text-based language identification, joint sequence models, multilingual speech recognition

Marelie H. Davel | Oluwapelumi Giwa
