Scalable neural network based language identification from written text

Automatic language identification is an integral part of multilingual automatic speech recognition (ASR) and synthesis systems. We propose a novel, scalable method for neural network based language identification from written text and deploy it in a multilingual ASR system. The method is aimed at embedded platforms with scarce memory resources. With a compact language identification model, it achieves high language identification and speech recognition rates across several languages. Its major benefit is that the neural network based language identification model can be scaled to meet the memory requirements set by the target platform while maintaining the language identification accuracy of the baseline system. Experiments show that the scalable approach saves more than 50% of the memory while performing comparably to the baseline system. The performance is also verified in a multilingual speech recognition task.
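The abstract does not say how the model is scaled, so the following is only an illustrative sketch of the memory trade-off, not the authors' method. It assumes a single-hidden-layer perceptron that classifies a one-hot encoded window of letters, and every name and parameter in it (`mlp_param_count`, `largest_hidden_layer`, `context`, `bytes_per_weight`) is hypothetical. Under that assumption, the hidden layer size is the knob that trades memory for capacity.

```python
def mlp_param_count(alphabet_size, context, hidden_units, num_languages):
    """Weights and biases of a one-hidden-layer MLP over a letter window.

    Assumption (not from the paper): the input is a one-hot encoding of the
    current letter plus `context` letters on each side.
    """
    input_dim = alphabet_size * (2 * context + 1)
    hidden_params = (input_dim + 1) * hidden_units      # +1 for the bias
    output_params = (hidden_units + 1) * num_languages  # +1 for the bias
    return hidden_params + output_params


def largest_hidden_layer(budget_bytes, alphabet_size, context,
                         num_languages, bytes_per_weight=1):
    """Largest hidden layer whose parameters fit in `budget_bytes`.

    Returns 0 if not even one hidden unit fits.
    """
    input_dim = alphabet_size * (2 * context + 1)
    # Each hidden unit costs its incoming weights, its bias, and one
    # outgoing weight per language.
    per_unit = (input_dim + 1 + num_languages) * bytes_per_weight
    # The output biases are paid regardless of the hidden layer size.
    fixed = num_languages * bytes_per_weight
    if budget_bytes < fixed + per_unit:
        return 0
    return (budget_bytes - fixed) // per_unit


if __name__ == "__main__":
    full = mlp_param_count(alphabet_size=30, context=4,
                           hidden_units=40, num_languages=4)
    half_budget = full // 2
    units = largest_hidden_layer(half_budget, 30, 4, 4)
    print(f"full model: {full} weights; "
          f"half budget fits {units} hidden units")
```

Because the input layer dominates the parameter count in this toy setup, halving the memory budget roughly halves the number of hidden units. Whether accuracy survives such a reduction is an empirical question that this sketch does not address.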
