Paper information - A transformation-based learning approach to language identification for mixed-lingual text-to-speech synthesis

Recent progress in corpus-based concatenative text-to-speech synthesis has generated some interest in systems that are capable of synthesizing text from more than one language. In this paper we describe the language identification component of such a mixed-lingual text-to-speech system. Relying only on the input text, we employ two different methods, namely a transformation-based learning approach and a stochastic n-gram approach, and we describe the combination of both methods. While the transformation-based learning approach already produces average error rates of less than 2 percent and outperforms the n-gram classification scheme, the combination of both methods results in a further error reduction of up to 50 percent.

Claire Waast-Richard | Volker Fischer | J. C. Marcadet

[1] Simon Corston-Oliver. Combining Decision Trees And Transformation-Based Learning To Correct Transferred Linguistic Representations, 2003.

[2] Haiping Li, et al. Trainable Cantonese/English dual language speech synthesis system, 2003, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03).

[3] Ossama Emam, et al. Multilingual acoustic models for speech recognition and synthesis, 2004, IEEE International Conference on Acoustics, Speech, and Signal Processing.

[4] Beat Pfister, et al. From multilingual to polyglot speech synthesis, 1999, EUROSPEECH.

[5] Richard Sproat. Multilingual Text-to-Speech Synthesis, 1997.

[6] Jilei Tian, et al. Scalable neural network based language identification from written text, 2003, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03).

[7] Jilei Tian, et al. Scalable neural network-based language identification from written text, 2003.

[8] Michael Picheny, et al. The IBM expressive speech synthesis system, 2004, INTERSPEECH.

[9] Richard Sproat, et al. Multilingual Text-to-Speech Synthesis: The Bell Labs Approach, 1998, CL.

[10] Mitchell P. Marcus, et al. Text Chunking using Transformation-Based Learning, 1995, VLC@ACL.

[11] John M. Prager, et al. Linguini: language identification for multilingual documents, 1999, Proceedings of the 32nd Annual Hawaii International Conference on Systems Sciences (HICSS-32).

[12] Marc A. Zissman, et al. Comparison of Four Approaches to Automatic Language Identification of Telephone Speech, 2004.

[13] Mahesh Viswanathan, et al. Recent improvements to the IBM trainable speech synthesis system, 2003, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03).

[14] Eric Brill, et al. Transformation-Based Error-Driven Learning and Natural Language Processing: A Case Study in Part-of-Speech Tagging, 1995, CL.

[15] Jilei Tian, et al. On text-based language identification for multilingual speech recognition systems, 2002, INTERSPEECH.

[16] Eric Brill, et al. A Simple Rule-Based Part of Speech Tagger, 1992, HLT.