A Method for Isolated Thai Tone Recognition Using a Combination of Neural Networks

Tone information is essential to speech recognition in a tonal language such as Thai. In this article, we present a method for isolated Thai tone recognition. First, we define three sets of tone features that capture the characteristics of Thai tones, and we employ a feedforward neural network to classify tones based on these features. Next, we describe several experiments using the proposed features. The experiments are designed to study the effect of initial consonants, vowels, and final consonants on tone recognition. We find some correlation between tones and the other phonemes, and recognition performance is satisfactory. We then conduct a human perception test as a reference for judging the recognition rate; the recognition rate of human listeners is much lower than that of the machine. Finally, we explore various combination schemes to enhance the recognition rate, and find further improvements in most experiments.
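As a rough illustration of what combining several tone classifiers can look like, the sketch below applies three widely used fixed combination rules (sum, product, and majority vote) to the outputs of three classifiers, one per feature set. The abstract does not say which schemes the article uses, so the choice of rules, the function names, and the score values are all assumptions made for illustration only.

```python
from collections import Counter
from math import prod

# The five lexical tones of Thai; the ordering is an arbitrary choice here.
TONES = ["mid", "low", "falling", "high", "rising"]


def argmax(scores):
    """Index of the largest score (first one wins on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def sum_rule(posteriors):
    """Pick the tone with the largest summed score across classifiers."""
    totals = [sum(column) for column in zip(*posteriors)]
    return TONES[argmax(totals)]


def product_rule(posteriors):
    """Pick the tone with the largest product of scores across classifiers."""
    totals = [prod(column) for column in zip(*posteriors)]
    return TONES[argmax(totals)]


def majority_vote(posteriors):
    """Each classifier votes for its top tone; the most frequent vote wins."""
    votes = Counter(argmax(p) for p in posteriors)
    return TONES[votes.most_common(1)[0][0]]


# Hypothetical per-tone scores from three networks (made-up values,
# not results from the article). Each row is one classifier's output.
example = [
    [0.10, 0.15, 0.50, 0.20, 0.05],
    [0.05, 0.10, 0.30, 0.45, 0.10],
    [0.10, 0.10, 0.45, 0.25, 0.10],
]

if __name__ == "__main__":
    for rule in (sum_rule, product_rule, majority_vote):
        print(rule.__name__, "->", rule(example))
```

In this made-up example the second classifier alone would choose the high tone, while all three rules settle on the falling tone, which is the kind of correction a combination scheme is meant to provide.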
