A Pitch Smoothing Method for Mandarin Tone Recognition

Mandarin Chinese is a tonal language with four lexical tones. Tone recognition plays an important role in automatic Chinese speech recognition because the same syllable carries quite distinct meanings under different tones. Each tone can be characterized by its pitch contour, but pitch contours are rarely ideal smooth curves, because the pitch points computed by a pitch detector normally include some erroneous points. These erroneous pitch points can cause misclassification in Mandarin four-tone recognition, so the pitch contour must be smoothed before tone recognition. Classic smoothing algorithms cannot deal with successive erroneous fundamental frequencies. The new smoothing method proposed in this paper handles erroneous pitch points appropriately. It first checks whether the current point is correct or erroneous, then determines the error type, and finally modifies the erroneous point according to that type; each error type has its own correction. To validate this smoothing method, four “one vs. all” Support Vector Machine classifiers are built for Mandarin tone recognition. The test results indicate that the smoothing method reduces the error rate of Mandarin four-tone recognition.
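The check, classify, and correct sequence described above can be illustrated with a minimal Python sketch. The abstract does not list the actual error types or correction rules, so everything specific here is an assumption made for illustration: the three error types (octave doubling, octave halving, isolated outlier), the 10% tolerance `tol`, the function names `classify_point` and `smooth_pitch`, and the premise that the first frame is correct and all frames are voiced.

```python
def classify_point(prev_f0, f0, tol=0.1):
    """Label f0 relative to the previous (trusted) pitch value.

    Assumed error types, not taken from the paper:
    'doubling' (f0 is about 2x), 'halving' (about 0.5x), 'outlier' (anything else).
    Both values must be positive (voiced frames only).
    """
    ratio = f0 / prev_f0
    if abs(ratio - 1.0) <= tol:
        return "ok"
    if abs(ratio / 2.0 - 1.0) <= tol:
        return "doubling"
    if abs(ratio / 0.5 - 1.0) <= tol:
        return "halving"
    return "outlier"


def smooth_pitch(contour, tol=0.1):
    """Return a smoothed copy of a voiced pitch contour (Hz).

    Each point is compared with the already corrected previous point,
    so runs of successive errors are corrected one after another.
    The first point is assumed to be correct.
    """
    smoothed = list(contour)
    for i in range(1, len(smoothed)):
        prev = smoothed[i - 1]
        kind = classify_point(prev, smoothed[i], tol)
        if kind == "doubling":
            smoothed[i] = smoothed[i] / 2.0
        elif kind == "halving":
            smoothed[i] = smoothed[i] * 2.0
        elif kind == "outlier":
            # Interpolate when the next raw point agrees with the previous
            # corrected one; otherwise hold the previous value.
            nxt = contour[i + 1] if i + 1 < len(contour) else prev
            if classify_point(prev, nxt, tol) == "ok":
                smoothed[i] = (prev + nxt) / 2.0
            else:
                smoothed[i] = prev
    return smoothed
```

For example, `smooth_pitch([200.0, 202.0, 404.0, 408.0, 205.0, 100.0, 207.0, 300.0, 210.0])` returns `[200.0, 202.0, 202.0, 204.0, 205.0, 200.0, 207.0, 208.5, 210.0]`: the two consecutive doubled points are halved, the halved point is doubled, and the isolated outlier is replaced by the mean of its neighbours. Comparing against the corrected previous point, rather than the raw one, is the design choice that lets this sketch cope with successive errors.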
