Chinese Intonation Classification Using Support Vector Machines

In a conventional speech recognition system, only plain text is presented as the final result, and all acoustic information in the speech is discarded. The aim of this paper is to add intonation information, which is believed to reflect the emotion and intention of the speaker, to the traditional output of a speech recognition engine. We propose a robust approach to classifying several kinds of intonation, e.g. declarative, interrogative, and exclamatory. Since how to describe intonation is still an open question, different kinds of features are investigated to choose the most effective ones for intonation classification. A Support Vector Machine (SVM) is used as the classifier to perform feature selection and combination. In our experiments, we focus on speech-recognition-based methods and use recognized results in place of transcribed text, in order to simulate intonation classification in real speech recognition. The speech material used in the experiments was carefully designed; it covers three intonations and comprises about 4700 sentences in total. Experimental results show that our system achieves an accuracy of 84.13% on the task of classifying three types of Chinese intonation.

Keywords: Chinese intonation; SVM; speech recognition; intonation classification
