Context Dependent Feature Based Bottom-up Rescoring SVM Classifier in Children's English Stress Mis-pronunciation Detection

Automatic assessment of word stress errors is an integral part of an oral language grading system. However, two problems remain unsolved: the properties of a vowel depend on its context, and the data for different vowel classes are sparse. This paper briefly introduces a hybrid method that combines traditional prosodic features with the proposed context-dependent strategies. In classification, word stress is determined by weighting a bottom-up group tree with a modified distributed probability score. In experiments, the proposed system achieves an overall equal error rate of 9.41%, a relative reduction that indicates its suitability for use in stress error detection systems.
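The equal error rate (EER) quoted above is the operating point at which the false acceptance rate (correct stress flagged as an error) equals the false rejection rate (a stress error that goes undetected). The following is a minimal sketch of how such a figure can be computed from classifier scores. It is not the evaluation code used in this work. The function name `equal_error_rate`, the score convention (a higher score means more likely a stress error) and the toy data are all assumptions made for illustration.

```python
from math import inf


def equal_error_rate(scores, labels):
    """Approximate the EER by sweeping a decision threshold.

    Assumed convention (hypothetical, not from the paper):
    labels: 1 = stress error, 0 = correct stress;
    a word is flagged as an error when its score >= threshold.
    Returns (eer, threshold) at the point where the two rates are closest.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")

    best = None
    # Candidate thresholds: every observed score, plus +inf (flag nothing).
    for t in sorted(set(scores)) + [inf]:
        far = sum(s >= t for s in neg) / len(neg)  # false acceptance rate
        frr = sum(s < t for s in pos) / len(pos)   # false rejection rate
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # With finite data the rates rarely match exactly,
            # so report their mean at the closest point.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


if __name__ == "__main__":
    # Toy scores, made up for illustration only.
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
    toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]
    print(equal_error_rate(toy_scores, toy_labels))
```

With a finite test set the two error rates seldom coincide exactly, so the sketch reports their mean at the threshold where they are closest. Interpolating between adjacent thresholds is a common alternative.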
