Exploring Content Features for Automated Speech Scoring

Most previous research on automated speech scoring has focused on restricted, predictable speech. For automated scoring of unrestricted spontaneous speech, proficiency has been evaluated primarily on aspects of pronunciation, fluency, vocabulary, and language usage, but not on aspects of content and topicality. In this paper, we explore features representing the accuracy of the content of a spoken response. Content features are generated using three similarity measures: a lexical matching method (Vector Space Model) and two semantic similarity measures (Latent Semantic Analysis and Pointwise Mutual Information). All of the features exhibit moderately high correlations with human proficiency scores when computed on human transcriptions of the speech. The correlations decrease somewhat due to recognition errors when the features are computed on the output of an automatic speech recognition system; however, additionally using word confidence scores restores the correlations to a level similar to that obtained on human transcriptions.
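To make the lexical matching idea concrete, the sketch below computes a Vector Space Model content feature as cosine similarity between a response's term-frequency vector and vectors built from training responses pooled by human score level. This is a minimal illustration of the general VSM technique, not the paper's exact implementation; the `vsm_content_feature` function, the pooling-by-score setup, and the toy data are all assumptions for demonstration.

```python
from collections import Counter
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def vsm_content_feature(response, pooled_by_score):
    """Map each human score level to the response's cosine similarity
    with the pooled training responses at that level (illustrative setup)."""
    resp_vec = Counter(response.lower().split())
    return {score: cosine_similarity(resp_vec, Counter(text.lower().split()))
            for score, text in pooled_by_score.items()}

# Toy example: two score levels, each a concatenation of training responses.
training = {
    4: "the library offers many resources for students to study",
    2: "library good place study place good",
}
feats = vsm_content_feature("students can study using library resources", training)
```

A response whose vocabulary overlaps more with high-scoring training responses yields a higher similarity at that score level; such per-level similarities can then serve as input features to a scoring model.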
