Exploring Content Features for Automated Speech Scoring

Most previous research on automated speech scoring has focused on restricted, predictable speech. For automated scoring of unrestricted spontaneous speech, proficiency has been evaluated primarily on aspects of pronunciation, fluency, vocabulary, and language usage, but not on aspects of content and topicality. In this paper, we explore features representing the accuracy of the content of a spoken response. Content features are generated using three similarity measures: a lexical matching method (Vector Space Model) and two semantic similarity measures (Latent Semantic Analysis and Pointwise Mutual Information). All of the features exhibit moderately high correlations with human proficiency scores when computed on human transcriptions of the speech. The correlations decrease somewhat due to recognition errors when the features are computed on the output of an automatic speech recognition system; however, additionally using word confidence scores restores the correlations to a level similar to that obtained on human transcriptions.
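To make the lexical matching idea concrete, the sketch below computes a Vector Space Model content feature as cosine similarity between a response's term-frequency vector and vectors built from training responses pooled by human score level. This is a minimal illustration of the general VSM technique, not the paper's exact implementation; the `vsm_content_feature` function, the pooling-by-score setup, and the toy data are all assumptions for demonstration.

```python
from collections import Counter
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def vsm_content_feature(response, pooled_by_score):
    """Map each human score level to the response's cosine similarity
    with the pooled training responses at that level (illustrative setup)."""
    resp_vec = Counter(response.lower().split())
    return {score: cosine_similarity(resp_vec, Counter(text.lower().split()))
            for score, text in pooled_by_score.items()}

# Toy example: two score levels, each a concatenation of training responses.
training = {
    4: "the library offers many resources for students to study",
    2: "library good place study place good",
}
feats = vsm_content_feature("students can study using library resources", training)
```

A response whose vocabulary overlaps more with high-scoring training responses yields a higher similarity at that score level; such per-level similarities can then serve as input features to a scoring model.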
