Using automatic speech recognition to assess the reading proficiency of a diverse sample of middle school students

This paper describes a study of automated assessment of reading proficiency, covering both oral reading and reading comprehension, using automatic speech recognition technology. The sample is a middle school population that includes students with reading disabilities and low reading proficiency. We build statistical models from features related to fluency, pronunciation, and reading accuracy to predict three dependent variables: two relate to the accuracy and speed of reading, and the third is a reading comprehension measure from a state assessment of reading. The correlation coefficients of the best-performing linear regression models range from r = 0.64 (reading comprehension score) to r = 0.98 (correctly read words per minute). We further examine the features with the highest absolute regression weights in the three models and find that most of them belong to the reading accuracy and reading speed classes. Features from the pronunciation class and other fluency features, e.g., those relating to silences in the read speech, also appear in the regression models, but less prominently.
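
The modeling step summarized above (fit a linear regression, report the correlation between predicted and observed scores, then inspect the largest absolute weights) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the feature names, the synthetic data, the z-scoring of features so that weights are comparable, and the in-sample evaluation.

```python
import numpy as np

def standardize(X):
    """Z-score each feature column so regression weights are comparable."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unscaled
    return (X - mean) / std

def fit_linear_regression(X, y):
    """Ordinary least squares with an intercept; returns (weights, intercept)."""
    design = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[:-1], coef[-1]

def pearson_r(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])

def rank_by_abs_weight(names, weights):
    """Feature names paired with weights, largest absolute weight first."""
    order = np.argsort(-np.abs(weights))
    return [(names[i], float(weights[i])) for i in order]

# Hypothetical feature names and synthetic data (NOT the study's features or data).
rng = np.random.default_rng(0)
feature_names = [
    "words_per_minute",
    "word_accuracy",
    "mean_silence_sec",
    "pronunciation_score",
]
X = standardize(rng.normal(size=(200, 4)))
true_weights = np.array([3.0, 2.0, -0.5, 0.3])  # arbitrary values for the demo
y = X @ true_weights + rng.normal(scale=0.5, size=200)

weights, intercept = fit_linear_regression(X, y)
predicted = X @ weights + intercept
r = pearson_r(predicted, y)  # in-sample correlation, for simplicity
ranking = rank_by_abs_weight(feature_names, weights)

print(f"r = {r:.2f}")
for name, weight in ranking:
    print(f"{name}: {weight:+.2f}")
```

In this sketch the correlation is computed on the same data used for fitting; a held-out or cross-validated estimate would be the more cautious choice when comparing models.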
