A System for Assessing Children Readings at School

In this paper we describe a system for analyzing the reading errors made by primary and middle school children. To assess children's reading skills in terms of accuracy and speed, Italian schools use a standard reading achievement test developed by educational psychologists and named “Prove MT” (MT reading test). The test is based on a set of texts tailored to different ages, from 7 to 13 years. At present, children are asked to read short stories aloud during the test, while teachers manually note the reading errors on a sheet and then compute a total score from several measures, such as the duration of the whole reading, the number of syllables read per second, and the number and type of errors. The system we have developed aims to support teachers in this task by automatically detecting the reading errors and estimating the required measures. To do this, we use an automatic speech-to-text transcription system that employs a language model (LM) trained on the texts of the stories to be read. In addition, we embed in the LM an error model that accounts for typical reading errors, mostly consisting of mispronunciations, substitutions of syllables or words, and word truncations. To evaluate the performance of our system, we collected 20 audio recordings of children aged 8 to 13 reading a story from the “Prove MT” set. Notably, the error model proposed in this paper for assessing children's reading capabilities performs close to an “oracle” error model obtained from manual transcriptions of the readings themselves.
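The measures mentioned above (number and type of errors, syllables read per second) can be illustrated with a minimal Python sketch. It is not the system described in this paper: it assumes a word-level transcript of the reading is already available and simply aligns it with the reference text by edit distance, whereas the proposed system embeds the error model in the LM. The function names, the example sentence, and the choice to take the syllable count as an input are all assumptions made for illustration; the actual “Prove MT” scoring rules are not reproduced here.

```python
from typing import Dict, List, Tuple

Op = Tuple[str, str, str]  # (operation, reference_word, read_word)


def align_words(reference: List[str], hypothesis: List[str]) -> List[Op]:
    """Align the reference text with a transcript using word-level edit distance.

    Operations: "ok" (read correctly), "sub" (word replaced),
    "del" (word skipped), "ins" (extra word read).
    """
    n, m = len(reference), len(hypothesis)
    # cost[i][j] = minimum number of edits turning reference[:i] into hypothesis[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diff = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            cost[i][j] = min(cost[i - 1][j - 1] + diff,  # match or substitution
                             cost[i - 1][j] + 1,         # deletion
                             cost[i][j - 1] + 1)         # insertion

    # Trace back from the bottom-right corner to recover the operations.
    ops: List[Op] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            diff = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            if cost[i][j] == cost[i - 1][j - 1] + diff:
                ops.append(("ok" if diff == 0 else "sub",
                            reference[i - 1], hypothesis[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            ops.append(("del", reference[i - 1], ""))
            i -= 1
        else:
            ops.append(("ins", "", hypothesis[j - 1]))
            j -= 1
    ops.reverse()
    return ops


def count_errors(ops: List[Op]) -> Dict[str, int]:
    """Count substitutions, deletions and insertions in an alignment."""
    counts = {"sub": 0, "del": 0, "ins": 0}
    for op, _, _ in ops:
        if op in counts:
            counts[op] += 1
    return counts


def syllables_per_second(n_syllables: int, duration_s: float) -> float:
    """Reading speed; the syllable count is supplied by the caller (assumption)."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return n_syllables / duration_s
```

For instance, aligning the made-up reference "il gatto dorme sul divano" with the transcript "il gato dorme divano" yields one substitution (gatto read as gato) and one deletion (sul skipped). A plain edit-distance alignment like this cannot by itself distinguish a truncation from a mispronunciation; that finer classification is what an explicit error model is meant to provide.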