Recognizing Young Readers' Spoken Questions

Free-form spoken input would be the easiest and most natural way for young children to communicate with an intelligent tutoring system. However, achieving such a capability poses a challenge both to instruction design and to automatic speech recognition. To address the difficulties of accepting such input, we adopt the framework of predictable response training, which aims to achieve linguistic predictability and educational utility simultaneously. We design instruction in this framework to teach children the reading comprehension strategy of self-questioning. To filter out some misrecognized speech, we combine acoustic confidence with language modeling techniques that exploit the predictability of the elicited responses. Compared to a baseline that uses neither technique, this approach performs significantly better in concept recall (47% vs. 28%) and precision (61% vs. 39%) on 250 unseen utterances from 34 previously unseen speakers. We conclude with some design implications for future speech-enabled tutoring systems.
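
As a rough illustration of the filtering and scoring idea described above, the following minimal sketch rejects low-scoring recognizer hypotheses and then computes concept precision and recall. Everything in it is an assumption made for illustration: the `Hypothesis` fields, the linear interpolation of the two scores, the weight `w`, and the threshold are hypothetical and are not taken from the paper, which does not specify here how the two signals are combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """One recognizer output for one utterance (hypothetical structure)."""
    concepts: frozenset          # concepts extracted from the recognized text
    acoustic_confidence: float   # assumed to lie in [0, 1]
    lm_score: float              # assumed predictability score in [0, 1]


def accept(h, w=0.5, threshold=0.5):
    """Keep a hypothesis if a weighted mix of both scores clears a threshold.

    The linear mix is an illustrative assumption, not the paper's method.
    """
    combined = w * h.acoustic_confidence + (1 - w) * h.lm_score
    return combined >= threshold


def concept_precision_recall(references, hypotheses, keep):
    """Micro-averaged concept precision and recall over all utterances.

    references: list of sets of concepts the child actually said
    hypotheses: list of Hypothesis objects, aligned with references
    keep:       predicate deciding whether a hypothesis is accepted
    """
    true_pos = hyp_total = ref_total = 0
    for ref, hyp in zip(references, hypotheses):
        # A rejected hypothesis contributes no concepts at all.
        kept = hyp.concepts if keep(hyp) else frozenset()
        true_pos += len(ref & kept)
        hyp_total += len(kept)
        ref_total += len(ref)
    precision = true_pos / hyp_total if hyp_total else 0.0
    recall = true_pos / ref_total if ref_total else 0.0
    return precision, recall
```

In this sketch, rejecting an unreliable hypothesis can raise precision because its spurious concepts are no longer counted, while recall depends on how many correct concepts survive the filter.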
