A comparison of human and computer recognition accuracy for children's speech

Several studies have shown that automatic speech recognition error rates are higher for children’s speech than for adults’ speech. Investigations have demonstrated that word recognition error rates increase as age decreases, and that recognition performance is more sensitive to bandwidth reduction for children’s speech than for adult speech. This paper presents the results of experiments that measure human recognition performance for children’s speech, and compares human and machine recognition performance on the same children’s speech data. Human recognition performance for children’s speech is shown to exhibit effects of age and bandwidth similar to those observed for automatic systems. The results suggest that the effects of age and bandwidth on automatic speech recognition accuracy are due to properties of children’s speech rather than artifacts of the technology.
