A readability evaluation of real-time crowd captions in the classroom

Deaf and hard of hearing individuals need accommodations that transform aural information into visual information, such as captions generated in real time, to access spoken information in lectures and other live events. Captions produced by professional captionists work well in general events such as community or legal meetings, but are often unsatisfactory in specialized content events such as higher education classrooms. In addition, professional captionists, especially those with experience in specialized content areas, are scarce and expensive, and therefore hard to hire. Captions produced by commercial automatic speech recognition (ASR) software are far cheaper, but are often perceived as unreadable because of ASR's sensitivity to accents and background noise and its slow response time. We ran a study to evaluate the readability of captions generated by a new crowd captioning approach against those from professional captionists and ASR. In this approach, classmates type captions into a system that aligns and merges the multiple incomplete caption streams into a single, comprehensive real-time transcript. Our study asked 48 deaf and hearing readers to evaluate transcripts produced by a professional captionist, ASR, and crowd captioning software, and found that the readers preferred crowd captions over both professional captions and ASR.
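
The idea of merging several incomplete caption streams can be illustrated with a minimal sketch. This is not the alignment method used by the system evaluated in the study, which the abstract does not specify. It assumes, purely for illustration, that each typist's words arrive with a timestamp, and it treats a repeated word within a short time window as a duplicate. The names `merge_streams`, `Word`, and `window` are hypothetical.

```python
from typing import List, Tuple

# Assumed input format: (seconds since start of lecture, word typed).
Word = Tuple[float, str]


def merge_streams(streams: List[List[Word]], window: float = 1.0) -> List[str]:
    """Merge partial caption streams into one word sequence.

    Words from all streams are ordered by timestamp. A word is dropped as a
    duplicate when the same word (case-insensitive) was already kept no more
    than `window` seconds earlier.
    """
    all_words = sorted(w for stream in streams for w in stream)
    merged: List[Word] = []
    for t, word in all_words:
        duplicate = any(
            word.lower() == kept.lower() and t - kept_t <= window
            for kept_t, kept in merged
        )
        if not duplicate:
            merged.append((t, word))
    return [word for _, word in merged]


# Three typists, each capturing only part of the spoken sentence.
typist_a = [(0.0, "the"), (0.4, "cell"), (1.6, "divides")]
typist_b = [(0.5, "cell"), (0.9, "membrane"), (1.7, "divides")]
typist_c = [(0.1, "the"), (1.0, "membrane")]

print(" ".join(merge_streams([typist_a, typist_b, typist_c])))
```

No single typist captured the whole sentence, but the merged output covers all four words. A timestamp-and-window rule like this would break down when typists lag by different amounts or make typos, which is why a real system needs a more robust alignment step.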
