Preferred Appearance of Captions Generated by Automatic Speech Recognition for Deaf and Hard-of-Hearing Viewers

As the accuracy of Automatic Speech Recognition (ASR) approaches human-level quality, ASR may become feasible as an accessibility tool for transcribing spoken language to text for people who are Deaf and Hard of Hearing (DHH). We conducted an in-person laboratory study to investigate requirements and preferences for new ASR-based captioning services in a small-group meeting context. Open-ended comments reveal a tension between caption readability (the visibility of the text) and occlusion (captions blocking the video content). Our 105 DHH participants provided valuable feedback on a variety of caption-appearance parameters, strongly preferring familiar styles such as those of traditional closed captions, and in this paper we begin a discussion of how ASR captioning could be visually styled to improve text readability for DHH viewers.