What Makes Videos Accessible to Blind and Visually Impaired People?

User-generated videos are an increasingly important source of information online, yet most online videos are inaccessible to blind and visually impaired (BVI) people. To find accessible videos, that is, videos understandable without additional description of the visual content, BVI people in our formative studies reported using a time-consuming trial-and-error approach: clicking on a video, watching a portion, leaving the video, and repeating the process. They also reported heuristics that characterize accessible and inaccessible videos. We instantiate 7 of these heuristics (2 audio-related, 2 video-related, and 3 audio-visual) as automated metrics to assess video accessibility. We collected a dataset of accessibility ratings of videos by BVI people and found that our automated video accessibility metrics correlated with the accessibility ratings (adjusted R² = 0.642). We augmented a video search interface with our video accessibility metrics and predictions. BVI people using the augmented interface selected an accessible video more efficiently than when using the original search interface. By integrating video accessibility metrics, video hosting platforms could help people surface accessible videos and encourage content creators to author more accessible videos, improving video accessibility for all.
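
For readers unfamiliar with the adjusted R² statistic reported above, the following is a minimal sketch of how such a value could be computed when a linear model predicts accessibility ratings from 7 metrics. It is not the authors' analysis: the data are synthetic, the helper names (`fit_linear`, `adjusted_r2`) and the sample size are assumptions made for illustration, and an ordinary least-squares fit is assumed as the model.

```python
import numpy as np


def adjusted_r2(y, y_pred, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y.shape[0]
    if n - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors + 1")
    ss_res = np.sum((y - y_pred) ** 2)      # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)    # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    # Penalize R^2 for the number of predictors used.
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns (coefficients, predictions)."""
    X = np.asarray(X, dtype=float)
    X1 = np.column_stack([np.ones(X.shape[0]), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(X1, np.asarray(y, dtype=float), rcond=None)
    return coef, X1 @ coef


# Synthetic stand-in data (NOT the paper's dataset): 60 hypothetical videos,
# each described by 7 hypothetical metric values, with a noisy linear rating.
rng = np.random.default_rng(0)
n_videos, n_metrics = 60, 7
X = rng.normal(size=(n_videos, n_metrics))
y = X @ rng.normal(size=n_metrics) + rng.normal(scale=1.0, size=n_videos)

coef, y_pred = fit_linear(X, y)
print("adjusted R^2 on synthetic data:", adjusted_r2(y, y_pred, n_metrics))
```

Because the adjustment penalizes each added predictor, adjusted R² never exceeds plain R² and only rises when a new metric explains more variance than chance alone would.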
