"Person, Shoes, Tree. Is the Person Naked?" What People with Vision Impairments Want in Image Descriptions

Access to digital images is important to people who are blind or have low vision (BLV). Many contemporary image description efforts, however, do not account for this population's nuanced preferences for image descriptions. In this paper, we present a qualitative study that provides insight into 28 BLV people's experiences with descriptions of digital images encountered on news websites, social networking sites/platforms, eCommerce websites, employment websites, online dating websites/platforms, productivity applications, and e-publications. Our findings reveal how image description preferences vary based on the source where a digital image is encountered and the surrounding context. Drawing on our empirical analysis, we provide recommendations for the development of next-generation image description technologies.
