Revisiting Blind Photography in the Context of Teachable Object Recognizers

For people with visual impairments, photography is essential for identifying objects through remote sighted help and image recognition apps. This is especially the case for teachable object recognizers, where recognition models are trained on a user's own photos. Here, we propose real-time feedback for communicating the location of an object of interest in the camera frame. Our audio-haptic feedback is powered by a deep learning model that estimates the object's center location based on its proximity to the user's hand. To evaluate our approach, we conducted a lab-based user study in which participants with visual impairments (N=9) used our feedback to train and test their object recognizers in vanilla and cluttered environments. We found that very few photos failed to include the object (2% in the vanilla environment and 8% in the cluttered one), and recognition performance was promising even for participants with no prior camera experience. Participants tended to trust the feedback even though they knew it could be wrong. Our cluster analysis indicates that better feedback is associated with photos that include the entire object. Our results provide insights into factors that can degrade feedback and recognition performance in teachable interfaces.
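To make the feedback idea concrete, the sketch below shows one plausible way to turn a predicted object-center estimate into a directional audio-haptic cue. This is a minimal illustration, not the authors' implementation: the frame size, threshold, and the `guidance_cue` function are hypothetical placeholders, and the object-center prediction itself is assumed to come from a separate hand-proximity-based localization model.

```python
# Minimal sketch (hypothetical, not the paper's implementation): mapping a
# predicted object-center estimate to a directional audio-haptic cue.
import math

FRAME_W, FRAME_H = 640, 480          # assumed camera frame size
CENTER = (FRAME_W / 2, FRAME_H / 2)  # goal: object centered in the frame


def guidance_cue(object_center, center_threshold=0.15):
    """Return a (direction, intensity) cue for audio-haptic feedback.

    object_center: (x, y) pixel estimate of the object's center, assumed to
    be predicted by a hand-proximity-based localization model.
    """
    # Normalized offset of the object from the frame center.
    dx = (object_center[0] - CENTER[0]) / FRAME_W
    dy = (object_center[1] - CENTER[1]) / FRAME_H
    distance = math.hypot(dx, dy)

    if distance < center_threshold:
        return "centered", 0.0       # object near the center: confirmation cue

    # Otherwise, point the user toward the object, with intensity
    # proportional to how far off-center the estimate is.
    direction = "left" if dx < 0 else "right"
    if abs(dy) > abs(dx):
        direction = "up" if dy < 0 else "down"
    return direction, min(1.0, distance)


# Example: an object estimated in the upper-right region of the frame
# yields a "right" cue with moderate intensity.
print(guidance_cue((520, 100)))
```

In practice, the returned direction and intensity would drive speech or tone output and vibration strength; the paper's actual feedback design and thresholds may differ.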
