Unblind Your Apps: Predicting Natural-Language Labels for Mobile GUI Components by Deep Learning

According to the World Health Organization (WHO), an estimated 1.3 billion people worldwide live with some form of vision impairment, of whom 36 million are blind. Because of their disability, integrating this minority into society is a challenging problem. The recent rise of smartphones provides a new solution by giving blind users convenient access to information and services for understanding the world. Users with vision impairment can use the screen reader embedded in the mobile operating system to read the content of each screen within an app, and use gestures to interact with the phone. However, screen readers only work if developers add natural-language labels to image-based components when developing the app. Unfortunately, more than 77% of apps have missing-label issues, according to our analysis of 10,408 Android apps. Most of these issues are caused by developers' lack of awareness and knowledge of this minority's needs. And even when developers want to add labels to UI components, they may not come up with concise and clear descriptions, as most of them have no vision impairment themselves. To overcome these challenges, we develop a deep-learning-based model, called Labeldroid, to automatically predict the labels of image-based buttons by learning from large-scale commercial apps in Google Play. The experimental results show that our model can make accurate predictions and that the generated labels are of higher quality than those written by real Android developers.
