Semantic matching of GUI events for test reuse: are we there yet?

GUI testing is an important but expensive activity. Recent research on test reuse approaches for Android applications has produced interesting results. Test reuse approaches automatically migrate human-designed GUI tests from a source app to a target app that shares similar functionalities. They achieve this by exploiting the semantic similarity among the textual information of GUI widgets, so semantic matching of GUI events plays a crucial role in these approaches. In this paper, we present the first empirical study on semantic matching of GUI events. Our study involves 253 configurations of semantic matching, 337 unique queries, and 8,099 distinct GUI events. We report several key findings that indicate how to improve the semantic matching of test reuse approaches, propose SemFinder, a novel semantic matching algorithm that outperforms existing solutions, and identify several interesting research directions.
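To make the idea of semantic matching concrete, the sketch below ranks candidate GUI events of a target app against a query event from a source app by the cosine similarity of averaged word vectors. It is only an illustration under stated assumptions: the three-dimensional vectors in `TOY_VECTORS` are hand-made, the helper names (`embed`, `cosine`, `rank_candidates`) are hypothetical, and the code is not SemFinder or any of the configurations studied in the paper.

```python
import math

# Hypothetical, hand-made 3-dimensional word vectors (illustration only;
# a real approach would load a trained word embedding model instead).
TOY_VECTORS = {
    "sign": (0.9, 0.1, 0.0),
    "log": (0.8, 0.2, 0.1),
    "in": (0.7, 0.1, 0.2),
    "search": (0.0, 0.9, 0.1),
    "settings": (0.1, 0.0, 0.9),
}


def embed(text):
    """Average the vectors of the known words in a widget's text."""
    words = [w for w in text.lower().split() if w in TOY_VECTORS]
    if not words:
        return (0.0, 0.0, 0.0)
    return tuple(
        sum(TOY_VECTORS[w][i] for w in words) / len(words) for i in range(3)
    )


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_candidates(query, candidates):
    """Sort candidate event descriptors by similarity to the query, best first."""
    q = embed(query)
    return sorted(candidates, key=lambda c: cosine(q, embed(c)), reverse=True)


# Query: the text of a source-app event; candidates: texts of target-app events.
ranked = rank_candidates("sign in", ["search", "settings", "log in"])
print(ranked)
```

With these toy vectors, the "log in" candidate ranks first for the "sign in" query even though the two labels share only one word, which is the kind of match that purely syntactic comparison would score poorly.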
