Automatic Equivalent Mutants Classification Using Abstract Syntax Tree Neural Networks

Mutation testing is a testing technique that is effective both for designing tests and for evaluating an existing test suite. Although mutation testing has been developed over many years to apply to different types of software systems and programming languages, it has not yet seen wide industrial use. One primary reason that developers and testers avoid mutation testing is its high computational cost. In particular, the need to manually identify equivalent mutants (mutants that behave identically to the original program on every input, so no test can kill them) is a major obstacle that makes mutation testing time-consuming and labor-intensive. This paper addresses this limitation by proposing a machine learning-based approach that designs and trains an abstract syntax tree recurrent neural network model to automatically classify equivalent mutants during mutation testing. A pilot study with 582 mutants shows that the proposed approach can classify equivalent mutants with an accuracy higher than 90%. The approach can significantly reduce the manual effort and time spent identifying equivalent mutants.
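To make the notion of an equivalent mutant concrete, the following minimal Python sketch contrasts an equivalent mutant with a killable one and flattens each program's abstract syntax tree into a sequence of node-type names, the kind of tree-derived input a neural classifier could consume. This is an illustration only, not the paper's model or dataset: the function `index_of`, the helpers `load`, `node_types`, and `behaves_like_original`, and the test inputs are all hypothetical names chosen for this example.

```python
import ast

# Hypothetical program under test (not taken from the paper).
ORIGINAL = """
def index_of(items, target):
    i = 0
    while i < len(items):
        if items[i] == target:
            return i
        i += 1
    return -1
"""

# Relational-operator mutants of the loop condition.
# "!=" is equivalent here: i starts at 0 and grows by 1, so it reaches
# len(items) exactly. "<=" is not: it reads one element past the end.
EQUIVALENT_MUTANT = ORIGINAL.replace("i < len(items)", "i != len(items)")
KILLABLE_MUTANT = ORIGINAL.replace("i < len(items)", "i <= len(items)")


def load(source):
    """Compile a source string and return its index_of function."""
    namespace = {}
    exec(compile(source, "<program>", "exec"), namespace)
    return namespace["index_of"]


def node_types(source):
    """Flatten the AST into a pre-order list of node-type names."""
    def visit(node):
        yield type(node).__name__
        for child in ast.iter_child_nodes(node):
            yield from visit(child)
    return list(visit(ast.parse(source)))


def behaves_like_original(source, tests):
    """Return False as soon as one test distinguishes the mutant."""
    original, mutant = load(ORIGINAL), load(source)
    for items, target in tests:
        try:
            if mutant(items, target) != original(items, target):
                return False
        except Exception:
            return False  # a crash also kills the mutant
    return True


TESTS = [([1, 2, 3], 2), ([1, 2, 3], 9), ([], 0)]

if __name__ == "__main__":
    for name, src in [("equivalent", EQUIVALENT_MUTANT),
                      ("killable", KILLABLE_MUTANT)]:
        changed = [(a, b) for a, b in zip(node_types(ORIGINAL), node_types(src))
                   if a != b]
        print(name, "AST difference:", changed,
              "survives tests:", behaves_like_original(src, TESTS))
```

Note that surviving a test suite does not prove equivalence; it only fails to disprove it. That gap is why equivalence is normally judged by hand, and why a classifier working on the syntax tree is attractive.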
