SmartUnit: Empirical Evaluations for Automated Unit Testing of Embedded Software in Industry

In this paper, we aim at automated coverage-based unit testing for embedded software. To achieve this goal, drawing on an analysis of industrial requirements and our previous work on the automated unit testing tool CAUT, we have built a new tool, SmartUnit, to meet the engineering requirements that arise in our partner companies. SmartUnit is a dynamic symbolic execution implementation that supports statement, branch, boundary-value, and MC/DC coverage. SmartUnit has been used to test more than one million lines of code in real projects. For confidentiality reasons, we select three in-house real projects for the empirical evaluations. We also carry out our evaluations on two open-source database projects, SQLite and PostgreSQL, to test the scalability of our tool, since embedded software projects are mostly not large, 5K-50K lines of code on average. From our experimental results, in general, more than 90% of functions in commercial embedded software achieve 100% statement, branch, and MC/DC coverage; more than 80% of functions in SQLite achieve 100% MC/DC coverage; and more than 60% of functions in PostgreSQL achieve 100% MC/DC coverage. Moreover, SmartUnit is able to find runtime exceptions at the unit testing level. We have also reported exceptions such as array index out of bounds and divide-by-zero in SQLite. Furthermore, we analyze the reasons for low coverage in automated unit testing in our setting and survey the practice of manual unit testing relative to automated unit testing in industry.
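The paper does not include SmartUnit's internals, but the dynamic symbolic execution it names follows a well-known concolic pattern: execute the unit on a concrete input, record the branch outcomes along the path, then negate one recorded branch and solve the resulting constraint to steer the next run down an unexplored path, flagging any runtime exception encountered. The following is a hypothetical, minimal sketch of that loop; `unit_under_test`, `solve`, and `explore` are illustrative names, and brute-force enumeration over a small integer domain stands in for a real SMT solver.

```python
def unit_under_test(x, trace):
    # Toy unit with one branch guarding a division that can fail.
    if x > 10:
        trace.append(("x > 10", True))
        return 100 // (x - 11)  # raises ZeroDivisionError when x == 11
    trace.append(("x > 10", False))
    return 0

def solve(constraints):
    """Stand-in for an SMT solver: brute-force search over a small
    integer domain for a value meeting every (predicate, outcome) pair."""
    preds = {"x > 10": lambda x: x > 10}
    for x in range(-50, 51):
        if all(preds[p](x) == want for p, want in constraints):
            return x
    return None  # path constraint unsatisfiable in this domain

def explore():
    """Concolic-style loop: execute concretely, record the path taken,
    then negate branch outcomes to steer later runs onto new paths."""
    seen, worklist, findings = set(), [[]], []
    while worklist:
        prefix = worklist.pop()
        x = solve(prefix) if prefix else 0  # seed run uses x = 0
        if x is None:
            continue
        trace = []
        try:
            unit_under_test(x, trace)
        except ZeroDivisionError:
            findings.append(("ZeroDivisionError", x))
        path = tuple(trace)
        if path in seen:
            continue
        seen.add(path)
        # Schedule one new path per branch along this trace.
        for i, (pred, taken) in enumerate(trace):
            worklist.append(list(trace[:i]) + [(pred, not taken)])
    return findings
```

On this toy unit, the seed input x = 0 takes the false branch; negating it yields x = 11, which both covers the true branch and exposes the divide-by-zero, mirroring how the paper's tool surfaces runtime exceptions as a by-product of coverage-driven exploration.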
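Among the coverage criteria listed, MC/DC (modified condition/decision coverage) is the strictest: every atomic condition in a decision must be shown to independently affect the decision's outcome, i.e. for each condition there must be a pair of tests that differ only in that condition yet produce different decision results. The sketch below, with an illustrative three-condition decision, enumerates such independence pairs to make the criterion concrete; it is an explanatory aid, not part of SmartUnit.

```python
from itertools import product

def decision(a, b, c):
    # Example decision with three atomic conditions: (a and b) or c
    return (a and b) or c

def mcdc_pairs(cond_index):
    """Return assignment pairs that toggle only condition `cond_index`
    and flip the decision outcome (the MC/DC independence criterion)."""
    pairs = []
    for assignment in product([False, True], repeat=3):
        flipped = list(assignment)
        flipped[cond_index] = not flipped[cond_index]
        flipped = tuple(flipped)
        if decision(*assignment) != decision(*flipped):
            pairs.append((assignment, flipped))
    return pairs

# MC/DC requires at least one independence pair per condition; a test
# suite achieving it must include both members of one pair for each.
for i, name in enumerate("abc"):
    print(name, len(mcdc_pairs(i)) > 0)
```

For this decision, toggling `a` matters only when `b` is true and `c` is false, toggling `b` only when `a` is true and `c` is false, and toggling `c` whenever `a and b` is false; a four-test suite covering those pairs satisfies MC/DC, versus eight tests for exhaustive truth-table coverage.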
