A Controlled Experiment in Testing of Safety-Critical Embedded Software

In the engineering of safety-critical systems, regulatory standards often require both traceable specification-based testing and structural coverage of program units. Automated test generation techniques can be used to generate inputs that cover the structural aspects of a program. However, there is no conclusive evidence on how automated test generation compares to manual test design, or on how testing based on the program implementation relates to specification-based testing. In this paper, we investigate specification- and implementation-based testing of embedded software written in the IEC 61131-3 language, a programming standard used in many embedded safety-critical software systems. Further, we measure the efficiency and effectiveness of both approaches in terms of fault detection. For this purpose, we conducted a controlled experiment comparing tests created by a total of twenty-three software engineering master's students. The participants worked individually, manually designing and automatically generating tests for two IEC 61131-3 programs. The tests they created were collected and analyzed in terms of mutation score, decision coverage, number of tests, and testing duration. We found that, compared to implementation-based testing, specification-based testing yields significantly more effective tests in terms of the number of faults detected. Specifically, specification-based tests detect comparison and value replacement faults more effectively than implementation-based tests. On the other hand, implementation-based automated test generation leads to fewer tests (up to 85% improvement), created in less time than tests designed manually from the specification.
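
As an illustration of the mutation score metric mentioned above, the following minimal sketch uses the common definition (mutants killed divided by mutants generated). The toy program, the three mutants, and the names `original`, `MUTANTS`, and `mutation_score` are hypothetical and are not taken from the experiment; the mutants only mimic the comparison and value replacement fault types named in the abstract.

```python
from typing import Callable, Sequence, Tuple

Program = Callable[[int, int], bool]
TestInput = Tuple[int, int]


def original(a: int, b: int) -> bool:
    """Hypothetical unit under test: is a strictly greater than b?"""
    return a > b


def mutant_ge(a: int, b: int) -> bool:
    """Comparison replacement: '>' becomes '>='."""
    return a >= b


def mutant_lt(a: int, b: int) -> bool:
    """Comparison replacement: '>' becomes '<'."""
    return a < b


def mutant_const(a: int, b: int) -> bool:
    """Value replacement: the variable b is replaced by the constant 0."""
    return a > 0


MUTANTS: Sequence[Program] = (mutant_ge, mutant_lt, mutant_const)


def mutation_score(reference: Program,
                   mutants: Sequence[Program],
                   tests: Sequence[TestInput]) -> float:
    """Fraction of mutants whose output differs from the reference
    program on at least one test input (such a mutant is 'killed')."""
    if not mutants:
        raise ValueError("at least one mutant is required")
    killed = sum(
        1 for mutant in mutants
        if any(mutant(*t) != reference(*t) for t in tests)
    )
    return killed / len(mutants)


# A single test kills only the '<' mutant; adding boundary inputs kills all three.
small_suite = [(2, 1)]
boundary_suite = [(2, 1), (1, 1), (1, 2)]
print(mutation_score(original, MUTANTS, small_suite))
print(mutation_score(original, MUTANTS, boundary_suite))
```

In this toy setting the single-input suite kills one of three mutants, while the suite with boundary inputs kills all of them. This only illustrates how the metric is computed and says nothing about the experimental results.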
