Empirical Evaluation of Test Coverage for Functional Programs

The correlation between test coverage and test effectiveness matters because it justifies the use of coverage in practice. Existing results on imperative programs mostly show that test coverage predicts effectiveness. However, because functional programs are usually structurally different from imperative ones, it is unclear whether the same result holds and whether coverage can be used as a predictor of effectiveness for functional programs. In this paper we report the first empirical study of the correlation between test coverage and test effectiveness on functional programs. We consider four types of coverage: two input coverages (statement/branch coverage and expression coverage) and two oracle coverages (count of assertions and checked coverage). We also consider two types of effectiveness: raw effectiveness and normalized effectiveness. Our results are twofold. (1) In general, the findings on imperative programs still hold for functional programs, warranting the use of coverage in practice. (2) For specific coverage criteria, the results may be unexpected or differ from those on imperative programs, calling for further studies on functional programs.
