How to Measure the Performance of Automated Program Repair

With the advance of automated program repair, many novel repair approaches have been proposed in recent years. There is also empirical work that compares the performance of those approaches. In this paper, we survey recent repair work in five main venues and aim to answer the question of how to measure the performance of automated program repair in terms of evaluation metrics. We summarize the evaluation metrics used in the literature and discuss how to construct common metrics for further research in the area of automated program repair.
