QuixBugs: a multi-lingual program repair benchmark set based on the Quixey Challenge

Recent years have seen an explosion of work in automated program repair. While previous work has focused exclusively on tools for single languages, recent work in multi-language transformation has opened the door for multi-language program repair tools. Evaluating the performance of such a tool requires a benchmark set of similar buggy programs in different languages. We present QuixBugs, a benchmark consisting of 40 programs translated to both Python and Java, each with a bug on a single line. The QuixBugs benchmark suite is based on problems from the Quixey Challenge, in which programmers were given a short buggy program and one minute to fix the bug.
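To make the idea of a single-line bug concrete, the following is a minimal sketch in Python. It is not taken from the QuixBugs suite: the functions `is_sorted_buggy` and `is_sorted_fixed`, the `CASES` list, and the `passes_all` helper are hypothetical names chosen for illustration. The sketch shows how a candidate patch for a one-line defect might be checked against a small set of input/output test cases.

```python
# Hypothetical example of a single-line bug and its one-line repair.
# None of these names come from the QuixBugs benchmark itself.

def is_sorted_buggy(xs):
    """Return True if xs is in non-decreasing order (contains a defect)."""
    for i in range(len(xs) - 2):  # BUG: skips the final adjacent pair
        if xs[i] > xs[i + 1]:
            return False
    return True


def is_sorted_fixed(xs):
    """Same function with the single defective line repaired."""
    for i in range(len(xs) - 1):  # FIX: compare every adjacent pair
        if xs[i] > xs[i + 1]:
            return False
    return True


# Illustrative test cases as (input, expected output) pairs.
CASES = [
    ([1, 2, 3], True),
    ([3, 1, 2], False),
    ([1, 3, 2], False),  # only the last pair is out of order
    ([], True),
    ([5], True),
]


def passes_all(func, cases):
    """Return True if func yields the expected output on every case."""
    return all(func(list(xs)) == expected for xs, expected in cases)


if __name__ == "__main__":
    print("buggy passes:", passes_all(is_sorted_buggy, CASES))
    print("fixed passes:", passes_all(is_sorted_fixed, CASES))
```

In this sketch, a repair tool would be judged by whether its patched program passes every test case; the buggy version fails on the input whose only inversion is in the final pair.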
