Large-Scale Dataset of Local Java Software Build Results

When a person decides to inspect or modify a third-party software project, the first necessary step is to compile it successfully from source code using a build system. However, such attempts often end in failure. In this data descriptor paper, we provide a dataset of build results of open source Java software systems. We attempted to automatically build a large number of Java projects from GitHub using their Maven, Gradle, and Ant build scripts in a Docker container simulating a standard programmer’s environment. The dataset consists of the output of two executions: 7264 build logs from the 2016 execution and 7233 build logs from the 2020 execution. In addition to the logs, we collected exit codes, file counts, and various project metadata. The proportion of failed builds in our dataset is 38% in the 2016 execution and 59% in the 2020 execution. The published data can serve multiple purposes, such as correlation analysis of factors affecting build success, build failure prediction, and research on build breakage repair.
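As a rough illustration of the procedure described above, the following Python sketch picks a build command from the build script found in a project directory and computes the proportion of failed builds from recorded exit codes. The build file names are the conventional ones for Gradle, Maven, and Ant. The order of preference, the exact commands, the function names, and the rule that any non-zero exit code counts as a failure are assumptions made for this example, not details taken from the dataset.

```python
from pathlib import Path

# Hypothetical mapping from build file to build command. The order of
# preference and the commands themselves are assumptions for illustration.
BUILD_FILES = [
    ("build.gradle", ["gradle", "build"]),
    ("pom.xml", ["mvn", "package"]),
    ("build.xml", ["ant"]),
]


def detect_build_command(project_dir):
    """Return the command for the first recognized build file, or None."""
    root = Path(project_dir)
    for filename, command in BUILD_FILES:
        if (root / filename).is_file():
            return command
    return None


def failure_rate(exit_codes):
    """Return the fraction of builds with a non-zero exit code.

    Treating every non-zero exit code as a failure is an assumption.
    """
    codes = list(exit_codes)
    if not codes:
        raise ValueError("no builds recorded")
    failed = sum(1 for code in codes if code != 0)
    return failed / len(codes)
```

For example, `failure_rate([0, 1, 0, 2])` counts two failures out of four builds. In a real run, the detected command would be executed inside the container and its exit code stored alongside the build log.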
