113 times Tomcat: A dataset

Measuring software to obtain information about its properties and quality is one of the central concerns of modern software engineering. This paper presents a dataset of metrics computed for 113 versions of Tomcat. We describe the dataset, the criteria adopted to build it, and the research opportunities it opens, and we provide preliminary results. The dataset can enhance the reliability of empirical studies, enable their reproducibility, reduce their cost, and foster further research on software quality and software metrics.
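To illustrate the kind of analysis such a dataset enables, the sketch below shows how a per-class metrics table spanning several Tomcat versions could be queried to track the evolution of a metric across releases. This is only a minimal sketch: the schema (`version`, `class`, `wmc`, `loc`), the sample rows, and the choice of WMC as the tracked metric are assumptions for illustration, not the actual structure or values of the dataset.

```python
# Minimal sketch of querying a per-class metrics table across Tomcat versions.
# The schema (version, class, wmc, loc) and the sample rows below are
# hypothetical placeholders, not the real dataset contents.
import pandas as pd

metrics = pd.DataFrame(
    [
        # version,  class,                                        wmc,  loc
        ("6.0.0", "org.apache.catalina.core.StandardContext", 210, 4800),
        ("6.0.0", "org.apache.catalina.connector.Request",    150, 3100),
        ("7.0.0", "org.apache.catalina.core.StandardContext", 240, 5400),
        ("7.0.0", "org.apache.catalina.connector.Request",    165, 3400),
    ],
    columns=["version", "class", "wmc", "loc"],
)

# Track how a CK metric (here WMC) evolves, on average and at worst,
# from one version to the next.
evolution = metrics.groupby("version")["wmc"].agg(["mean", "max"])
print(evolution)
```

A study of metric trends, fault-proneness, or cross-version comparisons could follow the same pattern, replacing the placeholder rows with the actual dataset once loaded.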
