Paper information - Statistical dataset on software metrics in object-oriented systems


This paper presents a set of statistical data on the software metrics of object-oriented systems. The data were generated from the Qualitas.class Corpus, which gathered a large amount of metrics data from the 111 systems included in the Qualitas Corpus. Using the R Project for Statistical Computing, we generated 6 statistical graphs, 4 summarization/aggregation tables and an R script for each of the 21 metrics evaluated in each of the 111 systems, that is, for 2,331 metric-system pairs. This amounted to 13,986 graphs, 9,324 tables and 2,331 R scripts. We also used EasyFit to fit a large number of distributions to each dataset, which yielded the best-fitting statistical distribution as well as the fit ranking. In addition, we provide a MySQL database dump that normalizes the metric measures and facilitates data manipulation tasks such as filtering and aggregation. By making this set available, we intend to help researchers in their work on software metrics.
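The artifact totals follow from the per-pair counts given in the abstract: every (metric, system) pair yields 6 graphs, 4 tables and 1 R script. A minimal Python sketch of that arithmetic is shown here; the names `artifact_totals` and `PER_PAIR` are illustrative and do not come from the dataset or its R scripts.

```python
# Counts as stated in the abstract; the identifiers are hypothetical.
N_SYSTEMS = 111
N_METRICS = 21
PER_PAIR = {"graphs": 6, "tables": 4, "r_scripts": 1}


def artifact_totals(n_systems, n_metrics, per_pair):
    """Multiply each per-pair count by the number of (metric, system) pairs."""
    pairs = n_systems * n_metrics
    return {kind: count * pairs for kind, count in per_pair.items()}


totals = artifact_totals(N_SYSTEMS, N_METRICS, PER_PAIR)
for kind, total in totals.items():
    print(f"{kind}: {total:,}")
```

Running the sketch prints one total per artifact kind, which can be checked against the figures reported in the abstract.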

Mariza Andrade da Silva Bigonha | Kecia Aline M. Ferreira | Tarcísio G. S. Filó

[1] Jing Li, et al. The Qualitas Corpus: A Curated Collection of Java Code for Empirical Studies, 2010, 2010 Asia Pacific Software Engineering Conference.

[2] Ricardo Terra, et al. Qualitas.class corpus: a compiled version of the qualitas corpus, 2013, SOEN.

[3] Ewan D. Tempero, et al. Understanding the shape of Java software, 2006, OOPSLA '06.