Does Pareto's Law Apply to Evidence Distribution in Software Engineering? An Initial Report

Data are both the source and the raw form of evidence. As an important research methodology in evidence-based software engineering, systematic literature reviews (SLRs) are used to identify and critically appraise evidence, i.e., empirical studies that report data about specific research questions. The 80/20 Rule (or Pareto's Law) describes a 'vital few' phenomenon that was widely observed in many disciplines over the last century. However, the applicability of Pareto's Law to evidence distribution in software engineering (SE) has not yet been tested. The objective of this paper is to investigate whether Pareto's Law applies to the distribution of evidence on specific topic areas in SE (as captured by systematic reviews), which may help us better understand how evidence is distributed in SE and improve the effectiveness and efficiency of literature searches. We performed a tertiary study of SLRs in SE dated between 2004 and 2012. We then tested Pareto's Law by collecting, analyzing, and interpreting the distribution, over publication venues, of the primary studies reported in the existing SLRs. Our search identified 255 SLRs, 107 of which were included according to the selection criteria. The analysis of the data extracted from these SLRs presents a preliminary view of the evidence (study) distribution in SE. The data support a nonuniform distribution of evidence. However, the observed 'vital few' relation between studies and venues is weaker than the 80/20 Rule states. We suggest the most frequently referenced venues as starting points when researchers search for studies in SE. We also note that the primary studies are improperly or incompletely reported in many SLRs.
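As a rough illustration of the kind of check described above, the following sketch computes the share of primary studies that fall in the top fraction of publication venues. It is not the authors' analysis procedure: the function name `top_venue_share`, the `venue_fraction` parameter, and the venue labels are all hypothetical, and the data are made up for the example.

```python
from collections import Counter
import math


def top_venue_share(venues, venue_fraction=0.2):
    """Share of studies published in the top `venue_fraction` of venues.

    `venues` holds one venue label per primary study (hypothetical input).
    A strict 80/20 relation would give a value near 0.8 for the default.
    """
    counts = Counter(venues)
    if not counts:
        raise ValueError("no studies given")
    # Study counts per venue, largest first.
    ranked = sorted(counts.values(), reverse=True)
    # Number of venues in the top group; the small epsilon guards against
    # floating-point overshoot before rounding up.
    k = max(1, math.ceil(venue_fraction * len(ranked) - 1e-9))
    return sum(ranked[:k]) / sum(ranked)


# Made-up data: 11 studies spread over 5 venues.
studies = ["A"] * 6 + ["B"] * 2 + ["C", "D", "E"]
share = top_venue_share(studies)
print(f"Top 20% of venues hold {share:.0%} of the studies")
```

In this toy data set the top 20% of venues hold about 55% of the studies, which is the kind of nonuniform but weaker-than-80/20 pattern the abstract reports.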
