Identifying Metrics' Biases When Measuring or Approximating Size in Heterogeneous Languages

Context: To compare the effectiveness of development techniques, the size of the compared software systems needs to be taken into account. However, in industry, new development techniques often come with changes in the programming languages used. Goal: Our goal is to investigate how different size metrics and approximations are biased towards the languages C and C++. Further, we investigate whether triangulation of metrics has the potential to compensate for these biases. Method: We identify crucial preconditions for triangulation and investigate, on 34 open source projects, whether a set of 16 size metrics fulfills these preconditions for the languages C and C++. Results: We identify how metrics differ in their biases and find that the preconditions for triangulation are fulfilled. Conclusion: Triangulation has the potential to address language biases, but the high variance among metrics and tools needs to be taken into account, too.
