On the Terms Within- and Cross-Company in Software Effort Estimation

Background: The terms Within-Company (WC) and Cross-Company (CC) in Software Effort Estimation (SEE) carry the connotation that CC projects differ considerably from WC projects, and that WC projects are more similar to the projects being estimated. However, WC projects can themselves be heterogeneous, so this is not always the case. The use of the terms WC and CC has therefore been questioned as potentially misleading and possibly unhelpful.

Aims: To raise the SEE community's awareness of the problems presented by the terms WC and CC, and to encourage discussion of whether these terms are appropriate.

Method: Existing literature on CC and WC SEE is discussed to identify evidence for and against the use of these terms.

Results: Existing evidence suggests that the terms WC and CC are helpful, because distinguishing between WC and CC projects can improve the predictive performance of SEE models. However, because of their connotation, the terms can be misleading and may lead to wrong conclusions in studies comparing WC and CC SEE models.

Conclusions: The issue being tackled when investigating WC and CC SEE is heterogeneity, not the different origins of the software projects per se. Given that the terms WC and CC can be misleading, researchers are encouraged to discuss and consider the problems these terms present in SEE papers. Labelling projects as "potentially homogeneous" and "potentially heterogeneous" may be safer than labelling them directly as WC and CC projects.
