A Comparison of Three Algorithms for Computing Truck Factors

Truck Factor (also known as Bus Factor or Lottery Number) is the minimal number of developers that have to be hit by a truck (or leave) before a project is incapacitated. It is therefore a measure that reveals how concentrated knowledge is in a project and who its key developers are. Because this information is important to project managers, algorithms have been proposed to compute Truck Factors automatically, using maintenance activity data extracted from version control systems. However, to the best of our knowledge, we still lack studies that compare the accuracy of the results produced by such algorithms. In this paper, we therefore evaluate and compare the results of three Truck Factor algorithms. To this end, we empirically determine the Truck Factors of 35 open-source systems by consulting their developers. Our results show that two algorithms are very accurate, especially when the systems have a small Truck Factor. We also evaluate the impact of different thresholds and configurations on the algorithms' results.
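To make the definition concrete, the following is a minimal sketch of one greedy way to estimate a Truck Factor from authorship data. It is an illustration only and is not claimed to be any of the three algorithms evaluated here. The function name `truck_factor`, the `file_authors` mapping (file path to the set of developers considered its authors), and the 50% orphaned-files threshold are all assumptions made for this example.

```python
from collections import Counter


def truck_factor(file_authors, coverage_threshold=0.5):
    """Greedy Truck Factor estimate (illustrative assumption, not the paper's method).

    Repeatedly removes the developer who authors the most files until the
    fraction of files with no remaining author exceeds coverage_threshold.
    Returns the number of developers removed.
    """
    # Copy the input so the caller's data is not modified.
    remaining = {path: set(authors) for path, authors in file_authors.items()}
    total = len(remaining)
    if total == 0:
        return 0

    removed = 0
    while True:
        # A file is "orphaned" once none of its authors are left.
        orphaned = sum(1 for authors in remaining.values() if not authors)
        if orphaned / total > coverage_threshold:
            return removed

        counts = Counter(dev for authors in remaining.values() for dev in authors)
        if not counts:
            # No developers left to remove (only reachable with a threshold >= 1).
            return removed

        # Pick the developer covering the most files; ties break alphabetically.
        top_dev = max(sorted(counts), key=counts.__getitem__)
        for authors in remaining.values():
            authors.discard(top_dev)
        removed += 1


# Hypothetical authorship data for a four-file project.
example = {
    "a.py": {"alice"},
    "b.py": {"alice", "bob"},
    "c.py": {"alice"},
    "d.py": {"carol"},
}
print(truck_factor(example))
```

In this hypothetical project, removing alice orphans exactly half of the files, which does not exceed the threshold, so a second developer must be removed before the project counts as incapacitated. Changing `coverage_threshold` changes the estimate, which is the kind of sensitivity to thresholds the abstract refers to.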
