Open Source Software Development and Lotka's Law: Bibliometric Patterns in Programming

This research applies Lotka's Law to metadata on open source software development. Lotka's Law predicts the proportion of authors at different levels of productivity. Open source software development harnesses the creativity of thousands of programmers worldwide and is important to the progress of the Internet and many other computing environments, yet it has not been widely researched. We examine metadata from the Linux Software Map (LSM), which documents many open source projects, and from SourceForge, one of the largest resources for open source developers. The authoring patterns found are comparable to those reported in prior studies of Lotka's Law for scientific and scholarly publishing. Lotka's Law was found to be effective in understanding software development productivity patterns, and it offers promise in predicting the aggregate behavior of open source developers.
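As a rough illustration of what Lotka's Law predicts, the sketch below computes the expected share of authors who make exactly n contributions, taking that share to be proportional to 1/n^a. It assumes the classic inverse-square form (a = 2) and a truncated normalization over 1..max_n; the function name `lotka_shares` and the parameter values are hypothetical and are not taken from this study, whose fitted exponents are not given in the abstract.

```python
import math


def lotka_shares(max_n, exponent=2.0):
    """Expected share of authors with n = 1..max_n contributions.

    Assumption: share(n) is proportional to 1 / n**exponent, normalized
    over the truncated range 1..max_n so that the shares sum to 1.
    """
    weights = [1.0 / n ** exponent for n in range(1, max_n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    shares = lotka_shares(1000)
    # Print the predicted share for the first few productivity levels.
    for n, share in enumerate(shares[:5], start=1):
        print(f"authors with {n} contribution(s): {share:.4f}")
```

With an exponent of 2 and a large max_n, the share of single-contribution authors approaches 6/π², or roughly 61 percent, and the share with two contributions is one quarter of that. Fitting the exponent to observed data, as empirical studies of Lotka's Law do, would replace the fixed default used here.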
