Inequalities in Open Source Software Development: Analysis of Contributor’s Commits in Apache Software Foundation Projects

While researchers are becoming increasingly interested in studying the OSS phenomenon, only a small number of studies analyze larger samples of projects to investigate the structure of activities among OSS developers. The significant amount of information gathered in publicly available open-source software repositories and mailing-list archives offers an opportunity to analyze project structures and participant involvement. In this article, using commit data from 263 Apache project repositories (nearly all), we show that although OSS development is often described as collaborative, it in fact predominantly relies on radically solitary input and individual, non-collaborative contributions. We also show, in the first published study of this magnitude, that the engagement of contributors follows a power-law distribution.
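The abstract does not state which inequality measure the analysis uses. As a minimal sketch of how the concentration of commits among contributors might be quantified, the Python below computes a Gini coefficient and the share of commits held by the most active contributors. The function names (`commits_per_author`, `gini`, `top_share`) and the sample data are hypothetical and not taken from the article.

```python
from collections import Counter


def commits_per_author(authors):
    """Count commits per author from an iterable of author identifiers."""
    return Counter(authors)


def gini(values):
    """Gini coefficient of non-negative values.

    0 means every contributor made the same number of commits;
    values approaching 1 mean a few contributors made almost all of them.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Formula for data sorted ascending, with ranks i = 1..n:
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def top_share(values, fraction=0.1):
    """Share of all commits made by the top `fraction` of contributors."""
    xs = sorted(values, reverse=True)
    k = max(1, round(len(xs) * fraction))
    return sum(xs[:k]) / sum(xs)


if __name__ == "__main__":
    # Hypothetical commit log: one author identifier per commit.
    log = ["alice"] * 8 + ["bob"] + ["carol"]
    counts = commits_per_author(log)
    print("Gini:", gini(counts.values()))
    print("Share of top third:", top_share(counts.values(), fraction=1 / 3))
```

In practice the author list could come from a repository's commit history, for example one author e-mail per commit. A Gini value close to 1 across many projects would be consistent with the solitary, highly skewed contribution pattern the abstract describes, although it does not by itself establish a power law.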
