Herding a Deluge of Good Samaritans: How GitHub Projects Respond to Increased Attention

Collaborative crowdsourcing is a well-established model of work, especially in open source software development. These virtual, loosely knit teams are structured and operate differently from traditional organizations, and little is known about how their behavior changes in response to an increase in external attention. To understand these dynamics, we analyze millions of actions by thousands of contributors in more than 1,100 open source software projects that topped the GitHub Trending Projects page and thus experienced a large increase in attention, and we compare them with a control group of projects identified through propensity score matching. We approach the question through the lens of organizational change, which considers the challenges teams face during rapid growth and how they adapt their work routines, organizational structure, and management style. We show that trending leads to explosive growth in effective team size. However, most newcomers make only shallow and transient contributions. In response, the original team transitions towards administrative roles, responding to requests and reviewing work done by newcomers. Projects evolve towards a more distributed coordination model in which newcomers become more central, albeit in limited ways. Additionally, teams become more modular, with subgroups specializing in different aspects of the project. We discuss broader implications for collaborative crowdsourcing teams that face attention shocks.
