Developer Learning Dynamics in Open Source Software Projects: A Hidden Markov Model Analysis

This work proposes a dynamic model of developer learning in open source software (OSS) projects. A Hidden Markov Model (HMM) is proposed to explain how the code contribution behaviors of OSS developers change as their knowledge of their projects increases. In this model, discrete hidden states represent the unobserved knowledge levels of developers, and their observed code contribution behaviors are modeled as state-dependent. Developers' knowledge levels evolve as they learn about the projects over time. Two modes of learning are considered: learning-by-doing (code development) and learning through interactions with peers. The model is calibrated using data spanning six years for 25 OSS projects and 251 developers hosted at SourceForge. The proposed model identifies three knowledge states (high, medium, and low) and estimates the impact of the two modes of learning on the transitions of developers between the three knowledge states. The model results suggest that developers in the low knowledge state exhibit the greatest inertia, followed by those in the medium and high states. Both modes of learning are found to have varying impact across the three knowledge states. Interactions with peers appear to be an important source of learning for developers in all states. A developer in the low state learns only through participation in threads started by others. Prior code contributions and starting discussions by initiating threads do not impact the knowledge level of a developer in the low state. Initiating threads, participating in threads started by others, and prior code contributions have positive impacts on the knowledge level of a developer in the medium or high state and, hence, influence the developer's long-term code contribution behavior. Explanations for these varying impacts of learning activities on the transitions of developers between the three states are provided. We also find a lack of persistence of knowledge in all states. The HMM describes the data better than a latent class model, which suggests that the learning activities have a long-term, dynamic impact, rather than an immediate, static impact, on the code contribution behavior of a developer.
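
The model structure described above can be illustrated with a minimal Python sketch. It is not the authors' specification or estimation code. It assumes, purely for illustration, that per-period commit counts follow a state-dependent Poisson distribution and that transition probabilities are a multinomial logit in three learning covariates (threads started, threads joined, prior commits). Every function name and every numeric value below is hypothetical and chosen only so the example runs.

```python
import math

STATES = ["low", "medium", "high"]  # hidden knowledge states named in the abstract
N = len(STATES)


def softmax(scores):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def transition_matrix(covariates, intercepts, slopes):
    """Row i holds P(next state | current state i, learning activities).

    Assumption: each row is a multinomial logit in the covariates.
    """
    rows = []
    for i in range(N):
        scores = [
            intercepts[i][j] + sum(b * x for b, x in zip(slopes[i][j], covariates))
            for j in range(N)
        ]
        rows.append(softmax(scores))
    return rows


def poisson_pmf(k, rate):
    """Poisson probability of k commits (assumed emission model)."""
    return math.exp(-rate + k * math.log(rate) - math.lgamma(k + 1))


def log_likelihood(commits, activities, initial, intercepts, slopes, rates):
    """Scaled forward algorithm for one developer's commit sequence.

    Returns the log-likelihood and the filtered state probabilities
    at the final period.
    """
    alpha = [initial[s] * poisson_pmf(commits[0], rates[s]) for s in range(N)]
    scale = sum(alpha)
    loglik = math.log(scale)
    alpha = [a / scale for a in alpha]
    for t in range(1, len(commits)):
        P = transition_matrix(activities[t - 1], intercepts, slopes)
        alpha = [
            sum(alpha[i] * P[i][j] for i in range(N)) * poisson_pmf(commits[t], rates[j])
            for j in range(N)
        ]
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik, alpha


# Hypothetical parameters, not estimates from the paper.
zero = (0.0, 0.0, 0.0)
intercepts = [[2.0, 0.0, -2.0], [0.0, 2.0, 0.0], [-2.0, 0.0, 2.0]]
slopes = [
    [zero, (0.0, 0.5, 0.0), zero],   # low: only joining others' threads matters
    [zero, zero, (0.3, 0.3, 0.1)],   # medium: all three activities matter
    [zero, zero, (0.3, 0.3, 0.1)],   # high: activities help retain the state
]
rates = [0.5, 3.0, 8.0]              # mean commits per period in each state
initial = [0.8, 0.15, 0.05]

# Toy data: (threads started, threads joined, prior commits) per period.
commits = [0, 1, 0, 2, 4, 7]
activities = [(0, 1, 0), (0, 2, 1), (1, 2, 1), (1, 3, 3), (2, 3, 7)]

P0 = transition_matrix(activities[0], intercepts, slopes)
loglik, filtered = log_likelihood(commits, activities, initial, intercepts, slopes, rates)
print("log-likelihood:", loglik)
print("filtered state probabilities:", dict(zip(STATES, filtered)))
```

In a sketch of this kind, the zero slopes in the low-state row encode the qualitative finding that only participation in others' threads moves a low-state developer. Estimating such parameters from data would require maximizing this log-likelihood over all developers, which is omitted here.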
