Social Network Analysis of Developers' and Users' Mailing Lists of Some Free Open Source Software

As reported by Kevin Crowston and co-authors in a recent paper, free open source software is a very important social phenomenon: it involves nearly one million programmers, a myriad of software development firms and millions of users. Its financial impact is also huge; for instance, the cost of recreating the available free software is estimated at tens of billions of euros. Free open source software projects generally have one mailing list for developers and another for users. These numerous mailing lists change constantly and vary greatly in membership and topics covered, which makes them very difficult to monitor. One way of overcoming this Big Data challenge is to identify easily computable global indicators that can be used, for instance, to detect important events. We illustrate this approach with a social network analysis and comparison of the developers' and users' mailing lists of four free open source software projects: CentOS, GnuPG, Mailman and Samba. We show that these mailing lists share some characteristics: the number of messages, the time durations and the interlink times can be fitted using power and lognormal laws with suitable scales and parameters. For the interlink time, the analysis uses the temporal delta density, inspired by the delta density introduced by Viard and Latapy. The similarity between mailing lists also extends to the structure of dominant groups. For the time evolution of the number of messages, GnuPG exhibits a particular behavior. The interpretation of the different parameters gives insight into the membership and the types of topics covered by the mailing lists. The analysis carried out here, together with the similar studies cited in this paper, can therefore be considered a first step towards designing building blocks for monitoring mailing lists.
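As a rough illustration of what fitting power and lognormal laws to quantities such as interlink times can look like, the sketch below applies the standard maximum-likelihood estimators to a list of positive values. It is only a minimal sketch under stated assumptions: the function names, the choice of `x_min` and the synthetic data are hypothetical and are not taken from the paper, whose actual fitting procedure and temporal delta density are not reproduced here.

```python
import math
import random


def fit_power_law(samples, x_min):
    """Maximum-likelihood exponent of a continuous power law p(x) ~ x**(-alpha).

    Only values >= x_min are used. x_min is an assumption supplied by the
    caller; this sketch does not try to estimate it.
    """
    tail = [x for x in samples if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum <= 0:
        raise ValueError("need at least one sample strictly above x_min")
    return 1.0 + len(tail) / log_sum


def fit_lognormal(samples):
    """Maximum-likelihood (mu, sigma) of a lognormal law: mean and
    standard deviation of the logarithms of the samples."""
    logs = [math.log(x) for x in samples]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    return mu, sigma


def synthetic_power_law(n, alpha, x_min, rng):
    """Hypothetical test data drawn by inverse-transform sampling."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-ins for measured interlink times (purely synthetic).
    heavy_tailed = synthetic_power_law(20000, alpha=2.5, x_min=1.0, rng=rng)
    lognormal = [rng.lognormvariate(2.0, 0.5) for _ in range(20000)]
    print("power-law alpha:", fit_power_law(heavy_tailed, x_min=1.0))
    print("lognormal (mu, sigma):", fit_lognormal(lognormal))
```

With real mailing-list data, the inputs would be the observed message counts, durations or interlink times. Deciding which of the two laws describes them better requires a goodness-of-fit comparison that this sketch leaves out.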

[1]  Albert-László Barabási,et al.  Emergence of scaling in random networks , 1999, Science.

[2]  Ahmed E. Hassan,et al.  On the Central Role of Mailing Lists in Open Source Projects: An Exploratory Study , 2009, JSAI-isAI Workshops.

[3]  Premkumar T. Devanbu,et al.  How social Q&A sites are changing knowledge sharing in open source software communities , 2014, CSCW.

[4]  Daniel M. German,et al.  Using software trails to rebuild the evolution of software , 2003 .

[5]  Qinna Wang,et al.  Link prediction and threads in email networks , 2014, 2014 International Conference on Data Science and Advanced Analytics (DSAA).

[6]  Matthieu Latapy,et al.  Les usages épistémiques des réseaux de communication électronique : Le cas de l’Open-Source , 2008 .

[7]  Premkumar T. Devanbu,et al.  Latent social structure in open source projects , 2008, SIGSOFT '08/FSE-16.

[8]  Kouichi Kishida,et al.  Toward an understanding of the motivation of open source software developers , 2003, 25th International Conference on Software Engineering, 2003. Proceedings..

[9]  Eric A. von Hippel,et al.  How Open Source Software Works: 'Free' User-to-User Assistance? , 2000 .

[10]  Mark E. J. Newman,et al.  Power-Law Distributions in Empirical Data , 2007, SIAM Rev..

[11]  Matthieu Latapy,et al.  Identifying roles in an IP network with temporal and structural density , 2014, 2014 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS).

[12]  Derek de Solla Price,et al.  A general theory of bibliometric and other cumulative advantage processes , 1976, J. Am. Soc. Inf. Sci..

[13]  Vicenç Gómez,et al.  Statistical analysis of the social network and discussion threads in slashdot , 2008, WWW.

[14]  Françoise Détienne,et al.  Cross-participants: fostering design-use mediation in an open source software community , 2007, ECCE '07.

[15]  Arie van Deursen,et al.  Communication in open source software development mailing lists , 2013, 2013 10th Working Conference on Mining Software Repositories (MSR).

[16]  Kevin Crowston,et al.  Free/Libre open-source software development: What we know and what we do not know , 2012, CSUR.

[17]  Milan Vojnovic,et al.  Power Law and Exponential Decay of Intercontact Times between Mobile Devices , 2010 .

[18]  Matthieu Latapy,et al.  Multi-level analysis of an interaction network between individuals in a mailing-list , 2007, Ann. des Télécommunications.

[19]  Vicenç Gómez,et al.  Description and Prediction of Slashdot Activity , 2007, 2007 Latin American Web Conference (LA-WEB 2007).

[20]  Michael Mitzenmacher,et al.  A Brief History of Generative Models for Power Law and Lognormal Distributions , 2004, Internet Math..

[21]  Jean-Yves Le Boudec,et al.  Power Law and Exponential Decay of Intercontact Times between Mobile Devices , 2010, IEEE Trans. Mob. Comput..

[22]  Ioannis Stamelos,et al.  Understanding knowledge sharing activities in free/open source software projects: An empirical study , 2008, J. Syst. Softw..

[23]  Patrick Mair,et al.  Content-Based Social Network Analysis of Mailing Lists , 2011 .

[24]  Daniel M. Germán,et al.  Using software trails to reconstruct the evolution of software , 2004, J. Softw. Maintenance Res. Pract..

[25]  Kevin Crowston,et al.  Social dynamics of free and open source team communications , 2006, OSS.