How do open source communities blog?

We report on an exploratory study that aims to understand how software communities use blogs compared to conventional development infrastructures. We analyzed the behavior of 1,100 bloggers in four large open source communities, distinguishing between committing bloggers and other community members. We observed that these communities use blogs intensively, with one new entry every 8 hours. A blog entry contains 14 times more words than a commit message. When analyzing the content of the blogs, we found that committers and other bloggers write about similar topics. The most popular topics in committers’ blogs represent high-level concepts such as features and domain concepts, while source-code-related topics are discussed in 15% of their posts. Other community members frequently write about community events and conferences as well as configuration and deployment topics. We found that the blogging peak period is usually after the software is released. Moreover, committers are more likely to blog after corrective engineering than after forward engineering and re-engineering activities. Our findings call for hypothesis-driven research to (a) further understand the role of social media in dissolving the collaboration boundaries between developers and other stakeholders and (b) integrate social media into development processes and tools.
