Monitoring Sentiment in Open Source Mailing Lists: Exploratory Study on the Apache Ecosystem

Large software projects, both open and closed source, are built and maintained collaboratively by teams of developers and testers who are typically geographically dispersed. This dispersion puts distance between team members and hides feelings of distress or (un)happiness from their manager, who therefore cannot apply remediation techniques for those feelings. This paper evaluates the use of automatic sentiment analysis to identify distress or happiness in a development team. Since mailing lists are one of the most popular media for discussion in distributed software projects, we extracted sentiment values from the user and developer mailing lists of two of the most successful and mature projects of the Apache Software Foundation. The results show that (1) user and developer mailing lists carry both positive and negative sentiment and have a slightly different focus, while (2) work is needed to customize automatic sentiment analysis techniques to the domain of software engineering, since they lack precision when facing technical terms.
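
To illustrate why general-purpose sentiment analysis can lose precision on technical terms, the following is a minimal sketch of lexicon-based scoring. It is not the tool or lexicon used in the study: `TOY_LEXICON`, its scores, `score_message`, and both sample messages are hypothetical and chosen only for illustration. Each message receives a pair of scores, the strongest positive and the strongest negative word it contains.

```python
import re

# Hypothetical toy lexicon (word -> sentiment score); not taken from the paper.
TOY_LEXICON = {
    "thanks": 2, "great": 3, "happy": 3, "works": 1,
    "sorry": -1, "annoying": -3, "broken": -2,
    "kill": -4, "fatal": -4, "error": -2,
}


def score_message(text, lexicon=TOY_LEXICON):
    """Return (strongest positive score, strongest negative score) for a message."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = [lexicon.get(word, 0) for word in words]
    positive = max((s for s in scores if s > 0), default=0)
    negative = min((s for s in scores if s < 0), default=0)
    return positive, negative


# Hypothetical mailing-list messages.
emotional = "Thanks, this is great, but the broken build is annoying."
technical = "Use kill -9 on the process if you see a fatal error in the log."

print(score_message(emotional))
print(score_message(technical))
```

Under these assumptions, the neutral how-to message is scored as strongly negative only because "kill" and "fatal" carry negative weight in everyday language. A domain-adapted lexicon would need to neutralize such terms.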
