SentiCR: A customized sentiment analysis tool for code review interactions

Sentiment analysis tools developed for social media text or product reviews perform poorly on Software Engineering (SE) datasets. Since prior studies have found that developers express sentiments during various SE activities, there is a need for a sentiment analysis tool customized for the SE domain. To this end, we manually labeled 2000 review comments to build a training dataset and used it to evaluate seven popular sentiment analysis tools. The poor performance of these existing tools motivated us to build SentiCR, a sentiment analysis tool designed specifically for code review comments. We evaluated SentiCR using one hundred 10-fold cross-validations of eight supervised learning algorithms. A model trained with the Gradient Boosting Tree (GBT) algorithm achieved the highest mean accuracy (83%), the highest mean precision (67.8%), and the highest mean recall (58.4%) in identifying negative review comments.
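
The evaluation protocol described above (repeated 10-fold cross-validation, reporting mean accuracy plus precision and recall for the negative class) can be sketched as follows. This is a minimal illustration, not SentiCR's implementation: the label encoding (`-1` for negative, `0` for non-negative), the function names, and the keyword baseline standing in for a trained classifier such as GBT are all assumptions made for the example.

```python
import random
from statistics import mean


def k_fold_indices(n, k, rng):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def negative_class_metrics(y_true, y_pred, negative=-1):
    """Return overall accuracy, plus precision and recall for the negative class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == negative and p == negative)
    fp = sum(1 for t, p in pairs if t != negative and p == negative)
    fn = sum(1 for t, p in pairs if t == negative and p != negative)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


def repeated_cross_validation(texts, labels, train_fn, k=10, repeats=100, seed=0):
    """Run `repeats` rounds of k-fold CV and return the mean of each metric.

    `train_fn(train_texts, train_labels)` is a hypothetical hook that must
    return a function mapping one text to a predicted label.
    """
    if len(texts) < k:
        raise ValueError("need at least k samples")
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        for fold in k_fold_indices(len(texts), k, rng):
            held_out = set(fold)
            train_x = [texts[i] for i in range(len(texts)) if i not in held_out]
            train_y = [labels[i] for i in range(len(texts)) if i not in held_out]
            predict = train_fn(train_x, train_y)
            y_true = [labels[i] for i in fold]
            y_pred = [predict(texts[i]) for i in fold]
            scores.append(negative_class_metrics(y_true, y_pred))
    return tuple(mean(column) for column in zip(*scores))


def train_keyword_baseline(train_x, train_y):
    """Placeholder learner (an assumption, not GBT): flag a comment as negative
    if it contains a word seen only in negative training comments."""
    neg_words, other_words = set(), set()
    for text, label in zip(train_x, train_y):
        (neg_words if label == -1 else other_words).update(text.lower().split())
    cues = neg_words - other_words
    return lambda text: -1 if cues & set(text.lower().split()) else 0
```

Calling `repeated_cross_validation(texts, labels, train_keyword_baseline)` on a labeled set of comments returns the mean accuracy, precision, and recall over all folds and repetitions. Swapping in a different `train_fn` is how the eight algorithms would be compared under the same protocol.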
