Exploratory Study of Slack Q&A Chats as a Mining Source for Software Engineering Tools

Modern software development communities are increasingly social. Popular chat platforms such as Slack host public chat communities that focus on specific development topics such as Python or Ruby-on-Rails. Conversations in these public chats often follow a Q&A format, with someone seeking information and others providing answers in chat form. In this paper, we describe an exploratory study into the potential usefulness and challenges of mining developer Q&A conversations for supporting software maintenance and evolution tools. We designed the study to investigate the availability of information that has been successfully mined from other developer communications, particularly Stack Overflow. We also analyze characteristics of chat conversations that might inhibit accurate automated analysis. Our results indicate the prevalence of useful information, including API mentions and code snippets with descriptions, and several hurdles that need to be overcome to automate mining that information.
