What information about code snippets is available in different software-related documents? An exploratory study

A large corpora of software-related documents is available on the Web, and these documents offer the unique opportunity to learn from what developers are saying or asking about the code snippets that they are discussing. For example, the natural language in a bug report provides information about what is not functioning properly in a particular code snippet. Previous research has mined information about code snippets from bug reports, emails, and Q&A forums. This paper describes an exploratory study into the kinds of information that is embedded in different software-related documents. The goal of the study is to gain insight into the potential value and difficulty of mining the natural language text associated with the code snippets found in a variety of software-related documents, including blog posts, API documentation, code reviews, and public chats.

[1]  Rebecca Tiarks,et al.  How does a typical tutorial for mobile development look like? , 2014, MSR 2014.

[2]  Gerardo Canfora,et al.  CODES: mining source code descriptions from developers discussions , 2014, ICPC 2014.

[3]  Gerardo Canfora,et al.  Mining source code descriptions from developer communications , 2012, 2012 20th IEEE International Conference on Program Comprehension (ICPC).

[4]  Martin P. Robillard,et al.  Discovering Information Explaining API Types Using Text Classification , 2015, 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering.

[5]  Andy Zaidman,et al.  Modern code reviews in open-source projects: which problems do they fix? , 2014, MSR 2014.

[6]  Jinqiu Yang,et al.  AutoComment: Mining question and answer sites for automatic comment generation , 2013, 2013 28th IEEE/ACM International Conference on Automated Software Engineering (ASE).

[7]  Christoph Treude,et al.  Augmenting API Documentation with Insights from Stack Overflow , 2016, 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE).

[8]  Alberto Bacchelli,et al.  Extracting Source Code from E-Mails , 2010, 2010 IEEE 18th International Conference on Program Comprehension.