Toward human-like summaries generated from heterogeneous software artefacts

Automatic text summarisation has drawn considerable interest in the field of software engineering. It can improve the efficiency of software developers, enhance the quality of products, and ensure timely delivery. In this paper, we present our initial work towards automatically generating human-like multi-document summaries from heterogeneous software artefacts. Our analysis of the text properties of 545 human-written summaries from 15 software engineering projects will ultimately guide heuristics searches in the automatic generation of human-like summaries.

[1]  Hans Peter Luhn,et al.  The Automatic Creation of Literature Abstracts , 1958, IBM J. Res. Dev..

[2]  Emily Hill,et al.  Towards automatically generating summary comments for Java methods , 2010, ASE.

[3]  Lori L. Pollock,et al.  Automatic generation of natural language summaries for Java classes , 2013, 2013 21st International Conference on Program Comprehension (ICPC).

[4]  Christoph Treude,et al.  Summarizing and measuring development activity , 2015, ESEC/SIGSOFT FSE.

[5]  Gail C. Murphy,et al.  Automatic Summarization of Bug Reports , 2014, IEEE Transactions on Software Engineering.

[6]  Martin P. Robillard,et al.  Discovering essential code elements in informal documentation , 2013, 2013 35th International Conference on Software Engineering (ICSE).

[7]  Mark Harman,et al.  Search-based software engineering , 2001, Inf. Softw. Technol..

[8]  Krys J. Kochut,et al.  Text Summarization Techniques: A Brief Survey , 2017, International Journal of Advanced Computer Science and Applications.

[9]  Markus Wagner,et al.  Evolutionary many-objective optimization: A quick-start guide , 2015 .

[10]  Geoffrey E. Hinton,et al.  Visualizing Data using t-SNE , 2008 .

[11]  Inderjeet Mani,et al.  The Challenges of Automatic Summarization , 2000, Computer.

[12]  James D. Herbsleb,et al.  Social coding in GitHub: transparency and collaboration in an open software repository , 2012, CSCW.