Software Documentation Issues Unveiled

(Good) Software documentation provides developers and users with a description of what a software system does, how it operates, and how it should be used. For example, technical documentation (e.g., an API reference guide) aids developers during evolution and maintenance activities, while a user manual explains how users are to interact with a system. Despite its intrinsic value, the creation and maintenance of documentation are often neglected, which harms its quality and usefulness and ultimately leads to a generally unfavourable view of documentation. Previous studies investigating documentation issues have been based on surveying developers, which naturally leads to a somewhat biased view of the problems affecting documentation. We present a large-scale empirical study in which we mined, analyzed, and categorized 878 documentation-related artifacts stemming from four different sources, namely mailing lists, Stack Overflow discussions, issue repositories, and pull requests. The result is a detailed taxonomy of documentation issues from which we infer a series of actionable proposals for both researchers and practitioners.
