CPC: Automatically Classifying and Propagating Natural Language Comments via Program Analysis

Code comments provide abundant information that has been leveraged to help perform various software engineering tasks, such as bug detection, specification inference, and code synthesis. However, developers are often not motivated to write and update comments, which makes leveraging comments for these tasks infeasible or error-prone. In this paper, we propose to leverage program analysis to systematically derive, refine, and propagate comments. For example, through propagation via program analysis, comments can be passed on to code entities that are not commented, so that bugs can be detected using the propagated comments. Developers usually comment on different aspects of code elements such as methods, and use comments to describe various kinds of content, such as functionalities and properties. Utilizing comments more effectively therefore requires a fine-grained, elaborate taxonomy of comments and a reliable classifier that automatically categorizes a comment. We build such a comprehensive taxonomy and use program analysis to propagate the classified comments. We develop a prototype, CPC, and evaluate it on 5 projects. The evaluation results demonstrate that 41,573 new comments can be derived by propagation from other code locations with 88% accuracy. Among them, we derive precise functional comments for 87 native methods that have neither existing comments nor source code. Leveraging the propagated comments, we detect 37 new bugs in large open-source projects, 30 of which have been confirmed and fixed by developers, and 304 defects in existing comments (found by looking at inconsistencies between existing and propagated comments), including 12 incomplete comments and 292 wrong comments. This demonstrates the effectiveness of our approach. Our user study confirms that propagated comments align well with existing comments in terms of quality.
