/*icomment: bugs or bad comments?*/

Commenting source code has long been a common practice in software development. Compared to source code, comments are more direct, descriptive and easy-to-understand. Comments and sourcecode provide relatively redundant and independent information regarding a program's semantic behavior. As software evolves, they can easily grow out-of-sync, indicating two problems: (1) bugs -the source code does not follow the assumptions and requirements specified by correct program comments; (2) bad comments - comments that are inconsistent with correct code, which can confuse and mislead programmers to introduce bugs in subsequent versions. Unfortunately, as most comments are written in natural language, no solution has been proposed to automatically analyze commentsand detect inconsistencies between comments and source code. This paper takes the first step in automatically analyzing commentswritten in natural language to extract implicit program rulesand use these rules to automatically detect inconsistencies between comments and source code, indicating either bugs or bad comments. Our solution, iComment, combines Natural Language Processing(NLP), Machine Learning, Statistics and Program Analysis techniques to achieve these goals. We evaluate iComment on four large code bases: Linux, Mozilla, Wine and Apache. Our experimental results show that iComment automatically extracts 1832 rules from comments with 90.8-100% accuracy and detects 60 comment-code inconsistencies, 33 newbugs and 27 bad comments, in the latest versions of the four programs. Nineteen of them (12 bugs and 7 bad comments) have already been confirmed by the corresponding developers while the others are currently being analyzed by the developers.

[1]  William G. Griswold,et al.  Quickly detecting relevant program invariants , 2000, Proceedings of the 2000 International Conference on Software Engineering. ICSE 2000 the New Millennium.

[2]  Dawson R. Engler,et al.  From uncertainty to belief: inferring the specification within , 2006, OSDI '06.

[3]  James L. Wright,et al.  Source code that talks: an exploration of Eclipse task comments and their implication to repository mining , 2005, ACM SIGSOFT Softw. Eng. Notes.

[4]  Michael Burrows,et al.  Eraser: a dynamic data race detector for multithreaded programs , 1997, TOCS.

[5]  Bjarne Steensgaard,et al.  Points-to analysis in almost linear time , 1996, POPL '96.

[6]  Dan Roth,et al.  A Sequential Model for Multi-Class Classification , 2001, EMNLP.

[7]  Zhenmin Li,et al.  PR-Miner: automatically extracting implicit programming rules and detecting violations in large software code , 2005, ESEC/FSE-13.

[8]  Daniel A. Keim,et al.  On Knowledge Discovery and Data Mining , 1997 .

[9]  Dawson R. Engler,et al.  A system and language for building system-specific, static analyses , 2002, PLDI '02.

[10]  Sharon E. Perl Performance assertion checking , 1993, SOSP '93.

[11]  Dawson R. Engler,et al.  Bugs as deviant behavior: a general approach to inferring errors in systems code , 2001, SOSP.

[12]  Dawson R. Engler,et al.  RacerX: effective, static detection of race conditions and deadlocks , 2003, SOSP '03.

[13]  Samuel T. King,et al.  Debugging Operating Systems with Time-Traveling Virtual Machines (Awarded General Track Best Paper Award!) , 2005, USENIX Annual Technical Conference, General Track.

[14]  金田 重郎,et al.  C4.5: Programs for Machine Learning (書評) , 1995 .

[15]  Dawn Xiaodong Song,et al.  Pi: a path identification mechanism to defend against DDoS attacks , 2003, 2003 Symposium on Security and Privacy, 2003..

[16]  Nicholas Sterling,et al.  WARLOCK - A Static Data Race Analysis Tool , 1993, USENIX Winter.

[17]  David Evans,et al.  Improving Security Using Extensible Lightweight Static Analysis , 2002, IEEE Softw..

[18]  Marc Moens,et al.  Articles Summarizing Scientific Articles: Experiments with Relevance and Rhetorical Status , 2002, CL.

[19]  K. Rustan M. Leino,et al.  Extended static checking , 1998, PROCOMET.

[20]  Dawson R. Engler,et al.  Proceedings of the 5th Symposium on Operating Systems Design and Implementation Cmc: a Pragmatic Approach to Model Checking Real Code , 2022 .

[21]  Ahmed E. Hassan,et al.  Examining the evolution of code comments in PostgreSQL , 2006, MSR '06.

[22]  Andrew C. Myers,et al.  Protecting privacy using the decentralized label model , 2000, Foundations of Intrusion Tolerant Systems, 2003 [Organically Assured and Survivable Information Systems].

[23]  Ian Witten,et al.  Data Mining , 2000 .

[24]  Thomas G. Dietterich What is machine learning? , 2020, Archives of Disease in Childhood.

[25]  Adwait Ratnaparkhi,et al.  A Maximum Entropy Model for Part-Of-Speech Tagging , 1996, EMNLP.

[26]  Tong Li,et al.  Pulse: A Dynamic Deadlock Detection Mechanism Using Speculative Execution , 2005, USENIX Annual Technical Conference, General Track.

[27]  Dan Roth,et al.  The Necessity of Syntactic Parsing for Semantic Role Labeling , 2005, IJCAI.

[28]  Bei Yu,et al.  A cross-collection mixture model for comparative text mining , 2004, KDD.

[29]  George C. Necula,et al.  SafeDrive: safe and recoverable extensions using language-based techniques , 2006, OSDI '06.

[30]  Jong-Deok Choi,et al.  Efficient flow-sensitive interprocedural computation of pointer-induced aliases and side effects , 1993, POPL '93.

[31]  J. Ross Quinlan,et al.  C4.5: Programs for Machine Learning , 1992 .

[32]  Dan Roth,et al.  The Use of Classifiers in Sequential Inference , 2001, NIPS.

[33]  William E. Howden,et al.  Comments analysis and programming errors , 1990, IEEE Transactions on Software Engineering.

[34]  Scott N. Woodfield,et al.  The effect of modularization and comments on program comprehension , 1981, ICSE '81.

[35]  Hinrich Schütze,et al.  Book Reviews: Foundations of Statistical Natural Language Processing , 1999, CL.

[36]  Marcos K. Aguilera,et al.  Performance debugging for distributed systems of black boxes , 2003, SOSP '03.