Bugram: Bug detection with n-gram language models

To improve software reliability, many rule-based techniques have been proposed that infer programming rules and detect violations of those rules as bugs. These approaches typically rely on patterns appearing frequently in a project: if a pattern does not appear often enough, no rule is learned and the corresponding bugs are missed. In this paper, we propose a new approach, Bugram, that leverages n-gram language models instead of rules to detect bugs. Bugram models program tokens sequentially with an n-gram language model, assesses token sequences from the program by their probability under the learned model, and reports low-probability sequences as potential bugs. The assumption is that low-probability token sequences are unusual, which may indicate bugs, bad practices, or special uses of code that developers may want to be aware of. We evaluate Bugram in two ways. First, we apply it to the latest versions of 16 open-source Java projects. Bugram reports 59 potential bugs, 42 of which we manually verify as correct: 25 are true bugs and 17 are code snippets that should be refactored. Of the 25 true bugs, 23 cannot be detected by PR-Miner. We have reported these bugs to the developers; 7 have been confirmed (4 of them already fixed), while the rest await confirmation. Second, we compare Bugram with three additional graph- and rule-based bug detection tools, i.e., JADET, Tikanga, and GrouMiner, by applying it to 14 Java projects evaluated in those three studies. Bugram detects 21 true bugs, at least 10 of which cannot be detected by these three tools. Our results suggest that Bugram is complementary to existing rule-based bug detection approaches.
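To make the scoring step concrete, the following is a minimal Python sketch of the idea, not Bugram's actual implementation: the helper names (train_ngram, sequence_logprob) are invented for this example, add-one smoothing is an assumed choice, and a toy bigram model over call sequences stands in for the higher-order models, Java tokenization, and ranking thresholds the paper describes.

    from collections import defaultdict
    import math

    def train_ngram(corpus, n=2):
        """Count n-grams and their (n-1)-gram contexts over token sequences."""
        counts, contexts = defaultdict(int), defaultdict(int)
        vocab = set()
        for tokens in corpus:
            padded = ["<s>"] * (n - 1) + tokens + ["</s>"]
            vocab.update(padded)
            for i in range(len(padded) - n + 1):
                gram = tuple(padded[i:i + n])
                counts[gram] += 1
                contexts[gram[:-1]] += 1
        return counts, contexts, len(vocab)

    def sequence_logprob(tokens, counts, contexts, vocab_size, n=2):
        """Average per-token log-probability with add-one smoothing."""
        padded = ["<s>"] * (n - 1) + tokens + ["</s>"]
        total, steps = 0.0, len(padded) - n + 1
        for i in range(steps):
            gram = tuple(padded[i:i + n])
            total += math.log((counts[gram] + 1) /
                              (contexts[gram[:-1]] + vocab_size))
        return total / steps  # length-normalized, so sequences are comparable

    # Toy "project": token sequences mined from code, e.g. API call sequences.
    corpus = [
        ["open", "read", "close"],
        ["open", "read", "close"],
        ["open", "write", "close"],
        ["lock", "update", "unlock"],
    ]
    counts, contexts, V = train_ngram(corpus)

    # Rank candidate sequences; the lowest-probability ones are reported first.
    candidates = [["open", "read", "close"], ["open", "read"]]  # latter omits close
    for seq in sorted(candidates,
                      key=lambda s: sequence_logprob(s, counts, contexts, V)):
        print(seq, round(sequence_logprob(seq, counts, contexts, V), 3))

In this sketch the truncated sequence scores lower than the common open-read-close pattern and so surfaces at the top of the report. Normalizing the log-probability by sequence length keeps short sequences and long sequences comparable in the ranking; how Bugram itself selects n, sequence lengths, and reporting thresholds is described in the paper.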

[1] Premkumar T. Devanbu, et al. On the naturalness of software, 2016, Commun. ACM.

[2] Andy Podgurski, et al. Extending static analysis by mining project-specific rules, 2012, International Conference on Software Engineering (ICSE).

[3] Jian Pei, et al. Mining API patterns as partial orders from source code: from usage scenarios to specifications, 2007, ESEC-FSE '07.

[4] Jonathan I. Maletic, et al. An approach to mining call-usage patterns with syntactic context, 2007, ASE.

[5] Eran Yahav, et al. Static Specification Mining Using Automata-Based Abstractions, 2007, IEEE Transactions on Software Engineering.

[6] Yuping Wang, et al. PF-Miner: A New Paired Functions Mining Method for Android Kernel in Error Paths, 2014, IEEE Annual Computer Software and Applications Conference.

[7] Yan Zhang, et al. AntMiner: Mining More Bugs by Reducing Noise Interference, 2016, IEEE/ACM International Conference on Software Engineering (ICSE).

[8] Chadd C. Williams, et al. Automatic mining of source code repositories to improve bug finding techniques, 2005, IEEE Transactions on Software Engineering.

[9] Latifur Khan, et al. Software Fault Localization Using N-gram Analysis, 2008, WASA.

[10] Ross J. Anderson, et al. Rendezvous: a search engine for binary code, 2013, Working Conference on Mining Software Repositories (MSR).

[11] Yuanyuan Zhou, et al. /* iComment: bugs or bad comments? */, 2007, SOSP.

[12] Mira Mezini, et al. Detecting missing method calls as violations of the majority rule, 2013, ACM Transactions on Software Engineering and Methodology (TOSEM).

[13] Andreas Zeller, et al. Detecting object usage anomalies, 2007, ESEC-FSE '07.

[14] David Hovemeyer, et al. Finding bugs is easy, 2004, SIGPLAN Notices.

[15] Anh Tuan Nguyen, et al. A statistical semantic language model for source code, 2013, ESEC/FSE 2013.

[16] Hoan Anh Nguyen, et al. Graph-based mining of multiple object usage patterns, 2009, ESEC/FSE '09.

[17] Martin Fowler, et al. Refactoring: Improving the Design of Existing Code, 1999, Addison-Wesley Object Technology Series.

[18] Andreas Zeller, et al. Mining temporal specifications from object usage, 2011, Automated Software Engineering.

[19] Jian Pei, et al. MAPO: mining API usages from open source repositories, 2006, MSR '06.

[20] Premkumar T. Devanbu, et al. On the localness of software, 2014, SIGSOFT FSE.

[21] Kai-Yuan Cai, et al. GUI Software Fault Localization Using N-gram Analysis, 2011, IEEE International Symposium on High-Assurance Systems Engineering (HASE).

[22] Suman Saha, et al. Hector: Detecting Resource-Release Omission Faults in error-handling code for systems software, 2013, IEEE/IFIP International Conference on Dependable Systems and Networks (DSN).

[23] Premkumar T. Devanbu, et al. Will They Like This? Evaluating Code Contributions with Language Models, 2015, IEEE/ACM Working Conference on Mining Software Repositories (MSR).

[24] Tomoki Toda, et al. Learning to Generate Pseudo-Code from Source Code Using Statistical Machine Translation, 2015, IEEE/ACM International Conference on Automated Software Engineering (ASE).

[25] Eran Yahav, et al. Code completion with statistical language models, 2014, PLDI.

[26] Manuvir Das, et al. Perracotta: mining temporal API rules from imperfect traces, 2006, ICSE.

[27] Premkumar T. Devanbu, et al. On the "naturalness" of buggy code, 2015, ICSE.

[28] Sudheendra Hangal, et al. Tracking down software bugs using automatic anomaly detection, 2002, ICSE '02.

[29] Chadd C. Williams, et al. Recovering system specific rules from software repositories, 2005, MSR '05.

[30] D. Moher, et al. An alternative to the hand searching gold standard: validating methodological search filters using relative recall, 2006, BMC Medical Research Methodology.

[31] Yuanyuan Zhou, et al. CP-Miner: A Tool for Finding Copy-paste and Related Bugs in Operating System Code, 2004, OSDI.

[32] Kai Chen, et al. Mining succinct and high-coverage API usage patterns from source code, 2013, Working Conference on Mining Software Repositories (MSR).

[33] Mauricio A. Saca. Refactoring: improving the design of existing code, 2017, IEEE Central America and Panama Convention (CONCAPAN XXXVII).

[34] Jeffrey K. Hollingsworth, et al. Automatic Mining of Source Code Repositories to Improve Bug Finding Techniques, 2005.

[35] Gregory Fried. Conclusion: Where Do We Go from Here?, 2000.

[36] Stephen McCamant, et al. The Daikon system for dynamic detection of likely invariants, 2007, Sci. Comput. Program.

[37] Jun Sun, et al. Detection and classification of malicious JavaScript via attack behavior modelling, 2015, ISSTA.

[38] Ian H. Witten, et al. The WEKA data mining software: an update, 2009, SIGKDD Explorations.

[39] Jiong Yang, et al. Finding what's not there: a new approach to revealing neglected conditions in software, 2007, ISSTA '07.

[40] Lalit R. Bahl, et al. A tree-based statistical language model for natural language speech recognition, 1989, IEEE Trans. Acoust. Speech Signal Process.

[41] Tao Xie, et al. Alattin: mining alternative patterns for defect detection, 2011, Automated Software Engineering.

[42] David Lo, et al. An automated approach for finding variable-constant pairing bugs, 2010, ASE '10.

[43] Rob Miller, et al. Code Completion from Abbreviated Input, 2009, IEEE/ACM International Conference on Automated Software Engineering (ASE).

[44] Suresh Jagannathan, et al. Path-Sensitive Inference of Function Precedence Protocols, 2007, International Conference on Software Engineering (ICSE).

[45] Dawson R. Engler, et al. Bugs as deviant behavior: a general approach to inferring errors in systems code, 2001, SOSP.

[46] Thomas R. Gross, et al. Automatic Generation of Object Usage Specifications from Large Method Traces, 2009, IEEE/ACM International Conference on Automated Software Engineering (ASE).

[47] Zhenmin Li, et al. PR-Miner: automatically extracting implicit programming rules and detecting violations in large software code, 2005, ESEC/FSE-13.

[48] Benjamin Livshits, et al. DynaMine: finding common error patterns by mining software revision histories, 2005, ESEC/FSE-13.

[49] Tao Xie, et al. Mining exception-handling rules as sequence association rules, 2009, IEEE International Conference on Software Engineering (ICSE).

[50] R. Rosenfeld, et al. Two decades of statistical language modeling: where do we go from here?, 2000, Proceedings of the IEEE.

[51] Grigore Rosu, et al. Maximal sound predictive race detection with control flow abstraction, 2014, PLDI.

[52] William W. Cohen, et al. Natural Language Models for Predicting Programming Comments, 2013, ACL.

[53] Satish Narayanasamy, et al. Using web corpus statistics for program analysis, 2014, OOPSLA.

[54] Hinrich Schütze, et al. Foundations of Statistical Natural Language Processing, 1999, MIT Press.

[55] Gary T. Leavens, et al. @tComment: Testing Javadoc Comments to Detect Comment-Code Inconsistencies, 2012, IEEE International Conference on Software Testing, Verification and Validation (ICST).

[56] Eugene Charniak, et al. Statistical language learning, 1997.

[57] Andreas Zeller, et al. Learning from 6,000 projects: lightweight cross-project anomaly detection, 2010, ISSTA '10.

[58] Martin White, et al. Toward Deep Learning Software Repositories, 2015, IEEE/ACM Working Conference on Mining Software Repositories (MSR).

[59] Charles A. Sutton, et al. Learning natural coding conventions, 2014, SIGSOFT FSE.

[60] José Nelson Amaral, et al. Syntax errors just aren't natural: improving error reporting with language models, 2014, MSR.