Find Bugs in Static Bug Finders

Static bug finders have been widely adopted by developers to find bugs in real-world software projects. They apply predefined heuristic static analysis rules to the source code or binary code of a project and report violations of these rules as warnings to be verified. However, the advantages of static bug finders are overshadowed by issues such as missed obvious bugs and false positives. To improve these tools, many techniques have been proposed to filter out reported false positives or to design new static analysis rules. Nevertheless, the under-performance of bug finders can also be caused by incorrect rules in the bug finders themselves, which has not yet been explored. In this work, we propose a differential testing approach to detect bugs in the rules of four widely used static bug finders, i.e., SonarQube, PMD, SpotBugs, and ErrorProne, and conduct a qualitative study of the bugs found. To retrieve paired rules across static bug finders for differential testing, we design a heuristic-based rule mapping method that combines the similarity of rule descriptions with the overlap in the warning information reported by the tools. The experiment on 2,728 open source projects reveals 46 bugs in the static bug finders, of which 24 are fixed or confirmed and the rest are awaiting confirmation. We also summarize 13 bug patterns in the static analysis rules based on their context and root causes, which can serve as a checklist for designing and implementing other rules, in these or in other tools. This study indicates that commonly used static bug finders are not as reliable as might have been expected. It not only demonstrates the effectiveness of our approach, but also highlights the need to continue improving the reliability of static bug finders.

ACM Reference Format: Junjie Wang, Yuchao Huang, Song Wang, and Qing Wang. 2021. Find Bugs in Static Bug Finders. In Proceedings of ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2021). ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3377811.3380380

ISSTA 2021, 12-16 July, 2021, Aarhus, Denmark
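
The rule mapping and differential comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the token-level Jaccard similarity, the thresholds, the "either signal suffices" combination, the `Report` type, and all function names are assumptions chosen only to make the idea concrete.

```python
from typing import Iterable, NamedTuple


class Report(NamedTuple):
    """One warning emitted by a tool (hypothetical, simplified shape)."""
    rule: str
    file: str
    line: int


def description_similarity(desc_a: str, desc_b: str) -> float:
    """Jaccard similarity over lowercase tokens (a stand-in text measure)."""
    tokens_a, tokens_b = set(desc_a.lower().split()), set(desc_b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def locations(reports: Iterable[Report]) -> set:
    """Reduce warnings to the (file, line) locations they point at."""
    return {(r.file, r.line) for r in reports}


def warning_overlap(reports_a: Iterable[Report], reports_b: Iterable[Report]) -> float:
    """Share of the smaller warning set that the other rule also reports."""
    loc_a, loc_b = locations(reports_a), locations(reports_b)
    if not loc_a or not loc_b:
        return 0.0
    return len(loc_a & loc_b) / min(len(loc_a), len(loc_b))


def is_candidate_pair(desc_a, desc_b, reports_a, reports_b,
                      sim_threshold=0.5, overlap_threshold=0.5) -> bool:
    """Assumed heuristic: pair two rules if either signal is strong enough."""
    return (description_similarity(desc_a, desc_b) >= sim_threshold
            or warning_overlap(reports_a, reports_b) >= overlap_threshold)


def inconsistencies(reports_a, reports_b):
    """Differential step: locations flagged by only one of two paired rules."""
    loc_a, loc_b = locations(reports_a), locations(reports_b)
    return sorted(loc_a - loc_b), sorted(loc_b - loc_a)
```

Locations returned by `inconsistencies` are only candidates: each one still needs manual inspection to decide whether one rule has a false negative, the other a false positive, or the two rules simply differ in intent.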
