A Framework to Compare Alert Ranking Algorithms

To improve software quality, rule checkers statically verify whether software contains violations of good programming practices. On a real-sized system, the alerts (rule violations detected by the tool) may number in the thousands. Unfortunately, these tools generate a high proportion of "false alerts", which, in the context of a specific piece of software, should not be fixed. Huge numbers of false alerts may make it impossible to find and correct the "true alerts" and may dissuade developers from using these tools. To overcome this problem, the literature offers various ranking methods that aim to compute the probability that an alert is a "true" one. In this paper, we propose a framework for comparing these ranking algorithms and identifying the best approach to rank alerts. We selected six algorithms described in the literature. For the comparison, we use a benchmark covering two programming languages (Java and Smalltalk) and three rule checkers (FindBugs, PMD, SmallLint). Results show that the best ranking methods are based on the history of past alerts and their location. We could not identify any significant advantage in using statistical tools, such as linear regression or Bayesian networks, or ad hoc methods.
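To make the history- and location-based idea concrete, the sketch below scores each new alert by the smoothed proportion of past alerts, for the same rule and at the same location, that developers actually fixed, and then sorts alerts by that score. It is a minimal sketch under stated assumptions: the class names (AlertRanker, Alert), the per-rule and per-location aggregation, the Laplace-smoothed ratio, and the equal weighting of the two ratios are illustrative choices, not the exact algorithms compared in the paper, and the rule identifiers in the example are placeholders rather than actual FindBugs or PMD rule names.

import java.util.*;

// Illustrative, hypothetical history/location-based alert ranker.
public class AlertRanker {

    // A static-analysis alert reported for a given rule at a given location.
    static final class Alert {
        final String rule;      // rule identifier (placeholder names below)
        final String location;  // e.g. package or class where the alert was raised
        Alert(String rule, String location) {
            this.rule = rule;
            this.location = location;
        }
    }

    // Historical counts {fixed, ignored}, aggregated per rule and per location
    // from past versions of the system.
    private final Map<String, int[]> ruleHistory = new HashMap<>();
    private final Map<String, int[]> locationHistory = new HashMap<>();

    // Record the outcome of a past alert (wasFixed = it was eventually fixed).
    public void recordOutcome(Alert alert, boolean wasFixed) {
        int idx = wasFixed ? 0 : 1;
        ruleHistory.computeIfAbsent(alert.rule, k -> new int[2])[idx]++;
        locationHistory.computeIfAbsent(alert.location, k -> new int[2])[idx]++;
    }

    // Laplace-smoothed ratio of fixed alerts; 0.5 when there is no history.
    private static double fixedRatio(int[] counts) {
        if (counts == null) return 0.5;
        return (counts[0] + 1.0) / (counts[0] + counts[1] + 2.0);
    }

    // Score a new alert as the average of its rule-based and location-based ratios.
    public double score(Alert alert) {
        double byRule = fixedRatio(ruleHistory.get(alert.rule));
        double byLocation = fixedRatio(locationHistory.get(alert.location));
        return (byRule + byLocation) / 2.0;
    }

    // Rank alerts so that those most likely to be "true" come first.
    public List<Alert> rank(List<Alert> alerts) {
        List<Alert> sorted = new ArrayList<>(alerts);
        sorted.sort((a, b) -> Double.compare(score(b), score(a)));
        return sorted;
    }

    public static void main(String[] args) {
        AlertRanker ranker = new AlertRanker();
        // Hypothetical history: "NullDereference" alerts in package "core" were fixed,
        // while "UnusedImport" alerts in "tests" were ignored.
        ranker.recordOutcome(new Alert("NullDereference", "core"), true);
        ranker.recordOutcome(new Alert("NullDereference", "core"), true);
        ranker.recordOutcome(new Alert("UnusedImport", "tests"), false);

        List<Alert> alerts = Arrays.asList(
            new Alert("NullDereference", "core"),
            new Alert("UnusedImport", "tests"));
        for (Alert a : ranker.rank(alerts)) {
            System.out.printf("%.2f  %s (%s)%n", ranker.score(a), a.rule, a.location);
        }
    }
}

In such a scheme, alerts for rules and locations with a good track record of fixes rise to the top of the list, reflecting the intuition behind the paper's finding that past developer behavior and alert location are strong predictors of whether a new alert will be acted upon.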
