Understanding static code warnings: An incremental AI approach

Knowledge-based systems reason over some knowledge base. Hence, an important issue for such systems is how to acquire the knowledge needed for their inference. This paper assesses active learning methods for acquiring knowledge about "static code warnings". Static code analysis is a widely used method for detecting bugs and security vulnerabilities in software systems. As software grows more complex, analysis tools report increasingly long lists of warnings that developers must address daily. Such static analysis tools are usually over-cautious; i.e., they raise many warnings about spurious issues. Prior research shows that roughly 35% to 91% of the warnings reported as bugs by static analysis tools are actually unactionable (i.e., warnings that developers would not act on because they are falsely flagged as bugs). Experienced developers know which warnings are important and which can be safely ignored. How can we capture that experience? This paper reports on an incremental AI tool that watches humans reading false-alarm reports. Using an incremental support vector machine mechanism, this tool quickly learns to distinguish spurious false alarms from more serious matters that deserve further attention. In this work, nine open-source projects are used to evaluate the proposed model on features extracted by previous researchers, and actionable warnings are identified in the priority order given by our algorithm. We observe that our model can identify over 90% of actionable warnings while telling humans they can safely ignore 70% to 80% of the warnings.
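The human-in-the-loop procedure described above (an incremental SVM that learns from a reviewer's labels and then prioritizes the remaining warnings) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's actual implementation: it uses scikit-learn's SGDClassifier with hinge loss as the incremental linear SVM, and the synthetic data, the `oracle_label` reviewer stub, and the review budget are all hypothetical.

```python
# Minimal sketch of incremental-SVM active learning over static warnings.
# All data and names here are hypothetical stand-ins for the paper's setup.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
warnings_X = rng.normal(size=(500, 8))              # feature vectors for 500 warnings (synthetic)
true_labels = (warnings_X[:, 0] > 0.5).astype(int)  # stand-in ground truth: 1 = actionable

def oracle_label(i):
    """Stand-in for the human reviewer who inspects warning i."""
    return true_labels[i]

svm = SGDClassifier(loss="hinge", random_state=0)   # hinge loss = incremental linear SVM
labeled, unlabeled = [], list(range(len(warnings_X)))

# Seed with a few random reviews until both classes have been seen.
while len(set(true_labels[labeled])) < 2:
    i = unlabeled.pop(int(rng.integers(len(unlabeled))))
    labeled.append(i)
svm.partial_fit(warnings_X[labeled], true_labels[labeled], classes=[0, 1])

budget = 100  # the human only reads this many warnings in total
while len(labeled) < budget:
    # Certainty sampling: show the reviewer the warning the current model
    # scores as most likely actionable, then update on the new label.
    scores = svm.decision_function(warnings_X[unlabeled])
    i = unlabeled.pop(int(np.argmax(scores)))
    labeled.append(i)
    svm.partial_fit(warnings_X[[i]], [oracle_label(i)])

# Remaining warnings, ranked so the most plausibly actionable come first;
# everything deep in the tail is what the tool tells humans to ignore.
ranking = sorted(unlabeled, key=lambda i: -svm.decision_function(warnings_X[[i]])[0])
```

Certainty sampling (querying the highest-scored candidate) fits the stated goal of surfacing actionable warnings early; uncertainty sampling (querying near the decision boundary) is the other common active-learning choice and could be swapped in by selecting the smallest absolute decision value instead.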
