Paper Information - Mining Fix Patterns for FindBugs Violations

Several static analysis tools, such as Splint or FindBugs, have been proposed to the software development community to help detect security vulnerabilities or bad programming practices. However, the adoption of these tools is hindered by their high false positive rates. If the false positive rate is too high, developers may get acclimated to violation reports from these tools, causing concrete and severe bugs to be overlooked. Fortunately, some violations are actually addressed and resolved by developers. We claim that violations that are recurrently fixed are likely to be true positives, and that an automated approach can learn to repair similar unseen violations. However, two things are lacking: a systematic way to investigate the distributions of existing and fixed violations in the wild, which could provide insights into prioritizing violations for developers, and an effective way to mine code and fix patterns, which could help developers easily understand what causes violations and how to fix them. In this paper, we first collect and track a large number of fixed and unfixed violations across revisions of software. The empirical analyses reveal discrepancies between the distributions of violations that are detected and those that are fixed, in terms of occurrences, spread, and categories, which can provide insights into prioritizing violations. To automatically identify patterns in violations and their fixes, we propose an approach that uses convolutional neural networks to learn features and clustering to regroup similar instances. We then evaluate the usefulness of the identified fix patterns by applying them to unfixed violations. The results show that developers accept and merge a majority (69/116) of the fixes generated from the inferred fix patterns. It is also noteworthy that the yielded patterns are applicable to four real bugs in Defects4J, a major benchmark for software testing and automated repair.
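One way to picture the discrepancy between detected and fixed violations is to compare, per violation type, how often it is reported and how often it ends up fixed. The sketch below is only an illustration under stated assumptions: the function name `fix_ratios` is hypothetical, the counts are toy numbers rather than data from the paper, and ranking by fix ratio is one simple prioritization heuristic, not the authors' analysis.

```python
from collections import Counter


def fix_ratios(detected, fixed):
    """Rank violation types by the share of detected instances that were fixed.

    `detected` and `fixed` are lists of violation type names, one entry per
    violation instance. Returns (type, n_detected, n_fixed, ratio) tuples,
    sorted by ratio in descending order (ties broken by type name).
    """
    detected_counts = Counter(detected)
    fixed_counts = Counter(fixed)
    rows = []
    for vtype, n_detected in detected_counts.items():
        n_fixed = fixed_counts.get(vtype, 0)
        rows.append((vtype, n_detected, n_fixed, n_fixed / n_detected))
    rows.sort(key=lambda row: (-row[3], row[0]))
    return rows


# Toy counts, made up for illustration only (not taken from the paper).
detected = (
    ["SE_NO_SERIALVERSIONID"] * 10
    + ["NP_NULL_ON_SOME_PATH"] * 4
    + ["DM_CONVERT_CASE"] * 5
)
fixed = (
    ["SE_NO_SERIALVERSIONID"] * 1
    + ["NP_NULL_ON_SOME_PATH"] * 3
    + ["DM_CONVERT_CASE"] * 1
)

ranking = fix_ratios(detected, fixed)
for vtype, n_detected, n_fixed, ratio in ranking:
    print(f"{vtype}: {n_fixed}/{n_detected} fixed ({ratio:.0%})")
```

In this toy input the most frequently detected type is the least frequently fixed, which is the kind of mismatch that would argue against prioritizing violations by raw occurrence count alone.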
