Detection of recurring software vulnerabilities

Software security vulnerabilities are discovered on an almost daily basis and have caused substantial damage. Aiming at supporting early detection and resolution for them, we have conducted an empirical study on thousands of vulnerabilities and found that many of them are recurring due to software reuse. Based on the knowledge gained from the study, we developed SecureSync, an automatic tool to detect recurring software vulnerabilities on the systems that reuse source code or libraries. The core of SecureSync includes two techniques to represent and compute the similarity of vulnerable code across different systems. The evaluation for 60 vulnerabilities on 176 releases of 119 open-source software systems shows that SecureSync is able to detect recurring vulnerabilities with high accuracy and to identify 90 releases having potentially vulnerable code that are not reported or fixed yet, even in mature systems. A couple of cases were actually confirmed by their developers.

[1]  Gio Wiederhold,et al.  Knowledge bases , 1985, Future Gener. Comput. Syst..

[2]  Michael Gegick,et al.  Prioritizing software security fortification throughcode-level metrics , 2008, QoP '08.

[3]  Elaine J. Weyuker,et al.  Predicting the location and number of faults in large software systems , 2005, IEEE Transactions on Software Engineering.

[4]  Daniela E. Damian,et al.  Predicting build failures using social network analysis on developer communication , 2009, 2009 IEEE 31st International Conference on Software Engineering.

[5]  Hoan Anh Nguyen,et al.  Graph-based mining of multiple object usage patterns , 2009, ESEC/FSE '09.

[6]  Anh Tuan Nguyen,et al.  Detecting recurring and similar software vulnerabilities , 2010, 2010 ACM/IEEE 32nd International Conference on Software Engineering.

[7]  Barry W. Boehm,et al.  Software Engineering Economics , 1993, IEEE Transactions on Software Engineering.

[8]  Sunghun Kim,et al.  Memories of bug fixes , 2006, SIGSOFT '06/FSE-14.

[9]  Lionel C. Briand,et al.  Predicting fault-prone components in a java legacy system , 2006, ISESE '06.

[10]  Jiong Yang,et al.  Discovering Neglected Conditions in Software by Mining Dependence Graphs , 2008, IEEE Transactions on Software Engineering.

[11]  Indrajit Ray,et al.  Measuring, analyzing and predicting security vulnerabilities in software systems , 2007, Comput. Secur..

[12]  Benjamin Livshits,et al.  DynaMine: finding common error patterns by mining software revision histories , 2005, ESEC/FSE-13.

[13]  Tao Xie,et al.  Mining exception-handling rules as sequence association rules , 2009, 2009 IEEE 31st International Conference on Software Engineering.

[14]  Thomas Zimmermann,et al.  The Beauty and the Beast: Vulnerabilities in Red Hat's Packages , 2009, USENIX Annual Technical Conference.

[15]  Janice Singer,et al.  Hipikat: a project memory for software development , 2005, IEEE Transactions on Software Engineering.

[16]  Andreas Zeller,et al.  Predicting faults from cached history , 2008, ISEC '08.

[17]  Chadd C. Williams,et al.  Automatic mining of source code repositories to improve bug finding techniques , 2005, IEEE Transactions on Software Engineering.

[18]  Richard C. Holt,et al.  The top ten list: dynamic fault prediction , 2005, 21st IEEE International Conference on Software Maintenance (ICSM'05).

[19]  Carl K. Chang,et al.  Incremental Latent Semantic Indexing for Automatic Traceability Link Evolution Management , 2008, 2008 23rd IEEE/ACM International Conference on Automated Software Engineering.

[20]  Premkumar T. Devanbu,et al.  Latent social structure in open source projects , 2008, SIGSOFT '08/FSE-16.

[21]  N. Nagappan,et al.  Use of relative code churn measures to predict system defect density , 2005, Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005..

[22]  Andreas Zeller,et al.  Detecting object usage anomalies , 2007, ESEC-FSE '07.

[23]  Hoan Anh Nguyen,et al.  Complete and accurate clone detection in graph-based models , 2009, 2009 IEEE 31st International Conference on Software Engineering.

[24]  Zhendong Su,et al.  Context-based detection of clone-related bugs , 2007, ESEC-FSE '07.

[25]  Yuanyuan Zhou,et al.  CP-Miner: finding copy-paste and related bugs in large-scale software code , 2006, IEEE Transactions on Software Engineering.

[26]  Andreas Zeller,et al.  HATARI: raising risk awareness , 2005, ESEC/FSE-13.

[27]  Hoan Anh Nguyen,et al.  Recurring bug fixes in object-oriented programs , 2010, 2010 ACM/IEEE 32nd International Conference on Software Engineering.

[28]  Hoan Anh Nguyen,et al.  Accurate and Efficient Structural Characteristic Feature Extraction for Clone Detection , 2009, FASE.

[29]  David Hovemeyer,et al.  Finding bugs is easy , 2004, SIGP.

[30]  Hoan Anh Nguyen,et al.  Operation-Based, Fine-Grained Version Control Model for Tree-Based Representation , 2010, FASE.

[31]  Brendan Murphy,et al.  Can developer-module networks predict failures? , 2008, SIGSOFT '08/FSE-16.

[32]  Martin P. Robillard,et al.  Tracking Code Clones in Evolving Software , 2007, 29th International Conference on Software Engineering (ICSE'07).

[33]  Witold Pedrycz,et al.  A comparative analysis of the efficiency of change metrics and static code attributes for defect prediction , 2008, 2008 ACM/IEEE 30th International Conference on Software Engineering.

[34]  Tim Menzies,et al.  Data Mining Static Code Attributes to Learn Defect Predictors , 2007, IEEE Transactions on Software Engineering.

[35]  Flaviu Ghitulescu,et al.  Google Code Search , 2006 .

[36]  Omar H. Alhazmi,et al.  Quantitative vulnerability assessment of systems software , 2005, Annual Reliability and Maintainability Symposium, 2005. Proceedings..

[37]  Andy Podgurski,et al.  Automated Support for Propagating Bug Fixes , 2008, 2008 19th International Symposium on Software Reliability Engineering (ISSRE).

[38]  Qinbao Song,et al.  Software defect association mining and defect correction effort prediction , 2006, IEEE Transactions on Software Engineering.

[39]  Jian Pei,et al.  Mining API patterns as partial orders from source code: from usage scenarios to specifications , 2007, ESEC-FSE '07.