Clone-hunter: accelerated bound checks elimination via binary code clone detection

Unsafe pointer usage and illegitimate memory accesses are prevalent bugs in software. To ensure memory safety, array bound checks are inserted into the code to detect out-of-bound memory accesses. Unfortunately, these bound checks incur high runtime overhead, so redundant array bound checks should be removed to improve application performance. In this paper, we propose Clone-Hunter, a practical and scalable framework for redundant bound check elimination in binary executables. Clone-Hunter first applies binary code clone detection, and then employs a bound safety verification mechanism (using binary symbolic execution) to ensure sound removal of redundant bound checks. Our results show that Clone-Hunter can identify redundant bound checks about 90× faster than pure binary symbolic execution, while ensuring zero false positives.
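The two-stage idea described above (cheap clone detection to find candidates, followed by verification before any check is removed) can be illustrated with a minimal Python sketch. It is not the Clone-Hunter implementation: the region format, the opcode-only signature, the function names, and the interval test standing in for binary symbolic execution are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy regions: each one guards `load base+i+off` with a bound
# check, where i ranges over [lo, hi) and the buffer holds `size` elements.
REGIONS = {
    "copy_a": {"instrs": ["cmp i, 16", "jae fail", "load buf+i"],
               "lo": 0, "hi": 16, "off": 0, "size": 16},
    "copy_b": {"instrs": ["cmp j, 32", "jae fail", "load tmp+j"],
               "lo": 0, "hi": 32, "off": 0, "size": 32},
    "copy_c": {"instrs": ["cmp k, 40", "jae fail", "load tmp+k"],
               "lo": 0, "hi": 40, "off": 0, "size": 32},
    "sum":    {"instrs": ["add r1, r2", "ret"],
               "lo": 0, "hi": 0, "off": 0, "size": 0},
}

def signature(instrs):
    """Cheap clone signature (assumption): opcodes only, operands dropped."""
    return tuple(ins.split()[0] for ins in instrs)

def group_clones(regions):
    """Group regions that share a signature; groups of size >= 2 are clones."""
    groups = defaultdict(list)
    for name, region in regions.items():
        groups[signature(region["instrs"])].append(name)
    return [sorted(names) for names in groups.values() if len(names) >= 2]

def check_is_redundant(region):
    """Stand-in for symbolic verification: the check is redundant only if
    every index the loop can produce already lies inside [0, size)."""
    lo, hi, off, size = (region[k] for k in ("lo", "hi", "off", "size"))
    if lo >= hi:  # the loop body never executes
        return True
    return 0 <= lo + off and (hi - 1) + off < size

def removable_checks(regions):
    """Clone detection proposes candidates; verification decides removal."""
    removable = set()
    for group in group_clones(regions):
        for name in group:
            if check_is_redundant(regions[name]):
                removable.add(name)
    return removable

if __name__ == "__main__":
    print(sorted(removable_checks(REGIONS)))
```

In this toy setup, `copy_c` is a clone of the other two copy loops yet its check is kept, because the verification step finds indices beyond the buffer size. Clone similarity alone never removes a check in the sketch, which mirrors why a verification stage is needed for sound removal.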
