Exniffer: Learning to Prioritize Crashes by Assessing the Exploitability from Memory Dump

An important component of software reliability is the assurance of certain security guarantees, such as the absence of low-level bugs that may lead to code exploitation. A program crash is an early indicator of possible errors in the program, such as memory corruption, access violation, or division by zero. In particular, a crash may indicate the presence of safety-critical or security-critical errors. A safety-error crash does not result in any exploitable condition, whereas a security-error crash allows an attacker to exploit a vulnerability. Distinguishing one from the other, however, is a non-trivial task. The difficulty grows when hundreds of crashes are reported and programmers must choose which crash to patch first. In this work, we present a technique to identify security-critical crashes by applying machine learning to a set of features derived from core-dump files and from runtime information obtained through hardware-assisted monitoring, such as the last branch record (LBR) register. We implement the proposed technique in a prototype called Exniffer. Our empirical results, obtained by evaluating Exniffer on several crashes in real-world applications, show that the proposed technique is able to classify a given crash as exploitable or not exploitable with high accuracy.
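To make the idea of learning-based crash prioritization concrete, the following is a minimal sketch and not the Exniffer implementation. It assumes each crash has already been reduced to a few boolean features; the feature names, the toy training set, and the choice of a small logistic-regression learner are all illustrative assumptions, since the abstract does not specify the actual feature set or classifier.

```python
import math

# Hypothetical boolean features; the names are illustrative assumptions,
# not the feature set used by Exniffer.
FEATURES = (
    "pc_in_writable_memory",   # crash address lies in a writable region
    "fault_on_write",          # faulting instruction was a memory write
    "stack_pointer_corrupt",   # stack pointer outside the mapped stack
    "null_deref",              # faulting address is near zero
)


def to_vector(crash):
    """Turn a crash record (a dict) into a 0/1 feature vector."""
    return [1.0 if crash.get(name) else 0.0 for name in FEATURES]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, epochs=500, lr=0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(FEATURES)
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
            err = p - y
            weights = [w - lr * err * v for w, v in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def score(model, crash):
    """Estimated probability that the crash is exploitable."""
    weights, bias = model
    x = to_vector(crash)
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


def prioritize(model, crashes):
    """Order crashes so the most likely exploitable ones come first."""
    return sorted(crashes, key=lambda c: score(model, c), reverse=True)


# Toy labelled crashes (1 = exploitable, 0 = not exploitable); made up
# purely so the sketch runs, not measured data.
TRAINING = [
    ({"pc_in_writable_memory": True}, 1),
    ({"fault_on_write": True, "stack_pointer_corrupt": True}, 1),
    ({"pc_in_writable_memory": True, "fault_on_write": True}, 1),
    ({"null_deref": True}, 0),
    ({"null_deref": True, "fault_on_write": True}, 0),
    ({}, 0),
]

model = train([to_vector(c) for c, _ in TRAINING], [y for _, y in TRAINING])

incoming = [
    {"id": "a", "null_deref": True},
    {"id": "b", "pc_in_writable_memory": True},
    {"id": "c"},
]
ranked = prioritize(model, incoming)
```

Sorting by the classifier's score, rather than only emitting a yes/no label, is one way to turn a classifier into a triage queue: the developer works through `ranked` from the top.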
