DESENSITIZATION: Privacy-Aware and Attack-Preserving Crash Report

Software vendors collect crash reports from end users to assist in the debugging and testing of their products. However, crash reports may contain users' private information, such as names and passwords, which makes users hesitant to share the reports with developers. We need a mechanism that protects users' privacy in crash reports on the client side while keeping sufficient information to support server-side debugging and analysis. In this paper, we propose the DESENSITIZATION technique, which generates privacy-aware and attack-preserving crash reports from crashed executions. Our tool adopts lightweight methods to identify bug-related and attack-related data in memory, and removes all other data to protect users' privacy. Since a large portion of the desensitized memory contains null bytes, we store crash reports in sparse files to save network bandwidth and server-side storage. We prototype DESENSITIZATION and apply it to a large number of crashes of real-world programs, such as browsers and the JavaScript engine. The results show that our DESENSITIZATION technique can eliminate 80.9% of nonzero bytes from coredumps, and 49.0% from minidumps. The desensitized crash report can be 50.5% smaller than the original one, which significantly saves resources for report submission and storage. Our DESENSITIZATION technique is a push-button solution for privacy-aware crash reporting.
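The two storage-side ideas in the abstract (nulling out memory that is not needed, then storing the result sparsely) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `desensitize` and `write_sparse`, the `keep_ranges` parameter, and the 4096-byte block size are all hypothetical, and the sketch assumes the bug-related and attack-related byte ranges have already been identified by some other means. Whether the skipped blocks actually become holes on disk depends on the filesystem.

```python
import os
import tempfile


def desensitize(memory: bytes, keep_ranges):
    """Return a copy of `memory` with every byte outside `keep_ranges` zeroed.

    `keep_ranges` is a list of (start, end) half-open offsets that are
    assumed to cover the data worth keeping (hypothetical input).
    """
    out = bytearray(len(memory))  # starts as all null bytes
    for start, end in keep_ranges:
        out[start:end] = memory[start:end]
    return bytes(out)


def write_sparse(path, data: bytes, block_size: int = 4096):
    """Write `data` to `path`, seeking over all-zero blocks instead of writing them.

    On filesystems that support sparse files, the skipped blocks may be
    stored as holes and take no disk space.
    """
    zero_block = bytes(block_size)
    with open(path, "wb") as f:
        for offset in range(0, len(data), block_size):
            chunk = data[offset:offset + block_size]
            if chunk == zero_block[:len(chunk)]:
                f.seek(len(chunk), os.SEEK_CUR)  # leave a hole
            else:
                f.write(chunk)
        f.truncate(len(data))  # fix the final size if the file ends in a hole


if __name__ == "__main__":
    # Toy "memory image": 16 KiB of non-null bytes, of which 24 bytes are kept.
    memory = bytes([0xAA]) * 16384
    cleaned = desensitize(memory, [(0, 16), (8192, 8200)])
    path = os.path.join(tempfile.mkdtemp(), "report.bin")
    write_sparse(path, cleaned)
    print(sum(b != 0 for b in cleaned), os.path.getsize(path))
```

Reading the file back yields the same bytes as the desensitized image, because holes read as null bytes; only the on-disk footprint may shrink.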
