Enhancing Server Availability and Security Through Failure-Oblivious Computing

We present a new technique, failure-oblivious computing, that enables servers to execute through memory errors without memory corruption. Our safe compiler for C inserts checks that dynamically detect invalid memory accesses. Instead of terminating or throwing an exception, the generated code simply discards invalid writes and manufactures values to return for invalid reads, enabling the server to continue along its normal execution path.

We have applied failure-oblivious computing to a set of widely used servers from the Linux-based open-source computing environment. Our results show that our techniques 1) make these servers invulnerable to known security attacks that exploit memory errors, and 2) enable the servers to continue servicing legitimate requests and satisfying the needs of their users even after attacks trigger their memory errors.

We observed several reasons for this successful continued execution. When the memory errors occur in irrelevant computations, failure-oblivious computing enables the server to execute through the memory errors and continue on to the relevant computation. Even when the memory errors occur in relevant computations, failure-oblivious computing converts requests that trigger unanticipated and dangerous execution paths into anticipated invalid inputs, which the error-handling logic in the server rejects. Servers tend to have short error propagation distances: localized errors in the computation for one request tend to have little or no effect on the computations for subsequent requests. As a result, redirecting reads that would otherwise cause addressing errors and discarding writes that would otherwise corrupt critical data structures (such as the call stack) localizes the effect of the memory errors, prevents addressing exceptions from terminating the computation, and enables the server to continue on to successfully process subsequent requests. The overall result is a substantial extension of the range of requests that the server can successfully process.
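The behavior of the generated code can be illustrated with a minimal hand-written sketch in C. This is not the authors' compiler output: the `checked_buf` type, the `fo_read` and `fo_write` helpers, and the policy of cycling through the small values 0, 1, 2 for manufactured reads are all assumptions made for illustration. The sketch only shows the two rules stated above: an out-of-bounds write is discarded, and an out-of-bounds read returns a manufactured value instead of faulting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical checked buffer: a base pointer plus the number of valid bytes. */
typedef struct {
    unsigned char *base;
    size_t len;
} checked_buf;

/* Assumed policy for manufactured values: cycle through 0, 1, 2. */
unsigned manufactured_counter = 0;

unsigned char manufacture_value(void) {
    unsigned char v = (unsigned char)(manufactured_counter % 3);
    manufactured_counter++;
    return v;
}

/* Failure-oblivious write: in-bounds writes proceed, invalid writes are discarded. */
void fo_write(checked_buf *b, size_t i, unsigned char v) {
    if (i < b->len) {
        b->base[i] = v;
    }
    /* Out of bounds: drop the write and keep executing. */
}

/* Failure-oblivious read: in-bounds reads proceed, invalid reads get a manufactured value. */
unsigned char fo_read(const checked_buf *b, size_t i) {
    if (i < b->len) {
        return b->base[i];
    }
    return manufacture_value();
}

/* Illustration: an 8-byte request copied into a 4-byte buffer.
 * The last four writes are discarded, so the adjacent variable is untouched. */
void fo_demo(void) {
    unsigned char storage[4] = {0, 0, 0, 0};
    unsigned char neighbor = 42; /* stands in for adjacent critical data */
    checked_buf b = { storage, sizeof storage };
    const char *request = "AAAAAAAA";

    for (size_t i = 0; i < 8; i++) {
        fo_write(&b, i, (unsigned char)request[i]);
    }
    assert(fo_read(&b, 3) == 'A');
    assert(neighbor == 42);
}
```

In this sketch the overflowing copy in `fo_demo` completes without corrupting anything outside `storage`, so the caller can go on to validate the (truncated) input and reject it through its ordinary error-handling path.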
