Finding Error-Handling Bugs in Systems Code Using Static Analysis

Run-time errors are unavoidable whenever software interacts with the physical world. Unchecked errors are especially pernicious in operating system file management code. Transient or permanent hardware failures are inevitable, and error-management bugs at the file system layer can cause silent, unrecoverable data corruption. Furthermore, even when developers have the best of intentions, inaccurate documentation can mislead programmers and cause software to fail in unexpected ways. We use static program analysis to understand error handling in large systems and to make it more reliable. We apply our analyses to numerous Linux file systems and drivers, finding hundreds of confirmed error-handling bugs that could lead to serious problems such as system crashes, silent data loss, and corruption.
