Life, death, and the critical transition: finding liveness bugs in systems code

Modern software model checkers find safety violations: breaches where the system enters some bad state. However, we argue that checking liveness properties offers both a richer and more natural way to search for errors, particularly in complex concurrent and distributed systems. Liveness properties specify desirable system behaviors which must be satisfied eventually, but are not always satisfied, perhaps as a result of failure or during system initialization. Existing software model checkers cannot verify liveness because doing so requires finding an infinite execution that does not satisfy a liveness property. We present heuristics to find a large class of liveness violations and the critical transition of the execution. The critical transition is the step in an execution that moves the system from a state that does not currently satisfy some liveness property--but where recovery is possible in the future--to a dead state that can never achieve the liveness property. Our software model checker, MACEMC, isolates complex liveness errors in our implementations of PASTRY, CHORD, a reliable transport protocol, and an overlay tree.

[1]  Patrice Godefroid,et al.  Model checking for programming languages using VeriSoft , 1997, POPL '97.

[2]  Alexander Aiken,et al.  Scalable error detection using boolean satisfiability , 2005, POPL '05.

[3]  Mayur Naik,et al.  From symptom to cause: localizing errors in counterexample traces , 2003, POPL '03.

[4]  Antony I. T. Rowstron,et al.  Pastry: Scalable, Decentralized Object Location, and Routing for Large-Scale Peer-to-Peer Systems , 2001, Middleware.

[5]  Amin Vahdat,et al.  Bullet: high bandwidth data dissemination using an overlay mesh , 2003, SOSP '03.

[6]  Robert Morris,et al.  Chord: A scalable peer-to-peer lookup service for internet applications , 2001, SIGCOMM 2001.

[7]  Thomas A. Henzinger,et al.  MOCHA: Modularity in Model Checking , 1998, CAV.

[8]  Peter Druschel,et al.  Pastry: Scalable, distributed object location and routing for large-scale peer-to- , 2001 .

[9]  Junfeng Yang,et al.  Using model checking to find serious file system errors , 2004, TOCS.

[10]  Gerard J. Holzmann,et al.  The Model Checker SPIN , 1997, IEEE Trans. Software Eng..

[11]  Andreas Podelski,et al.  Termination proofs for systems code , 2006, PLDI '06.

[12]  Daniel Kroening,et al.  A Tool for Checking ANSI-C Programs , 2004, TACAS.

[13]  Kenneth L. McMillan,et al.  A methodology for hardware verification using compositional model checking , 2000, Sci. Comput. Program..

[14]  Alex Groce,et al.  What Went Wrong: Explaining Counterexamples , 2003, SPIN.

[15]  Joseph Sifakis,et al.  Specification and verification of concurrent systems in CESAR , 1982, Symposium on Programming.

[16]  Sriram K. Rajamani,et al.  Boolean Programs: A Model and Process for Software Analysis , 2000 .

[17]  Harry Rudin,et al.  Protocol specification, testing, and verification, III : proceedings of the IFIP WG 6.1 Third International Workshop on Protocol Specification, Testing, and Verification, Rüschlikon, Switzerland, 31 May-2 June, 1983 , 1983 .

[18]  Dawson R. Engler,et al.  Proceedings of the 5th Symposium on Operating Systems Design and Implementation Cmc: a Pragmatic Approach to Model Checking Real Code , 2022 .

[19]  Ben Y. Zhao,et al.  Towards a Common API for Structured Peer-to-Peer Overlays , 2003, IPTPS.

[20]  David A. Wagner,et al.  MOPS: an infrastructure for examining security properties of software , 2002, CCS '02.

[21]  Gerard J. Holzmann,et al.  The SPIN Model Checker , 2003 .

[22]  Wei Lin,et al.  WiDS Checker: Combating Bugs in Distributed Systems , 2007, NSDI.

[23]  Dawson R. Engler,et al.  Model Checking Large Network Protocol Implementations , 2004, NSDI.

[24]  Thomas A. Henzinger,et al.  Lazy abstraction , 2002, POPL '02.

[25]  Gerard J. Holzmann,et al.  Logic Verification of ANSI-C Code with SPIN , 2000, SPIN.

[26]  Ion Stoica,et al.  Friday: Global Comprehension for Distributed Replay , 2007, NSDI.

[27]  Edmund M. Clarke,et al.  Model Checking , 1999, Handbook of Automated Reasoning.

[28]  Klaus Havelund,et al.  Model checking JAVA programs using JAVA PathFinder , 2000, International Journal on Software Tools for Technology Transfer.

[29]  James C. Corbett,et al.  Bandera: extracting finite-state models from Java source code , 2000, ICSE.

[30]  Amin Vahdat,et al.  Mace: language support for building distributed systems , 2007, PLDI '07.

[31]  David R. Karger,et al.  Chord: a scalable peer-to-peer lookup protocol for internet applications , 2003, TNET.

[32]  Amin Vahdat,et al.  Using Random Subsets to Build Scalable Network Services , 2003, USENIX Symposium on Internet Technologies and Systems.

[33]  James C. Corbett,et al.  Bandera: a source-level interface for model checking Java programs , 2000, ICSE '00.

[34]  Ganesh Gopalakrishnan,et al.  Random Walk Based Heuristic Algorithms for Distributed Memory Model Checking , 2003, PDMC@CAV.

[35]  Jayadev Misra,et al.  Assertion Graphs for Verifying and Synthesizing Programs , 1978 .

[36]  Edmund M. Clarke,et al.  Design and Synthesis of Synchronization Skeletons Using Branching-Time Temporal Logic , 1981, Logic of Programs.

[37]  Alan J. Hu,et al.  Protocol verification as a hardware design aid , 1992, Proceedings 1992 IEEE International Conference on Computer Design: VLSI in Computers & Processors.

[38]  Sriram K. Rajamani,et al.  The SLAM project: debugging system software via static analysis , 2002, POPL '02.

[39]  Hassen Saïdi,et al.  Construction of Abstract State Graphs with PVS , 1997, CAV.