Push-Button Verification of File Systems via Crash Refinement

The file system is an essential operating system component for persisting data on storage devices. Writing bug-free file systems is non-trivial, as they must correctly implement and maintain complex on-disk data structures even in the presence of system crashes and reorderings of disk operations. This paper presents Yggdrasil, a toolkit for writing file systems with push-button verification: Yggdrasil requires no manual annotations or proofs about the implementation code, and it produces a counterexample if there is a bug. Yggdrasil achieves this automation through a novel definition of file system correctness called crash refinement, which requires the set of possible disk states produced by an implementation (including states produced by crashes) to be a subset of those allowed by the specification. Crash refinement is amenable to fully automated satisfiability modulo theories (SMT) reasoning, and enables developers to implement file systems in a modular way for verification. With Yggdrasil, we have implemented and verified the Yxv6 journaling file system, the Ycp file copy utility, and the Ylog persistent log. Our experience shows that the ease of proof and counterexample-based debugging support make Yggdrasil practical for building reliable storage applications.
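The subset condition behind crash refinement can be illustrated with a toy brute-force check. The sketch below is not Yggdrasil's API and does not use an SMT solver; it simply enumerates crash states for a hypothetical five-block disk. The block names (`d0`, `d1`, `log0`, `log1`, `commit`), the values written, the flush model (writes between two flushes may reach the disk in any subset), and the recovery procedure are all assumptions made for illustration. The specification is an atomic two-block update, so the only allowed data states are "old" and "new".

```python
from itertools import combinations

def crash_states(initial, epochs):
    """Enumerate disk states reachable if a crash occurs at any point.

    `epochs` is a list of write batches separated by flushes. Writes in one
    batch may be reordered, so a crash can leave any subset of the current
    batch on disk, on top of all earlier (flushed) batches. Assumes each
    block is written at most once per batch.
    """
    states = set()
    durable = dict(initial)
    for batch in epochs:
        for k in range(len(batch) + 1):
            for subset in combinations(batch, k):
                disk = dict(durable)
                disk.update(subset)
                states.add(tuple(sorted(disk.items())))
        durable.update(batch)  # the flush makes the whole batch durable
    states.add(tuple(sorted(durable.items())))
    return states

def recover(disk):
    """Hypothetical recovery: replay the log if the commit block is set."""
    disk = dict(disk)
    if disk["commit"] == 1:
        disk["d0"], disk["d1"] = disk["log0"], disk["log1"]
        disk["commit"] = 0
    return disk

def abstract(disk):
    """Project the implementation disk onto the state the spec talks about."""
    return (disk["d0"], disk["d1"])

def check_crash_refinement(initial, epochs, allowed):
    """Return None if every crash-then-recover state is allowed by the spec,
    otherwise return one offending pre-recovery disk as a counterexample."""
    for frozen in sorted(crash_states(initial, epochs)):
        if abstract(recover(dict(frozen))) not in allowed:
            return dict(frozen)
    return None

INITIAL = {"d0": 0, "d1": 0, "log0": 0, "log1": 0, "commit": 0}
ALLOWED = {(0, 0), (7, 9)}  # spec: the update is all-or-nothing

# Writes both data blocks in place with no log.
naive = [[("d0", 7), ("d1", 9)]]

# Journaled update: log, flush, commit, flush, apply, flush, clear commit.
journaled = [
    [("log0", 7), ("log1", 9)],
    [("commit", 1)],
    [("d0", 7), ("d1", 9)],
    [("commit", 0)],
]

# Same as above but missing the flush between the log and the commit record.
missing_flush = [
    [("log0", 7), ("log1", 9), ("commit", 1)],
    [("d0", 7), ("d1", 9)],
    [("commit", 0)],
]

if __name__ == "__main__":
    print("naive:", check_crash_refinement(INITIAL, naive, ALLOWED))
    print("journaled:", check_crash_refinement(INITIAL, journaled, ALLOWED))
    print("missing flush:", check_crash_refinement(INITIAL, missing_flush, ALLOWED))
```

In this model the journaled sequence passes the check, while the naive and missing-flush variants each yield a counterexample disk, mirroring the counterexample-based debugging the abstract describes. A real verifier would reason symbolically over all inputs rather than enumerate one concrete update.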
