Verifying a high-performance crash-safe file system using a tree specification

DFSCQ is the first file system that (1) provides a precise specification for fsync and fdatasync, which allow applications to achieve high performance and crash safety, and (2) provides a machine-checked proof that its implementation meets this specification. DFSCQ's specification captures the behavior of sophisticated optimizations, including log-bypass writes, and DFSCQ's proof rules out some of the common bugs in file-system implementations despite the complex optimizations. The key challenge in building DFSCQ is to write a specification for the file system and its internal implementation without exposing internal file-system details. DFSCQ introduces a metadata-prefix specification that captures the properties of fsync and fdatasync, which roughly follows the behavior of Linux ext4. This specification uses a notion of tree sequences (logical sequences of file-system tree states) to succinctly describe the possible states after a crash and how data writes can be reordered with respect to metadata updates. This helps application developers prove the crash safety of their own applications, avoiding application-level bugs such as forgetting to invoke fsync on both the file and the containing directory. An evaluation shows that DFSCQ achieves 103 MB/s on large file writes to an SSD and durably creates small files at a rate of 1,618 files per second. This is slower than Linux ext4 (which achieves 295 MB/s for large file writes and 4,977 files/s for small file creation) but much faster than two recent verified file systems, Yggdrasil and FSCQ. Evaluation results from application-level benchmarks, including TPC-C on SQLite, mirror these microbenchmarks.
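The application-level bug mentioned above (calling fsync on a new file but not on its containing directory) can be illustrated with a minimal sketch. It uses only generic POSIX calls exposed by Python's os module and is not DFSCQ's interface; the helper name durable_create is hypothetical, and the sketch assumes a POSIX system where a directory can be opened read-only and passed to fsync.

```python
import os
import tempfile


def durable_create(dirpath: str, name: str, data: bytes) -> str:
    """Create dirpath/name holding data, then sync the file and its directory.

    Hypothetical helper for illustration only; assumes POSIX semantics.
    """
    path = os.path.join(dirpath, name)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Step 1: flush the file's data and metadata.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Step 2: flush the directory so the new directory entry is also synced.
    # Skipping this step is the bug described in the text: after a crash the
    # file's contents may be on disk while its name is missing.
    dirfd = os.open(dirpath, os.O_RDONLY)
    try:
        os.fsync(dirfd)
    finally:
        os.close(dirfd)
    return path


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    created = durable_create(workdir, "example.txt", b"hello")
    print("created", created)
```

Whether both calls are strictly required depends on the file system and its mount options; a precise fsync specification such as the one the abstract describes is what lets a developer reason about this rather than guess.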
