Toward a verified relational database management system

We report on our experience implementing a lightweight, fully verified relational database management system (RDBMS). The functional specification of RDBMS behavior, the RDBMS implementation, and the proof that the implementation meets the specification are all written and verified in Coq. Our contributions include: (1) a complete specification of the relational algebra in Coq; (2) an efficient realization of that model (B+ trees) implemented with the Ynot extension to Coq; and (3) a set of simple query optimizations proven to respect both semantics and run-time cost. In addition to describing the design and implementation of these artifacts, we highlight the challenges we encountered in formalizing them, including the choice of representation for finite relations of typed tuples and the difficulty of reasoning about data structures with complex sharing. Our experience shows that, although many challenges remain, building fully verified systems software in Coq is within reach.
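As an informal illustration of what a semantics-preserving query optimization means, the following minimal Python sketch models a finite relation as a frozenset of tuples and checks by execution that fusing two nested selections returns the same relation. It is not the Coq development described above: the tuple encoding, the function names (`select`, `project`, `cross`, `fuse_selects`), and the sample data are all assumptions made for this example, and running a check on one input is far weaker than a machine-checked proof.

```python
from itertools import product as cartesian

# Hypothetical encoding: a relation is a frozenset of equal-length tuples.
# Attribute types are not enforced here, unlike in a typed specification.

def select(pred, rel):
    """Keep only the tuples that satisfy pred."""
    return frozenset(t for t in rel if pred(t))

def project(cols, rel):
    """Keep only the listed column positions; duplicates collapse."""
    return frozenset(tuple(t[i] for i in cols) for t in rel)

def cross(r, s):
    """Cartesian product, concatenating each pair of tuples."""
    return frozenset(a + b for a, b in cartesian(r, s))

def fuse_selects(p, q):
    """Rewrite select(q, select(p, r)) into a single selection predicate."""
    return lambda t: p(t) and q(t)

# Illustrative data: (name, age, department).
employees = frozenset({
    ("ann", 31, "eng"),
    ("bob", 45, "ops"),
    ("cy", 28, "eng"),
})

def is_eng(t):
    return t[2] == "eng"

def over_30(t):
    return t[1] > 30

# Unoptimized plan: two passes over the data.
naive = select(is_eng, select(over_30, employees))
# Optimized plan: one pass with the fused predicate.
fused = select(fuse_selects(over_30, is_eng), employees)
```

On this input the two plans agree (`naive == fused`). A verified system would instead prove the equivalence once, for all relations and all predicates.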
