TRX: A Formally Verified Parser Interpreter

Parsing is an important problem in computer science, yet surprisingly little attention has been devoted to its formal verification. In this paper, we present TRX: a parser interpreter formally developed in the proof assistant Coq, capable of producing formally correct parsers. We use parsing expression grammars (PEGs), a formalism essentially representing recursive descent parsing, which we consider an attractive alternative to context-free grammars (CFGs). From this formalization we can extract a parser for an arbitrary PEG with a guarantee of total correctness, i.e., the resulting parser is terminating and correct with respect to its grammar and the semantics of PEGs; both properties are formally proven in Coq.
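To make the idea of a "parser interpreter" for PEGs concrete, the following is a minimal, unverified sketch in Python. It is not the TRX development and is not extracted from Coq. The names `Lit`, `Seq`, `Choice`, `Star`, `Not`, `Ref`, and `parse` are hypothetical and chosen for illustration only. The sketch shows the defining feature of PEGs, ordered choice: the second alternative is tried only if the first one fails.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Union


# Hypothetical PEG expression types (illustrative, not TRX's definitions).
@dataclass(frozen=True)
class Lit:
    text: str            # match a literal string


@dataclass(frozen=True)
class Seq:
    first: "Expr"        # match first, then second
    second: "Expr"


@dataclass(frozen=True)
class Choice:
    first: "Expr"        # ordered choice: second is tried only if first fails
    second: "Expr"


@dataclass(frozen=True)
class Star:
    body: "Expr"         # greedy repetition, zero or more times


@dataclass(frozen=True)
class Not:
    body: "Expr"         # negative lookahead, consumes no input


@dataclass(frozen=True)
class Ref:
    name: str            # reference to a nonterminal of the grammar


Expr = Union[Lit, Seq, Choice, Star, Not, Ref]
Grammar = Dict[str, Expr]


def parse(g: Grammar, e: Expr, s: str, pos: int) -> Optional[int]:
    """Interpret expression e on s from pos.

    Returns the position after the matched prefix, or None on failure.
    """
    if isinstance(e, Lit):
        return pos + len(e.text) if s.startswith(e.text, pos) else None
    if isinstance(e, Seq):
        mid = parse(g, e.first, s, pos)
        return None if mid is None else parse(g, e.second, s, mid)
    if isinstance(e, Choice):
        result = parse(g, e.first, s, pos)
        return result if result is not None else parse(g, e.second, s, pos)
    if isinstance(e, Star):
        while True:
            nxt = parse(g, e.body, s, pos)
            # Assumption of this sketch: stop when the body fails or
            # consumes nothing, so that the loop always terminates.
            if nxt is None or nxt == pos:
                return pos
            pos = nxt
    if isinstance(e, Not):
        return pos if parse(g, e.body, s, pos) is None else None
    if isinstance(e, Ref):
        return parse(g, g[e.name], s, pos)
    raise TypeError(f"unknown expression: {e!r}")


# Balanced parentheses: S <- "(" S ")" S / ""
BALANCED: Grammar = {
    "S": Choice(
        Seq(Lit("("), Seq(Ref("S"), Seq(Lit(")"), Ref("S")))),
        Lit(""),
    )
}

full_match = parse(BALANCED, Ref("S"), "(())()", 0)
partial_match = parse(BALANCED, Ref("S"), "(()", 0)
ordered_choice = parse({}, Choice(Lit("a"), Lit("ab")), "ab", 0)
```

Unlike a verified interpreter, this sketch carries no termination guarantee: a left-recursive rule such as `S <- S "a"` would recurse without bound. The total correctness described in the abstract is exactly what such an unverified implementation lacks.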
