Parse Table Composition

Module systems, separate compilation, deployment of binary components, and dynamic linking have enjoyed wide acceptance in programming languages and systems. In contrast, the syntax of a language is usually defined in a non-modular way: it cannot be compiled separately, cannot easily be combined with the syntax of other languages, and cannot be deployed as a component for later composition. Grammar formalisms that do support modules rely on whole-program compilation. Current extensible compilers focus on source-level extensibility, which requires users to compile the compiler with a specific configuration of extensions, so a compound parser has to be generated for every combination of extensions. Generating parse tables is expensive, which is a particular problem when the composition configuration is left open so that users can choose which language extensions to enable. In this paper we introduce an algorithm for parse table composition that supports separate compilation of grammars to parse table components. Parse table components can be composed (linked) efficiently at runtime, that is, just before parsing. While the worst-case time complexity of parse table composition is exponential (like the complexity of parse table generation itself), for realistic language combination scenarios involving grammars for real languages, our parse table composition algorithm is an order of magnitude faster than computing the parse table for the combined grammars.
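To make the cost of generating one compound parser per combination concrete: assuming each of n optional extensions can be enabled independently, there are 2^n possible configurations. The Python sketch below only enumerates those configurations; the extension names are hypothetical, and it does not implement the composition algorithm introduced in the paper.

```python
from itertools import combinations


def extension_configurations(extensions):
    """Yield every subset of the given extensions, including the empty one."""
    for size in range(len(extensions) + 1):
        for subset in combinations(extensions, size):
            yield subset


# Hypothetical extension names, used only for illustration.
extensions = ["sql", "xpath", "regex"]
configs = list(extension_configurations(extensions))

# Each configuration would need its own pre-generated parse table
# if tables could not be composed from separately compiled components.
print(len(configs))
```

With separately compiled parse table components, only one component per grammar would need to be generated ahead of time, and the chosen configuration could be linked just before parsing.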
