Generalized Bottom Up Parsers With Reduced Stack Activity

We describe a generalized bottom-up parser in which non-embedded recursive rules are handled directly by the underlying automaton, thus limiting stack activity to the activation of rules displaying embedded recursion. Our strategy is motivated by Aycock and Horspool's approach, but uses a different automaton construction and leads to parsers that are correct for all context-free grammars, including those with hidden left recursion. The automaton features edges that directly connect states containing reduction actions with their associated goto states; hence we call the approach reduction incorporated generalized LR parsing. Our parser constructs shared packed parse forests in a style similar to that of Tomita parsers. We give formal proofs of the correctness of our algorithm, and compare it with Tomita's algorithm in terms of the space and time requirements of the running parsers and the size of the parsers' tables. Experimental results are given for standard grammars for ANSI-C and ISO-Pascal, for a non-deterministic grammar for IBM VS-COBOL, and for a small grammar that triggers asymptotic worst-case behaviour in our parser.
