A Variant of Earley Parsing

The Earley algorithm is a widely used parsing method in natural language processing applications. We introduce a variant of Earley parsing based on “delayed” recognition of constituents: recognition of a constituent starts only once all of its subconstituents have been found in the input string. This is particularly advantageous when the partial analysis of a constituent cannot be completed and, more generally, whenever productions share some suffix of their right-hand sides (even with different left-hand-side nonterminals). Although the two algorithms have the same asymptotic time and space complexity, in practice our algorithm reduces the time and space requirements of the original method, as shown by the reported experimental results.
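For orientation, the baseline method that the variant modifies can be sketched as follows. This is a minimal textbook Earley recognizer (predictor, scanner, completer), not the delayed-recognition variant introduced here. The names `Item`, `earley_recognize`, and `GRAMMAR` are illustrative assumptions, and the sketch assumes a grammar without empty right-hand sides. The toy grammar has two productions sharing the right-hand-side suffix `c d`, the situation the abstract singles out.

```python
from collections import namedtuple

# An Earley item: production lhs -> rhs, dot position, and origin index.
Item = namedtuple("Item", "lhs rhs dot origin")


def earley_recognize(grammar, start, tokens):
    """Return True if `tokens` can be derived from `start`.

    `grammar` maps each nonterminal to a list of right-hand sides (tuples).
    Any symbol that is not a key of `grammar` is treated as a terminal.
    Assumption: no empty right-hand sides, which keeps the completer simple.
    """
    n = len(tokens)
    chart = [[] for _ in range(n + 1)]   # chart[i]: items ending at position i
    seen = [set() for _ in range(n + 1)]

    def add(i, item):
        if item not in seen[i]:
            seen[i].add(item)
            chart[i].append(item)

    for rhs in grammar[start]:
        add(0, Item(start, rhs, 0, 0))

    for i in range(n + 1):
        k = 0
        while k < len(chart[i]):          # chart[i] may grow while we scan it
            item = chart[i][k]
            k += 1
            if item.dot < len(item.rhs):
                sym = item.rhs[item.dot]
                if sym in grammar:
                    # Predictor: expand the nonterminal after the dot.
                    for rhs in grammar[sym]:
                        add(i, Item(sym, rhs, 0, i))
                elif i < n and tokens[i] == sym:
                    # Scanner: consume a matching terminal.
                    add(i + 1, item._replace(dot=item.dot + 1))
            else:
                # Completer: advance every item that was waiting for item.lhs.
                for parent in chart[item.origin]:
                    if (parent.dot < len(parent.rhs)
                            and parent.rhs[parent.dot] == item.lhs):
                        add(i, parent._replace(dot=parent.dot + 1))

    return any(it.lhs == start and it.dot == len(it.rhs) and it.origin == 0
               for it in chart[n])


# Hypothetical toy grammar: A and B share the right-hand-side suffix "c d".
GRAMMAR = {
    "S": [("A",), ("B",)],
    "A": [("a", "c", "d")],
    "B": [("b", "c", "d")],
}

accepted = earley_recognize(GRAMMAR, "S", ["b", "c", "d"])
rejected = earley_recognize(GRAMMAR, "S", ["a", "c"])
```

In this baseline, the items for `A` and `B` are predicted left to right and tracked separately even though both end in `c d`; the abstract's claim is that delaying recognition helps in exactly such shared-suffix cases.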
