Parsing Linear Context-Free Rewriting Systems with Fast Matrix Multiplication

We describe a recognition algorithm for a subset of binary linear context-free rewriting systems (LCFRS) with running time $O(n^{\omega d})$, where $M(m) = O(m^{\omega})$ is the running time for $m \times m$ matrix multiplication and $d$ is the “contact rank” of the LCFRS, that is, the maximal number of combination and non-combination points that appear in the grammar rules. We also show that this algorithm can be used as a subroutine to obtain a recognition algorithm for general binary LCFRS with running time $O(n^{\omega d + 1})$. The currently best known $\omega$ is smaller than 2.38. Our result provides another proof for the best known result for parsing mildly context-sensitive formalisms such as combinatory categorial grammars, head grammars, linear indexed grammars, and tree-adjoining grammars, which can be parsed in time $O(n^{4.76})$. It also shows that inversion transduction grammars can be parsed in time $O(n^{5.76})$. In addition, binary LCFRS subsumes many other formalisms and types of grammars, for some of which we also improve the asymptotic complexity of parsing.
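The two numeric bounds above follow from the general formulas by plain arithmetic. The sketch below is only an illustration of that arithmetic, not part of the algorithm itself. The function name `recognition_exponent` is hypothetical, and the contact rank d = 2 is an assumption: it is the value consistent with the stated exponents 4.76 and 5.76 when ω is bounded by 2.38.

```python
def recognition_exponent(omega: float, contact_rank: int, general: bool = False) -> float:
    """Return the exponent of n in the recognition bound.

    omega * d      for the restricted subset of binary LCFRS,
    omega * d + 1  when the restricted algorithm is used as a subroutine
                   for general binary LCFRS.
    """
    if contact_rank < 1:
        raise ValueError("contact rank must be a positive integer")
    exponent = omega * contact_rank
    return exponent + 1 if general else exponent


OMEGA_BOUND = 2.38        # upper bound on omega quoted in the abstract
ASSUMED_CONTACT_RANK = 2  # assumption: inferred from the stated bounds, not given explicitly

# Tree-adjoining grammars and the equivalent formalisms: omega * d
tag_exponent = recognition_exponent(OMEGA_BOUND, ASSUMED_CONTACT_RANK)

# Inversion transduction grammars: omega * d + 1
itg_exponent = recognition_exponent(OMEGA_BOUND, ASSUMED_CONTACT_RANK, general=True)

print(f"TAG-like formalisms: O(n^{tag_exponent:.2f})")
print(f"ITG:                 O(n^{itg_exponent:.2f})")
```

For comparison, plugging in ω = 3 (schoolbook matrix multiplication) with the same assumed contact rank gives exponents 6 and 7, so any gain over those figures comes entirely from fast matrix multiplication.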
