Fast context-free grammar parsing requires fast Boolean matrix multiplication

In 1975, Valiant showed that Boolean matrix multiplication can be used for parsing context-free grammars (CFGs), yielding the asymptotically fastest (although not practical) CFG parsing algorithm known. We prove a dual result: any CFG parser with time complexity <i>O</i>(<i>gn</i><sup>3-ε</sup>), where <i>g</i> is the size of the grammar and <i>n</i> is the length of the input string, can be efficiently converted into an algorithm to multiply <i>m</i> × <i>m</i> Boolean matrices in time <i>O</i>(<i>m</i><sup>3-ε/3</sup>). Given that practical, substantially subcubic Boolean matrix multiplication algorithms have been quite difficult to find, we thus explain why there has been little progress in developing practical, substantially subcubic general CFG parsers. In proving this result, we also develop a formalization of the notion of parsing.
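To make the two problems in the abstract concrete, the following is a minimal sketch of their textbook cubic baselines: a naive <i>m</i> × <i>m</i> Boolean matrix product and a CKY-style recognizer for a grammar in Chomsky normal form. It is not Valiant's algorithm and not the reduction proved in the paper. The function names `bool_matmul` and `cky_recognize`, the grammar encoding, and the toy grammar are assumptions made for illustration only.

```python
from typing import Dict, List, Set, Tuple

BoolMatrix = List[List[bool]]
BinaryRule = Tuple[str, str, str]  # (parent, left child, right child)


def bool_matmul(a: BoolMatrix, b: BoolMatrix) -> BoolMatrix:
    """Naive Boolean product of two m x m matrices: m^3 AND/OR steps."""
    m = len(a)
    return [
        [any(a[i][k] and b[k][j] for k in range(m)) for j in range(m)]
        for i in range(m)
    ]


def cky_recognize(
    words: List[str],
    lexical: Dict[str, Set[str]],   # terminal -> nonterminals that rewrite to it
    binary: List[BinaryRule],       # rules of the form parent -> left right
    start: str = "S",
) -> bool:
    """CKY recognition for a CNF grammar.

    The loops over (span, i, k) give the n^3 factor; the loop over
    the binary rules gives a factor linear in the grammar size.
    """
    n = len(words)
    if n == 0:
        return False
    # chart[i][j] holds the nonterminals that derive words[i:j].
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1] = set(lexical.get(w, ()))
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):          # split point
                for parent, left, right in binary:
                    if left in chart[i][k] and right in chart[k][j]:
                        chart[i][j].add(parent)
    return start in chart[0][n]


# Hypothetical toy grammar for { a^n b^n : n >= 1 } in CNF:
#   S -> A B | A X,  X -> S B,  A -> a,  B -> b
LEXICAL = {"a": {"A"}, "b": {"B"}}
BINARY = [("S", "A", "B"), ("S", "A", "X"), ("X", "S", "B")]
```

For instance, `cky_recognize(list("aabb"), LEXICAL, BINARY)` accepts, while the same call on `list("aab")` rejects. The innermost chart update, which combines entries for adjacent substrings over every split point, has the same AND/OR shape as one entry of a Boolean matrix product, which is the informal intuition behind relating the two problems.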
