Complexity in Left-Associative Grammar

This paper presents a mathematical definition of left-associative grammar (LA-grammar) and describes its formal properties. Conceptually, LA-grammar is based on the notion of possible continuations, in contrast to more traditional systems such as phrase structure grammar and categorial grammar, which are linguistically motivated in terms of possible substitutions. It is shown that LA-grammar generates all and only the recursive languages. The Chomsky hierarchy of regular, context-free, and context-sensitive languages is reconstructed in LA-grammar by simulating finite-state automata, pushdown automata, and linear bounded automata, respectively. Using alternative restrictions on LA-grammars, a new language hierarchy of A-LAGs, B-LAGs, and C-LAGs is proposed. The class of C-LAGs is divided into three subclasses representing different degrees of ambiguity and associated computational complexity. The class of C-LAGs without recursive ambiguities (called the C1-LAGs) parses in linear time and includes all deterministic CF-languages, plus CF-languages with non-recursive ambiguities, e.g. $a^n b^n c^m d^m \cup a^n b^m c^m d^n$, plus many context-sensitive languages, such as $a^n b^n c^n$, $a^n b^n c^n d^n e^n$, $\{a^n b^n c^n\}^*$, $a^{2^i}$, $a^k b^m c^{k \cdot m}$, and $a^{i!}$. The class of C-LAGs with recursive “single return” ambiguities (called the C2-LAGs) parses in $n^2$ and includes certain nondeterministic CF-languages such as $WW^R$, plus context-sensitive languages like $WW$, $WWW$, $WWWWW$, and $\{WWW\}^*$. Finally, the class of unrestricted C-LAGs (called the C3-LAGs) parses in exponential time and contains CF-languages like $L_{no}$ and the “hardest context-free language” HCFL, plus context-sensitive languages like the NP-complete Subset Sum and SAT.
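To make the idea of possible continuations concrete, the following minimal Python sketch recognizes $a^k b^k c^k$ (for $k \ge 1$) left-associatively: each step combines the category of the sentence start with the category of the next word and then activates a package of rules that may apply next. The three rules, their packages, and all names (`r1`, `r2`, `r3`, `RULES`, `PACKAGES`, `recognize`) are assumptions chosen for illustration in the style of LA-grammar; they are not quoted from the paper.

```python
from typing import Optional, Tuple

Category = Tuple[str, ...]

def r1(ss: Category, nw: Category) -> Optional[Category]:
    # (X) (a) => (a X): record one more 'a' in the sentence-start category
    return ("a",) + ss if nw == ("a",) else None

def r2(ss: Category, nw: Category) -> Optional[Category]:
    # (a X) (b) => (X b): cancel one 'a' and record one 'b'
    if nw == ("b",) and ss and ss[0] == "a":
        return ss[1:] + ("b",)
    return None

def r3(ss: Category, nw: Category) -> Optional[Category]:
    # (b X) (c) => (X): cancel one 'b'
    if nw == ("c",) and ss and ss[0] == "b":
        return ss[1:]
    return None

RULES = {"r1": r1, "r2": r2, "r3": r3}
# Rule package activated after each rule: the possible continuations.
PACKAGES = {"r1": ("r1", "r2"), "r2": ("r2", "r3"), "r3": ("r3",)}
START_PACKAGE = ("r1", "r2")
LEXICON = {"a": ("a",), "b": ("b",), "c": ("c",)}

def recognize(word: str) -> bool:
    """Return True iff word is in a^k b^k c^k with k >= 1 (illustrative sketch)."""
    if not word or word[0] != "a":
        return False
    # Parallel states are (sentence-start category, active rule package) pairs.
    states = {(LEXICON["a"], START_PACKAGE)}
    for ch in word[1:]:
        nw = LEXICON.get(ch)
        if nw is None:
            return False
        next_states = set()
        for ss, package in states:
            for name in package:
                out = RULES[name](ss, nw)
                if out is not None:
                    next_states.add((out, PACKAGES[name]))
        if not next_states:
            return False  # no continuation accepts the next word
        states = next_states
    # Final state: empty category, reached via rule r3.
    return any(ss == () and pkg == PACKAGES["r3"] for ss, pkg in states)
```

In this sketch the rules in any one package have incompatible input conditions, so at most one state survives each step and at most two rule applications are attempted per input symbol. The number of rule applications therefore grows linearly with the input length, which is the behavior the abstract attributes to grammars without recursive ambiguities; the tuple copying is a simplification of the sketch and not part of that argument.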
