Weighted Parsing for Grammar-Based Language Models over Multioperator Monoids

We develop a general framework for weighted parsing that builds on grammar-based language models and uses multioperator monoids as weight algebras. It generalizes previous work in this area (semiring parsing, weighted deductive parsing) and also covers applications outside the classical scope of parsing, e.g., algebraic dynamic programming. We present an algorithm for weighted parsing and, for a large class of weighted grammar-based language models, formally prove that it terminates and is correct.
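To make the starting point concrete, the following is a minimal sketch of classical semiring parsing, one of the special cases the framework generalizes. It is not the algorithm of this paper: it is a plain CYK chart parser for a grammar in Chomsky normal form, parameterized by a semiring, and it does not model multioperator monoids. All names (`Semiring`, `cyk_weight`, the toy grammar and its weights) are assumptions introduced for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

W = TypeVar("W")


@dataclass(frozen=True)
class Semiring(Generic[W]):
    """Hypothetical container for a semiring (zero, one, plus, times)."""
    zero: W
    one: W
    plus: Callable[[W, W], W]
    times: Callable[[W, W], W]


def cyk_weight(words, lexical, binary, start, sr):
    """Weight of `words` under a weighted CNF grammar, computed in semiring `sr`.

    lexical: {(A, word): weight} for rules A -> word
    binary:  {(A, B, C): weight} for rules A -> B C
    """
    n = len(words)
    # chart[i, j, A] accumulates the weight of all derivations of words[i:j] from A.
    chart = defaultdict(lambda: sr.zero)
    for i, w in enumerate(words):
        for (a, word), wt in lexical.items():
            if word == w:
                chart[i, i + 1, a] = sr.plus(chart[i, i + 1, a], wt)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (a, b, c), wt in binary.items():
                    # Multiply the rule weight with the weights of both sub-spans,
                    # then add the result to the cell for the larger span.
                    prod = sr.times(wt, sr.times(chart[i, k, b], chart[k, j, c]))
                    chart[i, j, a] = sr.plus(chart[i, j, a], prod)
    return chart[0, n, start]


# Toy ambiguous grammar (assumed for illustration): S -> S S | a, each rule weighted 0.5.
LEXICAL = {("S", "a"): 0.5}
BINARY = {("S", "S", "S"): 0.5}

# Sum-product semiring: total weight of all derivations.
INSIDE = Semiring(0.0, 1.0, lambda x, y: x + y, lambda x, y: x * y)
# Max-product semiring: weight of the best derivation.
VITERBI = Semiring(0.0, 1.0, max, lambda x, y: x * y)

if __name__ == "__main__":
    sentence = ["a", "a", "a"]
    print(cyk_weight(sentence, LEXICAL, BINARY, "S", INSIDE))
    print(cyk_weight(sentence, LEXICAL, BINARY, "S", VITERBI))
```

The sentence `a a a` has two derivations under this toy grammar, each of weight 0.5^5 = 0.03125, so the sum-product semiring yields 0.0625 and the max-product semiring yields 0.03125. Swapping the semiring changes what is computed while the chart traversal stays the same, which is the kind of separation between parsing and weight algebra that the abstract refers to.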
