Corpus-Based Induction of Syntactic Structure: Models of Dependency and Constituency

We present a generative model for the unsupervised learning of dependency structures and describe its multiplicative combination with a model of linear constituency. The product model outperforms both components on their respective evaluation metrics, giving the best published figures for unsupervised dependency parsing and unsupervised constituency parsing. We also demonstrate that the combined model is robust cross-linguistically: it can exploit whichever attachment or distributional regularities are salient in the data.
