Concavity and Initialization for Unsupervised Dependency Parsing

We investigate models for unsupervised learning with concave log-likelihood functions. We begin with the most well-known example, IBM Model 1 for word alignment (Brown et al., 1993) and analyze its properties, discussing why other models for unsupervised learning are so seldom concave. We then present concave models for dependency grammar induction and validate them experimentally. We find our concave models to be effective initializers for the dependency model of Klein and Manning (2004) and show that we can encode linguistic knowledge in them for improved performance.

[1]  Ben Taskar,et al.  Posterior Sparsity in Unsupervised Dependency Parsing , 2011, J. Mach. Learn. Res..

[2]  Sabine Buchholz,et al.  CoNLL-X Shared Task on Multilingual Dependency Parsing , 2006, CoNLL.

[3]  Valentin I. Spitkovsky,et al.  From Baby Steps to Leapfrog: How “Less is More” in Unsupervised Dependency Parsing , 2010, NAACL.

[4]  Yoav Goldberg,et al.  EM Can Find Pretty Good HMM POS-Taggers (When Given a Good Start) , 2008, ACL.

[5]  Vladimir Solmon,et al.  The estimation of stochastic context-free grammars using the Inside-Outside algorithm , 2003 .

[6]  Sebastian Riedel,et al.  The CoNLL 2007 Shared Task on Dependency Parsing , 2007, EMNLP.

[7]  Noah A. Smith,et al.  Shared Logistic Normal Distributions for Soft Parameter Tying in Unsupervised Grammar Induction , 2009, NAACL.

[8]  Zdeněk Žabokrtský,et al.  Gibbs Sampling with Treeness Constraint in Unsupervised Dependency Parsing , 2011 .

[9]  Dan Klein,et al.  Corpus-Based Induction of Syntactic Structure: Models of Dependency and Constituency , 2004, ACL.

[10]  Valentin I. Spitkovsky,et al.  Unsupervised Dependency Parsing without Gold Part-of-Speech Tags , 2011, EMNLP.

[11]  Robert L. Mercer,et al.  The Mathematics of Statistical Machine Translation: Parameter Estimation , 1993, CL.

[12]  Steve Young,et al.  Applications of stochastic context-free grammars using the Inside-Outside algorithm , 1990 .

[13]  Phil Blunsom,et al.  Unsupervised Induction of Tree Substitution Grammars for Dependency Parsing , 2010, EMNLP.

[14]  John DeNero,et al.  Painless Unsupervised Learning with Features , 2010, NAACL.

[15]  Kristina Toutanova,et al.  Why Initialization Matters for IBM Model 1: Multiple Optima and Non-Strict Convexity , 2011, ACL.

[16]  Slav Petrov,et al.  A Universal Part-of-Speech Tagset , 2011, LREC.

[17]  Samuel Brody It Depends on the Translation: Unsupervised Dependency Parsing via Word Alignment , 2010, EMNLP.

[18]  Mark Johnson,et al.  Improving Unsupervised Dependency Parsing with Richer Contexts and Smoothing , 2009, NAACL.

[19]  Mark Johnson,et al.  Using Universal Linguistic Knowledge to Guide Grammar Induction , 2010, EMNLP.

[20]  Valentin I. Spitkovsky,et al.  Punctuation: Making a Point in Unsupervised Dependency Parsing , 2011, CoNLL.