Using Semantic Cues to Learn Syntax

We present a method for dependency grammar induction that utilizes sparse annotations of semantic relations. This induction set-up is attractive because such annotations provide useful clues about the underlying syntactic structure, and they are readily available in many domains (e.g., info-boxes and HTML markup). Our method is based on the intuition that syntactic realizations of the same semantic predicate exhibit some degree of consistency. We incorporate this intuition in a directed graphical model that tightly links the syntactic and semantic structures. This design enables us to exploit syntactic regularities while still allowing for variations. Another strength of the model lies in its ability to capture non-local dependency relations. Our results demonstrate that even a small amount of semantic annotations greatly improves the accuracy of learned dependencies when tested on both in-domain and out-of-domain texts.

[1]  Daniel S. Weld,et al.  Open Information Extraction Using Wikipedia , 2010, ACL.

[2]  Mark Johnson,et al.  Improving Unsupervised Dependency Parsing with Richer Contexts and Smoothing , 2009, NAACL.

[3]  Raymond J. Mooney,et al.  Learning a Compositional Semantic Parser using an Existing Syntactic Parser , 2009, ACL.

[4]  Noah D. Goodman,et al.  A Bayesian Model of the Acquisition of Compositional Semantics , 2008 .

[5]  Valentin I. Spitkovsky,et al.  Profiting from Mark-Up: Hyper-Text Annotations for Guided Parsing , 2010, ACL.

[6]  Daniel Jurafsky,et al.  Distant supervision for relation extraction without labeled data , 2009, ACL.

[7]  Mark Johnson,et al.  Using Universal Linguistic Knowledge to Guide Grammar Induction , 2010, EMNLP.

[8]  Dan Klein,et al.  Corpus-Based Induction of Syntactic Structure: Models of Dependency and Constituency , 2004, ACL.

[9]  Richard Johansson,et al.  Extended Constituent-to-Dependency Conversion for English , 2007, NODALIDA.

[10]  Hoifung Poon,et al.  Unsupervised Semantic Parsing , 2009, EMNLP.

[11]  Thomas L. Griffiths,et al.  Bayesian Inference for PCFGs via Markov Chain Monte Carlo , 2007, NAACL.

[12]  David Goddeau,et al.  Automatic grammar induction from semantic parsing , 1998, ICSLP.

[13]  Phil Blunsom,et al.  Unsupervised Induction of Tree Substitution Grammars for Dependency Parsing , 2010, EMNLP.

[14]  Christopher D. Manning,et al.  Enriching the Knowledge Sources Used in a Maximum Entropy Part-of-Speech Tagger , 2000, EMNLP.

[15]  Beatrice Santorini,et al.  Building a Large Annotated Corpus of English: The Penn Treebank , 1993, CL.

[16]  Gideon S. Mann,et al.  Semi-supervised Learning of Dependency Parsers using Generalized Expectation Criteria , 2009, ACL/IJCNLP.

[17]  Ann Bies,et al.  Bracketing Guidelines For Treebank II Style Penn Treebank Project , 1995 .