Learning latent variable grammars from complementary perspectives

The corpus used to train a parser consists of sentences with heterogeneous grammar usage. Previous work on parser domain adaptation has concentrated on adapting to shifts in vocabulary rather than in grammar usage. In this paper, we focus on exploiting the diversity of the training data: we model its different grammar usages separately and then accumulate their advantages. We propose an approach in which each grammar is biased toward a relevant syntactic style, and the complementary grammars are combined for inference. Multiple grammars with partly complementary strengths are induced individually. They capture complementary representations of the data, and we combine them in a joint model that assembles their complementary descriptive power. In addition to being compatible with many other methods, our product model achieves an F1 score of 85.20% on the Penn Chinese Treebank, higher than previous systems.
