Dependency Parse Reranking with Rich Subtree Features

Highly accurate syntactic analysis is a crucial step toward machine understanding of human language. In this work, we focus on dependency grammar, which models syntax by encoding transparent predicate-argument structures. Recent advances in dependency parsing have shown that employing higher-order subtree structures in graph-based parsers can substantially improve parsing accuracy. However, the computational cost of this approach grows with the order of the subtrees. This work explores a new reranking approach for dependency parsing that can utilize complex subtree representations by applying efficient subtree selection methods. We demonstrate the effectiveness of the approach in experiments conducted on the Penn Treebank and the Chinese Treebank. Our system achieves the best performance among known supervised systems evaluated on these datasets, improving the baseline accuracy from 91.88% to 93.42% for English and from 87.39% to 89.25% for Chinese.
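As a rough illustration of the reranking idea summarized above, the following minimal sketch rescores a k-best list of candidate parses with a linear model over subtree features. It is not the system described here: the names `chain_features` and `rerank`, the toy weights, and the restriction to ancestor chains (a word, its head, its grandparent, and so on) are all assumptions made for brevity, and the subtree selection step is not reproduced.

```python
from collections import Counter
from typing import Dict, List, Tuple

# A candidate parse: heads[i] is the 1-based head of token i+1; 0 marks the root.
Parse = List[int]


def chain_features(words: List[str], heads: Parse, max_order: int = 3) -> Counter:
    """Count ancestor chains (word <- head <- grandparent ...) of up to max_order arcs.

    Hypothetical simplification: only chain-shaped subtrees are extracted.
    """
    feats: Counter = Counter()
    for i in range(1, len(words) + 1):
        chain = [words[i - 1]]
        node = i
        for _ in range(max_order):
            node = heads[node - 1]
            if node == 0:  # reached the root, no more ancestors
                break
            chain.append(words[node - 1])
            feats["<-".join(chain)] += 1
    return feats


def rerank(
    words: List[str],
    candidates: List[Tuple[float, Parse]],
    weights: Dict[str, float],
) -> Parse:
    """Return the candidate maximizing base parser score plus weighted feature counts."""

    def total(candidate: Tuple[float, Parse]) -> float:
        base, heads = candidate
        feats = chain_features(words, heads)
        return base + sum(weights.get(f, 0.0) * n for f, n in feats.items())

    return max(candidates, key=total)[1]


# Toy example (all scores and weights are made up for illustration).
words = ["ate", "pizza", "with", "friends"]
candidates = [
    (1.0, [0, 1, 2, 3]),  # "with" attached to "pizza"
    (0.9, [0, 1, 1, 3]),  # "with" attached to "ate"
]
weights = {"with<-ate": 0.5, "with<-pizza": -0.2}
best = rerank(words, candidates, weights)
```

In this toy setup the base parser prefers the first candidate, but the subtree feature weights shift the decision to the second one. In practice the weights would be learned from treebank data rather than set by hand.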
