Paper Information - Syntactic Parse Fusion

Syntactic Parse Fusion

Model combination techniques have consistently shown state-of-the-art performance across multiple tasks, including syntactic parsing. However, they dramatically increase runtime and can be difficult to employ in practice. We demonstrate that applying constituency model combination techniques to n-best lists instead of n different parsers results in significant parsing accuracy improvements. Parses are weighted by their probabilities and combined using an adapted version of the method of Sagae and Lavie (2006). These accuracy gains come with marginal computational costs and are obtained on top of existing parsing techniques such as discriminative reranking and self-training, resulting in state-of-the-art accuracy: 92.6% on WSJ section 23. On out-of-domain corpora, accuracy is improved by 0.4% on average. We empirically confirm that six well-known n-best parsers benefit from the proposed methods across six domains.
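As a rough illustration of the weighting-and-combination idea in the abstract, the sketch below votes over constituents from an n-best list. It is not the authors' implementation: the function name `fuse_constituents`, the representation of a parse as a set of `(label, start, end)` spans, the normalization of log probabilities into weights, and the 0.5 threshold are all assumptions made for this example, and the reparsing step of Sagae and Lavie (2006) is replaced here by simple thresholding.

```python
import math
from collections import defaultdict


def fuse_constituents(nbest, threshold=0.5):
    """Combine an n-best list by probability-weighted constituent voting.

    nbest: list of (log_prob, constituents) pairs, where constituents is a
    set of (label, start, end) spans. This format is hypothetical.
    """
    # Turn log probabilities into weights that sum to 1.
    # Subtracting the maximum keeps exp() numerically stable.
    max_lp = max(lp for lp, _ in nbest)
    raw = [math.exp(lp - max_lp) for lp, _ in nbest]
    total = sum(raw)
    weights = [r / total for r in raw]

    # Each constituent's score is the total weight of the parses containing it.
    scores = defaultdict(float)
    for weight, (_, constituents) in zip(weights, nbest):
        for constituent in constituents:
            scores[constituent] += weight

    # Keep constituents backed by more than `threshold` of the weight mass.
    return {c for c, s in scores.items() if s > threshold}


# Toy 3-best list for a three-word sentence (made-up probabilities).
nbest = [
    (math.log(0.5), {("S", 0, 3), ("NP", 0, 1), ("VP", 1, 3)}),
    (math.log(0.3), {("S", 0, 3), ("NP", 0, 2), ("VP", 2, 3)}),
    (math.log(0.2), {("S", 0, 3), ("NP", 0, 1), ("VP", 1, 3), ("NP", 2, 3)}),
]
fused = fuse_constituents(nbest)
```

With a threshold above 0.5, any two surviving constituents must co-occur in at least one input parse, so the kept spans cannot cross and still form a valid tree. Lower thresholds would trade that guarantee for recall and would need an explicit reparsing step.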

Eugene Charniak | David McClosky | Do Kook Choe

[1] Alexandra Kinyon, et al. Building a Treebank for French, 2000, LREC.

[2] Kevin Knight, et al. Combining Constituent Parsers, 2009, NAACL.

[3] M. Maamouri, et al. The Penn Arabic Treebank: Building a Large-Scale Annotated Arabic Corpus, 2004.

[4] Dan Klein, et al. Learning Accurate, Compact, and Interpretable Tree Annotation, 2006, ACL.

[5] Robert Dale, et al. Charting Democracy Across Parsers, 2007, ALTA.

[6] Eugene Charniak, et al. Coarse-to-Fine n-Best Parsing and MaxEnt Discriminative Reranking, 2005, ACL.

[7] John J. Godfrey, et al. SWITCHBOARD: telephone speech corpus for research and development, 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[8] Sabine Brants, et al. The TIGER Treebank, 2001.

[9] Mark Johnson, et al. Reranking the Berkeley and Brown Parsers, 2010, HLT-NAACL.

[10] Reut Tsarfaty, et al. Introducing the SPMRL 2014 Shared Task on Parsing Morphologically-rich Languages, 2014.

[11] Brian Roark, et al. MAP adaptation of stochastic grammars, 2006, Comput. Speech Lang.

[12] Haizhou Li, et al. K-Best Combination of Syntactic Parsers, 2009, EMNLP.

[13] Joakim Nivre, et al. Analyzing and Integrating Dependency Parsers, 2011, CL.

[14] Alon Lavie, et al. Parser Combination by Reparsing, 2006, NAACL.

[15] Liang Huang, et al. Forest Reranking: Discriminative Parsing with Non-Local Features, 2008, ACL.

[16] Mary P. Harper, et al. Self-Training with Products of Latent Variable Grammars, 2010, EMNLP.

[17] Slav Petrov, et al. Products of Random Latent Variable Grammars, 2010, NAACL.

[18] Andrew Y. Ng, et al. Parsing with Compositional Vector Grammars, 2013, ACL.

[19] Christopher D. Manning, et al. Better Arabic Parsing: Baselines, Evaluations, and Analysis, 2010, COLING.