Heterogeneous Parsing via Collaborative Decoding

Multiple corpora often exist for the same natural language processing (NLP) task. However, such corpora are generally used independently because their annotation standards differ. To make full use of readily available human annotations, it is important to exploit multiple corpora with different annotation standards at the same time. In this paper, we focus on constituent syntactic parsing with treebanks that follow different annotation standards, and we propose a collaborative decoding (or co-decoding) approach that improves parsing accuracy by leveraging the consensus on bracket structures between multiple parsing decoders, each trained on an individual treebank. Experimental results show that the proposed approach is effective and outperforms state-of-the-art baselines, especially on long sentences.
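To make the idea of bracket structure consensus concrete, the following is a minimal sketch and not the procedure used in the paper. It assumes a simplified setting in which each decoder has already produced a k-best list of complete trees, and it reranks one list by its agreement with the other. Brackets are compared as unlabeled spans, on the assumption that label sets are not directly comparable across annotation standards. The tree encoding, the names `brackets`, `consensus`, and `rerank`, and the interpolation `weight` are all hypothetical choices made for illustration.

```python
from typing import List, Set, Tuple, Union

# Hypothetical tree encoding: a leaf is a word (str); an internal node is
# a tuple (label, child, child, ...).
Tree = Union[str, tuple]


def brackets(tree: Tree) -> Set[Tuple[int, int]]:
    """Return the unlabeled spans (start, end) covering more than one word."""
    spans: Set[Tuple[int, int]] = set()

    def walk(node: Tree, start: int) -> int:
        if isinstance(node, str):          # a word occupies one position
            return start + 1
        end = start
        for child in node[1:]:             # node[0] is the label
            end = walk(child, end)
        if end - start > 1:                # skip single-word spans
            spans.add((start, end))
        return end

    walk(tree, 0)
    return spans


def consensus(candidate: Tree, partner_kbest: List[Tree]) -> float:
    """Average fraction of the candidate's brackets found in partner trees."""
    cand = brackets(candidate)
    if not cand or not partner_kbest:
        return 0.0
    shared = sum(len(cand & brackets(t)) for t in partner_kbest)
    return shared / (len(cand) * len(partner_kbest))


def rerank(kbest: List[Tuple[float, Tree]],
           partner_kbest: List[Tree],
           weight: float = 1.0) -> Tree:
    """Pick the tree maximizing model score + weight * consensus."""
    rescored = [(score + weight * consensus(tree, partner_kbest), tree)
                for score, tree in kbest]
    return max(rescored, key=lambda pair: pair[0])[1]
```

In this sketch the consensus term only rewards span agreement, so a candidate with a lower model score can still win when the decoder trained on the other treebank supports its bracketing. Applying the same kind of score to partial hypotheses during decoding, rather than to complete trees afterwards, would require access to each decoder's chart and is not shown here.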
