Charting Democracy Across Parsers

Different parsers trained on the same corpus produce different results, both in overall performance and in the individual analyses they provide. In particular, for a given sentence one parser may produce a correct analysis where another produces an incorrect one, while for a different sentence the first parser may be in error and the second correct. In this paper, we exploit this observation by exploring how the results of several parsers may be combined to achieve better performance than any single parser. The method constructs a chart containing edges contributed by a collection of parsers and uses a simple voting mechanism to choose the most preferred constituents; this yields a significant improvement in performance over any individual parser. More sophisticated voting mechanisms are also discussed.
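The chart-plus-voting idea can be illustrated with a minimal sketch. This is not the paper's implementation: the edge representation (label, start, end), the function names `build_chart` and `vote`, the strict-majority threshold, and the toy parser outputs are all assumptions made for illustration. Each parser contributes its constituents as edges to a shared chart, and an edge is kept when more than half of the parsers propose it.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple

# Hypothetical edge representation: (label, start, end), with end exclusive.
Edge = Tuple[str, int, int]


def build_chart(parses: Iterable[Set[Edge]]) -> Counter:
    """Pool the edges of every parser; the count is the number of votes."""
    chart: Counter = Counter()
    for edges in parses:
        chart.update(set(edges))  # each parser votes at most once per edge
    return chart


def vote(chart: Counter, n_parsers: int) -> List[Edge]:
    """Keep edges proposed by a strict majority of the parsers."""
    kept = [edge for edge, votes in chart.items() if votes > n_parsers / 2]
    # Order by start position, then widest span first, then label.
    return sorted(kept, key=lambda e: (e[1], -e[2], e[0]))


# Toy outputs of three hypothetical parsers for one four-word sentence.
parser_a = {("S", 0, 4), ("NP", 0, 1), ("VP", 1, 4), ("NP", 2, 4)}
parser_b = {("S", 0, 4), ("NP", 0, 1), ("VP", 1, 4), ("PP", 2, 4)}
parser_c = {("S", 0, 4), ("NP", 0, 2), ("VP", 2, 4), ("NP", 2, 4)}

chart = build_chart([parser_a, parser_b, parser_c])
winners = vote(chart, n_parsers=3)
print(winners)
```

With a strict-majority threshold, any two surviving edges must both occur in at least one parser's tree, so they cannot cross each other. A lower threshold or weighted votes would need an explicit step to resolve conflicting brackets.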
