Wide-Coverage Deep Statistical Parsing Using Automatic Dependency Structure Annotation

A number of researchers have recently conducted experiments comparing deep, hand-crafted, wide-coverage parsers with shallow treebank- and machine-learning-based parsers at the level of dependencies, using simple automatic methods to convert the tree output of the shallow parsers into dependencies. In this article, we revisit such experiments, this time using sophisticated automatic LFG f-structure annotation methodologies, with surprising results. We compare various PCFG and history-based parsers to find the baseline parsing system that best fits our automatic dependency structure annotation technique. This combined system of syntactic parser and dependency structure annotation is then compared to two hand-crafted, deep, constraint-based parsers, RASP and XLE. We evaluate against dependency-based gold standards and use the Approximate Randomization Test to assess the statistical significance of the results. Our experiments show that machine-learning-based shallow grammars augmented with sophisticated automatic dependency annotation technology outperform hand-crafted, deep, wide-coverage constraint grammars. Currently, our best system achieves an f-score of 82.73% against the PARC 700 Dependency Bank, a statistically significant improvement of 2.18 percentage points over the most recent result of 80.55% for the hand-crafted LFG grammar and XLE parsing system. It also achieves an f-score of 80.23% against the CBS 500 Dependency Bank, a statistically significant improvement of 3.66 percentage points over the 76.57% achieved by the hand-crafted RASP grammar and parsing system.
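
As an illustration of the kind of significance test named above, the following is a minimal sketch of an approximate randomization test on the f-score difference between two parsers. It is not the authors' implementation. The per-sentence count triples, the function names (`f_score`, `approximate_randomization`), the default number of trials, and the p-value smoothing are all assumptions made for the example.

```python
import random


def f_score(counts):
    """Corpus-level f-score from per-sentence (matched, predicted, gold) triples."""
    matched = sum(c[0] for c in counts)
    predicted = sum(c[1] for c in counts)
    gold = sum(c[2] for c in counts)
    if matched == 0:
        return 0.0
    precision = matched / predicted
    recall = matched / gold
    return 2 * precision * recall / (precision + recall)


def approximate_randomization(sys_a, sys_b, trials=10000, seed=0):
    """Estimate a p-value for the f-score difference between two systems.

    sys_a and sys_b are hypothetical inputs: aligned lists holding one
    (matched, predicted, gold) dependency-count triple per test sentence.
    """
    rng = random.Random(seed)
    observed = abs(f_score(sys_a) - f_score(sys_b))
    at_least_as_extreme = 0
    for _ in range(trials):
        shuffled_a, shuffled_b = [], []
        for a, b in zip(sys_a, sys_b):
            # Swap the two systems' outputs for this sentence with probability 0.5.
            if rng.random() < 0.5:
                a, b = b, a
            shuffled_a.append(a)
            shuffled_b.append(b)
        if abs(f_score(shuffled_a) - f_score(shuffled_b)) >= observed:
            at_least_as_extreme += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (trials + 1)
```

A small p-value suggests that the observed difference is unlikely to arise if the two systems' outputs were interchangeable on each sentence. Shuffling whole sentences rather than individual dependencies respects the fact that dependencies within a sentence are not independent.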
