Some notes on the PARC 700 Dependency Bank

The PARC 700 dependency bank is a potentially very useful resource for parser evaluation, but it has a high barrier to entry for two reasons: its tokenisation differs considerably from that of its source, the Penn Treebank, and it contains no representation of word order, which produces an uncertainty factor of some 15%. It also contains a small, but perhaps not insignificant, number of errors. When the dependency bank is used for evaluation, these properties are likely to inflate the mismatch counts, so eliminating them is desirable for more accurate measurements. The work reported here consists of an automatic conversion of the dependency bank into a Prolog representation in which word order is explicit, together with graphical representations of the dependency trees for all 700 sentences, generated automatically from the Prolog data. As a side effect of the transformation, errors were detected and corrected. It is hoped that this work will lead to more widespread use of the PARC 700 dependency bank for parser evaluation.
