Memory-based re-engineering of a knowledge-based dependency parser

This paper describes the emulation of a knowledge-based dependency parser for Dutch by a fast approximation of a memory-based learning algorithm. During the development of the original parser, hand-parsed test sentences were collected to offer stochastic guidance in the parsing process. Training a memory-based parser directly on these collections yields a reasonable but not very accurate emulation. However, when the memory-based parser is trained on a much larger collection of texts that were automatically parsed by the knowledge-based parser, the learning curve can be extended further. The resulting re-engineered parser runs in time linear in the length of the input sequence; through brute force, the costly computations of the parser are precompiled into memory, from which retrieval is cheap.
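The idea of precompiling a costly parser into memory can be illustrated with a minimal Python sketch. It is not the authors' system: `oracle_label` is a hypothetical toy stand-in for the knowledge-based parser, the three-word window and the label names are assumptions, and the prefix back-off only loosely imitates a tree-based approximation of memory-based learning. Each token is classified with a fixed number of dictionary lookups, so the running time grows linearly with sentence length.

```python
from collections import Counter, defaultdict


def oracle_label(prev_word, word, next_word):
    """Hypothetical stand-in for the slow knowledge-based parser (toy rules only)."""
    if word in {"the", "a"}:
        return "det"
    if prev_word in {"the", "a"}:
        return "head-noun"
    if word.endswith("s") and prev_word != "<s>":
        return "verb"
    return "other"


def windows(tokens):
    """Yield (focus, previous, next) feature tuples, one per token."""
    padded = ["<s>"] + list(tokens) + ["</s>"]
    for i in range(1, len(padded) - 1):
        # The focus word comes first so that back-off drops context before the word itself.
        yield padded[i], padded[i - 1], padded[i + 1]


class MemoryBasedEmulator:
    """Stores oracle decisions under every feature prefix and looks them up at test time."""

    def __init__(self):
        self.memory = defaultdict(Counter)

    def train(self, sentences):
        # "Brute force" step: run the expensive oracle once and memorize its output.
        for tokens in sentences:
            for feats in windows(tokens):
                focus, prev, nxt = feats
                label = oracle_label(prev, focus, nxt)
                for k in range(len(feats) + 1):
                    self.memory[feats[:k]][label] += 1

    def predict(self, tokens):
        labels = []
        for feats in windows(tokens):
            # Try the longest stored prefix first, then back off to shorter ones.
            for k in range(len(feats), -1, -1):
                counts = self.memory.get(feats[:k])
                if counts:
                    labels.append(counts.most_common(1)[0][0])
                    break
            else:
                labels.append(None)  # nothing memorized yet
        return labels
```

After training on automatically labeled sentences, `predict` never calls `oracle_label`; unseen word combinations are handled by falling back to shorter feature prefixes, which is where a larger automatically parsed training collection would be expected to help.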