A Deterministic Word Dependency Analyzer Enhanced With Preference Learning

Word dependency is important in parsing technology. Some applications, such as Information Extraction from biological documents, benefit from word dependency analysis even without phrase labels. We therefore expect that an accurate dependency analyzer that can be trained without phrase labels would be useful. Although such an English word dependency analyzer was proposed by Yamada and Matsumoto, its accuracy is lower than that of state-of-the-art phrase structure parsers because it lacks the top-down information given by phrase labels. This paper shows that the dependency analyzer can be improved by introducing a Root-Node Finder and a Prepositional-Phrase Attachment Resolver. Experimental results show that these modules, which are based on Preference Learning, give better scores than Collins' Model 3 parser on these subproblems. We expect that this method is also applicable to phrase structure parsers.
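To make the role of Preference Learning concrete, the sketch below shows one way a pairwise ranking model can be trained to pick a sentence's root word: the correct root candidate is constrained to score higher than every other word, in the spirit of the ranking formulations of Herbrich et al. [13] and Joachims [5]. The function names, the perceptron-style update, and the toy feature representation are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of pairwise preference learning for root-node selection.
# Everything here (names, update rule, toy data) is an assumption for
# illustration; the paper's modules use an SVM-style preference learner.
import numpy as np

def train_pairwise_ranker(sentences, n_features, epochs=10, lr=0.1):
    """Learn a weight vector w so that score(gold root) > score(other words).

    `sentences` is a list of (candidate_feature_matrix, gold_index) pairs,
    where each row of the matrix is the feature vector of one word treated
    as a root-node candidate.
    """
    w = np.zeros(n_features)
    for _ in range(epochs):
        for feats, gold in sentences:
            scores = feats @ w
            for i in range(feats.shape[0]):
                if i == gold:
                    continue
                # Preference constraint: the gold candidate should outrank word i.
                if scores[gold] - scores[i] <= 0:
                    w += lr * (feats[gold] - feats[i])
    return w

def predict_root(w, feats):
    """Pick the highest-scoring candidate as the sentence root."""
    return int(np.argmax(feats @ w))

# Toy usage: 3 candidate words per sentence, 4 features per candidate.
rng = np.random.default_rng(0)
data = [(rng.normal(size=(3, 4)), int(rng.integers(0, 3))) for _ in range(20)]
w = train_pairwise_ranker(data, n_features=4)
print(predict_root(w, data[0][0]))
```

The same pairwise formulation can presumably be applied to ranking prepositional-phrase attachment sites; the random data above only illustrates the shape of the training signal, not the features used in the paper.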

[1] Christopher D. Manning et al. Enriching the Knowledge Sources Used in a Maximum Entropy Part-of-Speech Tagger, 2000, EMNLP.

[2] Yuji Matsumoto et al. Chunking with Support Vector Machines, 2001, NAACL.

[3] Michael Collins et al. Head-Driven Statistical Models for Natural Language Parsing, 2003, CL.

[4] Ralph Grishman et al. An Improved Extraction Pattern Representation Model for Automatic IE Pattern Acquisition, 2003, ACL.

[5] Thorsten Joachims et al. Optimizing search engines using clickthrough data, 2002, KDD.

[6] Michael Collins et al. New Ranking Algorithms for Parsing and Tagging: Kernels over Discrete Structures, and the Voted Perceptron, 2002, ACL.

[7] Dustin Boswell et al. Introduction to Support Vector Machines, 2002.

[8] Jun Suzuki et al. Hierarchical Directed Acyclic Graph Kernel: Methods for Structured Natural Language Data, 2003, ACL.

[9] Eugene Charniak et al. A Maximum-Entropy-Inspired Parser, 2000, ANLP.

[10] Jason Eisner et al. Three New Probabilistic Models for Dependency Parsing: An Exploration, 1996, COLING.

[11] Nello Cristianini et al. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, 2000.

[12] Beatrice Santorini et al. Building a Large Annotated Corpus of English: The Penn Treebank, 1993, CL.

[13] Thore Graepel et al. Large Margin Rank Boundaries for Ordinal Regression, 2000.

[14] K. Obermayer et al. Learning Preference Relations for Information Retrieval, 1998.

[15] Mark Steedman et al. Generative Models for Statistical Parsing with Combinatory Categorial Grammar, 2002, ACL.

[16] Adwait Ratnaparkhi et al. A Maximum Entropy Model for Part-Of-Speech Tagging, 1996, EMNLP.

[17] Thorsten Joachims et al. Making large-scale support vector machine learning practical, 1999.

[18] Hideki Isozaki et al. Efficient Support Vector Classifiers for Named Entity Recognition, 2002, COLING.

[19] Yuji Matsumoto et al. Japanese Dependency Analysis using Cascaded Chunking, 2002, CoNLL.

[20] Vladimir N. Vapnik et al. The Nature of Statistical Learning Theory, 2000, Statistics for Engineering and Information Science.

[21] Daniel Dominic Sleator et al. Parsing English with a Link Grammar, 1995, IWPT.

[22] Michael Collins et al. Three Generative, Lexicalised Models for Statistical Parsing, 1997, ACL.

[23] Aravind K. Joshi et al. An SVM-based voting algorithm with application to parse reranking, 2003, CoNLL.