Learning Head-modifier Pairs to Improve Lexicalized Dependency Parsing on a Chinese Treebank

Because of data sparseness, a treebank alone may not supply enough lexical information for a lexicalized parser. This paper proposes an approach that learns head-modifier pairs from a raw corpus and integrates them into a lexicalized dependency parser to parse a Chinese Treebank. Experimental results show that this approach not only enlarges the coverage of bi-lexical dependencies but also significantly improves the accuracy of dependency parsing.
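To make the notion of bi-lexical coverage concrete, the following is a minimal sketch, not the paper's implementation. It assumes coverage means the fraction of head-modifier pairs in a test set that also appear in the set of pairs known to the parser. The function name `bilexical_coverage` and all word pairs are hypothetical toy data chosen for illustration.

```python
from typing import Iterable, Set, Tuple

# A head-modifier pair as (head word, modifier word).
Pair = Tuple[str, str]


def bilexical_coverage(known: Set[Pair], test_pairs: Iterable[Pair]) -> float:
    """Return the fraction of test pairs that occur in the known pair set.

    Assumption: this is one plausible definition of coverage; the paper
    may define it differently.
    """
    pairs = list(test_pairs)
    if not pairs:
        return 0.0
    hits = sum(1 for pair in pairs if pair in known)
    return hits / len(pairs)


# Hypothetical toy data: pairs seen in a treebank, pairs learned from a
# raw corpus, and pairs that occur in held-out test sentences.
treebank_pairs = {("吃", "苹果"), ("看", "书")}
learned_pairs = {("喝", "茶"), ("看", "电影")}
test_pairs = [("吃", "苹果"), ("喝", "茶"), ("看", "电影"), ("写", "信")]

base = bilexical_coverage(treebank_pairs, test_pairs)
extended = bilexical_coverage(treebank_pairs | learned_pairs, test_pairs)
print(f"treebank only: {base:.2f}, with learned pairs: {extended:.2f}")
```

In this toy setting, adding the learned pairs raises coverage because more of the test pairs have been observed. How such pairs then influence parsing decisions depends on the parser's model, which the sketch does not cover.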
