Generalization of Words for Chinese Dependency Parsing

In this paper, we investigate how generalizing words affects the accuracy of Chinese dependency parsing. Specifically, in our shift-reduce parser, we use a neural language model based word embedding (NLMWE) method (Bengio et al., 2003) to generate distributed word feature vectors and then perform K-means based word clustering to generate word classes. We design feature templates that make use of words, part-of-speech (POS) tags, coarse-grained POS (CPOS) tags, NLMWE-based word classes, and their combinations. NLMWE-based word classes are shown to be an important supplement to POS tags, especially when the POS tags are automatically generated. Experiments on a Query treebank, CTB5, and CTB7 show that combining features from CPOS tags, POS tags, and NLMWE-based word classes yields the best unlabelled attachment scores (UASs). Our final UAS−p (excluding punctuation) of 86.79% on the CTB5 test set is comparable to state-of-the-art results. Our final UAS−p scores of 86.80% and 87.05% on the CTB7 Stanford dependency test set and the original test set, respectively, are significantly better than those of three well-known open-source dependency parsers.
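The clustering step described above can be illustrated with a minimal sketch. It assumes the word vectors are already available; the toy two-dimensional vectors, the `kmeans` helper, the `class_features` function, and the `C=`/`P=` template names are all hypothetical and are not taken from the paper, whose actual embeddings, number of clusters, and feature templates are not specified here.

```python
import random


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain Lloyd's k-means over a list of equal-length float lists."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct input vectors.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, v in enumerate(vectors):
            assignment[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment


# Toy 2-d "embeddings" (hypothetical values, not from the paper).
embeddings = {
    "北京": [0.9, 0.1],
    "上海": [0.8, 0.2],
    "吃": [0.1, 0.9],
    "喝": [0.2, 0.8],
}
words = list(embeddings)
labels = kmeans([embeddings[w] for w in words], k=2)
word_class = dict(zip(words, labels))  # word -> cluster id used as its word class


def class_features(word, pos):
    """Hypothetical templates: word class, POS tag, and their combination."""
    cls = word_class.get(word, -1)  # -1 marks a word with no known class
    return [f"C={cls}", f"P={pos}", f"C={cls}|P={pos}"]
```

In a sketch like this, the cluster id acts as a coarser stand-in for the word itself, so a feature such as `C=…|P=…` can still fire for words that are rare in the treebank but have an embedding.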

[1] Bernd Bohnet, et al. Top Accuracy and Fast Dependency Parsing is not a Contradiction, 2010, COLING.

[2] Stephen Clark, et al. A Tale of Two Parsers: Investigating and Combining Graph-based and Transition-based Dependency Parsing, 2008, EMNLP.

[3] Kenji Sagae, et al. Dynamic Programming for Linear-Time Incremental Parsing, 2010, ACL.

[4] Joakim Nivre, et al. MaltParser: A Data-Driven Parser-Generator for Dependency Parsing, 2006, LREC.

[5] Fei Xia, et al. The Penn Chinese TreeBank: Phrase structure annotation of a large corpus, 2005, Natural Language Engineering.

[6] Yuji Matsumoto, et al. Efficient Stacked Dependency Parsing by Forest Reranking, 2013, Transactions of the Association for Computational Linguistics.

[7] Xavier Carreras, et al. Simple Semi-supervised Dependency Parsing, 2008, ACL.

[8] Weiwei Sun, et al. Data-driven, PCFG-based and Pseudo-PCFG-based Models for Chinese Dependency Parsing, 2013, Transactions of the Association for Computational Linguistics.

[9] Daniel Jurafsky, et al. Discriminative Reordering with Chinese Grammatical Relations Features, 2009, SSST@HLT-NAACL.

[10] Tiejun Zhao, et al. Compound Embedding Features for Semi-supervised Learning, 2013, NAACL.

[11] Joakim Nivre, et al. An Efficient Algorithm for Projective Dependency Parsing, 2003, IWPT.

[12] Valentin I. Spitkovsky, et al. A Comparison of Chinese Parsers for Stanford Dependencies, 2012, ACL.

[13] Hao Zhang, et al. Generalized Higher-Order Dependency Parsing with Cube Pruning, 2012, EMNLP.

[14] Yoshua Bengio, et al. A Neural Probabilistic Language Model, 2003, J. Mach. Learn. Res.

[15] Wanxiang Che, et al. A Separately Passive-Aggressive Training Algorithm for Joint POS Tagging and Dependency Parsing, 2012, COLING.

[16] Yuji Matsumoto, et al. Japanese Dependency Analysis using Cascaded Chunking, 2002, CoNLL.

[17] Yoshua Bengio, et al. Word Representations: A Simple and General Method for Semi-Supervised Learning, 2010, ACL.

[18] Jingbo Zhu, et al. Easy-First POS Tagging and Dependency Parsing with Beam Search, 2013, ACL.

[19] Joakim Nivre, et al. Algorithms for Deterministic Incremental Dependency Parsing, 2008, CL.

[20] Fernando Pereira, et al. Online Learning of Approximate Dependency Parsing Algorithms, 2006, EACL.

[21] Joakim Nivre, et al. Transition-based Dependency Parsing with Rich Non-local Features, 2011, ACL.