Feature Embedding for Dependency Parsing

In this paper, we propose an approach that automatically learns feature embeddings to address the feature sparseness problem in dependency parsing. Inspired by word embeddings, feature embeddings are distributed representations of features learned from large amounts of auto-parsed data. Our goal is to learn feature embeddings that not only make full use of well-established hand-designed features but also benefit from the hidden-class representations of features. Based on these feature embeddings, we present a set of new features for graph-based dependency parsing models. Experiments on standard Chinese and English data sets show that the new parser achieves significant performance improvements over a strong baseline.
