Low-Rank Tensors for Scoring Dependency Structures

Accurate scoring of syntactic structures such as head-modifier arcs in dependency parsing typically requires rich, high-dimensional feature representations. A small subset of such features is often selected manually. This is problematic when features lack clear linguistic meaning, as in embeddings, or when the information is blended across features. In this paper, we use tensors to map high-dimensional feature vectors into low-dimensional representations. We explicitly maintain the parameters as a low-rank tensor to obtain low-dimensional representations of words in their syntactic roles, and to leverage modularity in the tensor for easy training with online algorithms. Our parser consistently outperforms the Turbo and MST parsers across 14 different languages. We also obtain the best published UAS results on 5 languages.
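To make the low-rank idea concrete, the following is a minimal sketch and not the parser's actual implementation. It assumes the parameter tensor is a sum of `rank` rank-1 terms, stored as three factor matrices `U`, `V`, and `W` that project head, modifier, and arc feature vectors into a shared low-dimensional space. The abstract does not specify this three-way split; it and all function and variable names here are hypothetical.

```python
import random

def project(matrix, features):
    """Multiply a (rank x n) matrix by a sparse feature vector {index: value}."""
    return [sum(row[i] * v for i, v in features.items()) for row in matrix]

def low_rank_arc_score(U, V, W, phi_head, phi_mod, phi_arc):
    """Score one head-modifier arc with a rank-r tensor kept in factored form.

    Assumed form: score = sum_r (U phi_head)_r * (V phi_mod)_r * (W phi_arc)_r.
    The full tensor is never materialized.
    """
    u = project(U, phi_head)   # low-dimensional representation of the head
    v = project(V, phi_mod)    # low-dimensional representation of the modifier
    w = project(W, phi_arc)    # low-dimensional representation of the arc
    return sum(a * b * c for a, b, c in zip(u, v, w))

def random_matrix(rows, cols, rng):
    """Small random initial weights (illustrative only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

# Toy dimensions: 6 word features, 4 arc features, rank 3 (all hypothetical).
rng = random.Random(0)
rank, n_word, n_arc = 3, 6, 4
U = random_matrix(rank, n_word, rng)
V = random_matrix(rank, n_word, rng)
W = random_matrix(rank, n_arc, rng)

phi_head = {0: 1.0, 4: 1.0}   # sparse binary features of the head word
phi_mod = {2: 1.0}            # sparse binary features of the modifier word
phi_arc = {1: 1.0}            # sparse features of the arc (e.g. distance bin)

score = low_rank_arc_score(U, V, W, phi_head, phi_mod, phi_arc)
```

Under these assumptions, the factored form stores `rank * (2 * n_word + n_arc)` parameters instead of the `n_word * n_word * n_arc` entries of the full tensor, and each factor matrix can be updated separately while the other two are held fixed.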
