Hierarchical Low-Rank Tensors for Multilingual Transfer Parsing

Accurate multilingual transfer parsing typically relies on careful feature engineering. In this paper, we propose a hierarchical tensor-based approach for this task. This approach induces a compact feature representation by combining atomic features. However, unlike traditional tensor models, it enables us to incorporate prior knowledge about desired feature interactions, eliminating invalid feature combinations. To this end, we use a hierarchical structure in which intermediate embeddings capture the desired feature combinations. Algebraically, this hierarchical tensor is equivalent to a sum of traditional tensors with shared components, and thus can be trained effectively with standard online algorithms. In both unsupervised and semi-supervised transfer scenarios, our hierarchical tensor consistently improves UAS and LAS over state-of-the-art multilingual transfer parsers and the traditional tensor model across 10 different languages.
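The algebraic equivalence mentioned above can be illustrated with a small numerical sketch. The two-level structure, the factor matrices U, V, W, X, the rank, and the feature dimensions below are all hypothetical choices made for illustration; they are not the architecture or feature groups used in the paper. The sketch only checks that a score computed through an intermediate embedding matches the score of a sum of two ordinary low-rank tensors that share the component X.

```python
import numpy as np

# Hypothetical setup: four atomic feature vectors a, b, c, d and one
# rank-r factor matrix per feature group (names and sizes are assumptions).
rng = np.random.default_rng(0)
rank = 4
dim_a, dim_b, dim_c, dim_d = 5, 6, 7, 3

U = rng.normal(size=(rank, dim_a))
V = rng.normal(size=(rank, dim_b))
W = rng.normal(size=(rank, dim_c))
X = rng.normal(size=(rank, dim_d))


def hierarchical_score(a, b, c, d):
    """Score via an intermediate embedding.

    Level 1: combine a and b by an element-wise product of their projections.
    Level 2: add the projection of c, then combine with d.
    Note that c is never multiplied with a or b, so those feature
    combinations are excluded by construction.
    """
    e = (U @ a) * (V @ b)      # intermediate embedding for (a, b)
    h = e + (W @ c)            # add a second, separate feature group
    return float(np.sum(h * (X @ d)))


def sum_of_tensors_score(a, b, c, d):
    """Same score written as a sum of two explicit low-rank tensors sharing X."""
    T_abd = np.einsum("ri,rj,rl->ijl", U, V, X)   # 3-way tensor over (a, b, d)
    T_cd = np.einsum("rk,rl->kl", W, X)           # 2-way tensor over (c, d)
    s1 = np.einsum("ijl,i,j,l->", T_abd, a, b, d)
    s2 = np.einsum("kl,k,l->", T_cd, c, d)
    return float(s1 + s2)


a = rng.normal(size=dim_a)
b = rng.normal(size=dim_b)
c = rng.normal(size=dim_c)
d = rng.normal(size=dim_d)

print(np.isclose(hierarchical_score(a, b, c, d),
                 sum_of_tensors_score(a, b, c, d)))
```

In this toy version the hierarchical form never materializes the full tensors; it only needs four matrix-vector products and element-wise operations, which is what makes such a model cheap to score and update inside an online learner.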
