The Inside-Outside Recursive Neural Network model for Dependency Parsing

We propose the first implementation of an infinite-order generative dependency model. The model is based on a new recursive neural network architecture, the Inside-Outside Recursive Neural Network. This architecture allows information to flow not only bottom-up, as in traditional recursive neural networks, but also top-down. This is achieved by computing content as well as context representations for every constituent and letting these representations interact. Experimental results on the English section of the Universal Dependency Treebank show that the infinite-order model achieves a perplexity seven times lower than the traditional counting-based third-order model, and tends to choose more accurate parses in k-best lists. In addition, reranking with this model achieves state-of-the-art unlabelled attachment scores and unlabelled exact match scores.
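
To make the content/context idea concrete, below is a minimal sketch of the inside-outside computation in Python. It is not the authors' exact formulation: the node structure, weight matrices (W_in, W_out) and the final interaction score are illustrative assumptions. Each tree node receives an "inside" (content) vector computed bottom-up from its word and children, and an "outside" (context) vector computed top-down from its parent; the two can then interact, e.g. to score a word given its context.

```python
# Hedged sketch of an inside-outside recursive network; names and weights are
# hypothetical, not the paper's exact parameterization.
import numpy as np

D = 8  # toy dimensionality

class Node:
    def __init__(self, word_vec, children=None):
        self.word_vec = word_vec          # embedding of the head word
        self.children = children or []
        self.inside = None                # content representation (bottom-up)
        self.outside = None               # context representation (top-down)

rng = np.random.default_rng(0)
W_in  = rng.normal(scale=0.1, size=(D, 2 * D))   # combines word + children's insides
W_out = rng.normal(scale=0.1, size=(D, 2 * D))   # combines parent's outside + parent's word
b_in  = np.zeros(D)
b_out = np.zeros(D)

def compute_inside(node):
    # Bottom-up pass: a node's content summarizes its word and its subtree.
    child_sum = np.zeros(D)
    for c in node.children:
        compute_inside(c)
        child_sum += c.inside
    node.inside = np.tanh(W_in @ np.concatenate([node.word_vec, child_sum]) + b_in)

def compute_outside(node, parent=None, root_context=None):
    # Top-down pass: a node's context summarizes everything outside its subtree.
    if parent is None:
        node.outside = root_context
    else:
        node.outside = np.tanh(
            W_out @ np.concatenate([parent.outside, parent.word_vec]) + b_out)
    for c in node.children:
        compute_outside(c, parent=node)

# Toy dependency tree: one head with two dependents.
leaves = [Node(rng.normal(size=D)) for _ in range(2)]
root = Node(rng.normal(size=D), children=leaves)
compute_inside(root)
compute_outside(root, root_context=np.zeros(D))

# Content and context representations interact, e.g. to score how well a
# dependent fits its context (a stand-in for the generative probability).
score = float(root.children[0].inside @ root.children[0].outside)
print("example content-context interaction score:", score)
```

In the full model, such context vectors let the probability of generating each dependent condition on the entire derivation history rather than a fixed-order horizontal/vertical context, which is what makes the model "infinite-order".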
