Towards Neural Machine Translation with Latent Tree Attention

Building models that take advantage of the hierarchical structure of language without a priori annotation is a longstanding goal in natural language processing. We introduce such a model for the task of machine translation, pairing a recurrent neural network grammar (RNNG) encoder with a novel attentional RNNG decoder and applying policy-gradient reinforcement learning to induce unsupervised tree structures on both the source and target. When trained on character-level datasets with no explicit segmentation or parse annotation, the model learns a plausible segmentation and shallow parse, obtaining performance close to an attentional baseline.
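
The second sentence of the abstract compresses the key idea: discrete parsing decisions are not differentiable, so the tree structures are treated as latent actions and trained with policy-gradient reinforcement learning (REINFORCE). The sketch below illustrates only that component, in PyTorch; the TransitionPolicy network, the SHIFT/REDUCE action set, and the reward signal are simplified stand-ins for exposition, not the paper's actual architecture.

import torch
from torch import nn
from torch.distributions import Categorical

# Hypothetical stand-in for the structural component: a policy that chooses
# SHIFT or REDUCE at each step, as an RNNG-style transition parser would.
# The paper's model conditions these decisions on a stack of constituents;
# here a plain LSTM cell stands in to keep the sketch short.
class TransitionPolicy(nn.Module):
    def __init__(self, hidden_size=64):
        super().__init__()
        self.hidden_size = hidden_size
        self.cell = nn.LSTMCell(hidden_size, hidden_size)
        self.scorer = nn.Linear(hidden_size, 2)  # 0 = SHIFT, 1 = REDUCE

    def forward(self, embeddings):
        h = torch.zeros(1, self.hidden_size)
        c = torch.zeros_like(h)
        log_probs, actions = [], []
        for x in embeddings:
            h, c = self.cell(x.unsqueeze(0), (h, c))
            dist = Categorical(logits=self.scorer(h))
            a = dist.sample()                  # sample a discrete tree decision
            log_probs.append(dist.log_prob(a))
            actions.append(a.item())
        return torch.cat(log_probs), actions

policy = TransitionPolicy()
chars = torch.randn(12, 64)  # placeholder character embeddings, 12 steps

log_probs, actions = policy(chars)

# REINFORCE update: the sampled tree is scored by a downstream reward.
# In the paper the reward derives from the translation objective; a random
# scalar stands in here so the sketch runs on its own.
reward = torch.randn(())
baseline = 0.0  # a learned baseline would reduce gradient variance
loss = -((reward - baseline) * log_probs).sum()
loss.backward()

Because the sampled actions enter the loss only through their log-probabilities weighted by the reward, gradients flow to the policy parameters even though the tree itself is discrete; this is the standard score-function estimator the abstract refers to as policy-gradient training.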
