论文信息 - Neural Machine Translation: A Review

Neural Machine Translation: A Review

The field of machine translation (MT), the automatic translation of written text from one natural language into another, has experienced a major paradigm shift in recent years. Statistical MT, which mainly relies on various count-based models and which used to dominate MT research for decades, has largely been superseded by neural machine translation (NMT), which tackles translation with a single neural network. In this work we will trace back the origins of modern NMT architectures to word and sentence embeddings and earlier examples of the encoder-decoder network family. We will conclude with a short survey of more recent trends in the field.

Felix Stahlberg | Felix Stahlberg

[1] Maosong Sun,et al. THUMT: An Open-Source Toolkit for Neural Machine Translation , 2020, AMTA.

[2] Jeremy M Wolfe,et al. Visual Attention , 2020, Computational Models for Cognitive Vision.

[3] 知秀柴田. 5分で分かる!? 有名論文ナナメ読み：Jacob Devlin et al. : BERT : Pre-training of Deep Bidirectional Transformers for Language Understanding , 2020 .

[4] Chris Dyer,et al. Better Document-Level Machine Translation with Bayes’ Rule , 2019, Transactions of the Association for Computational Linguistics.

[5] Bill Byrne,et al. On NMT Search Errors and Model Errors: Cat Got Your Tongue? , 2019, EMNLP.

[6] Roger Wattenhofer,et al. On the Validity of Self-Attention as Explanation in Transformer Models , 2019, ArXiv.

[7] Marta R. Costa-jussà,et al. Findings of the 2019 Conference on Machine Translation (WMT19) , 2019, WMT.

[8] Mohit Iyyer,et al. Syntactically Supervised Transformers for Faster Neural Machine Translation , 2019, ACL.

[9] Bill Byrne,et al. Domain Adaptive Inference for Neural Machine Translation , 2019, ACL.

[10] Noah A. Smith,et al. Is Attention Interpretable? , 2019, ACL.

[11] Huda Khayrallah,et al. Overcoming Catastrophic Forgetting During Domain Adaptation of Neural Machine Translation , 2019, NAACL.

[12] Yoshimasa Tsuruoka,et al. Incorporating Source-Side Phrase Structures into Neural Machine Translation , 2019, Computational Linguistics.

[13] Changhan Wang,et al. Levenshtein Transformer , 2019, NeurIPS.

[14] Chenhui Chu,et al. A Survey of Multilingual Neural Machine Translation , 2019, ACM Comput. Surv..

[15] Chenhui Chu,et al. A Brief Survey of Multilingual Neural Machine Translation. , 2019, 1905.05395.

[16] Ho-Gyeong Kim,et al. Knowledge Distillation Using Output Errors for Self-attention End-to-end Models , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[17] Alexander J. Smola,et al. Language Models with Transformers , 2019, ArXiv.

[18] Steve Renals,et al. Dynamic Evaluation of Transformer Language Models , 2019, ArXiv.

[19] Jiajun Zhang,et al. End-to-End Speech Translation with Knowledge Distillation , 2019, INTERSPEECH.

[20] Qun Liu,et al. An error analysis for image-based multi-modal neural machine translation , 2019, Machine Translation.

[21] Alexander M. Rush,et al. Unsupervised Recurrent Neural Network Grammars , 2019, NAACL.

[22] Grzegorz Chrupala,et al. Analyzing and interpreting neural networks for NLP: A report on the first BlackboxNLP workshop , 2019, Natural Language Engineering.

[23] Xing Wang,et al. Modeling Recurrence for Transformer , 2019, NAACL.

[24] Mirella Lapata,et al. Text Generation from Knowledge Graphs with Graph Transformers , 2019, NAACL.

[25] Naoaki Okazaki,et al. Positional Encoding to Control Output Sequence Length , 2019, NAACL.

[26] Xin Wang,et al. Extract and Edit: An Alternative to Back-Translation for Unsupervised Neural Machine Translation , 2019, NAACL.

[27] Myle Ott,et al. fairseq: A Fast, Extensible Toolkit for Sequence Modeling , 2019, NAACL.

[28] Sebastian Möller,et al. Train, Sort, Explain: Learning to Diagnose Translation Models , 2019, NAACL.

[29] Barnabás Póczos,et al. Competence-based Curriculum Learning for Neural Machine Translation , 2019, NAACL.

[30] Gholamreza Haffari,et al. Selective Attention for Context-aware Neural Machine Translation , 2019, NAACL.

[31] Hongfei Xu,et al. Neutron: An Implementation of the Transformer Translation Model and its Variants , 2019, ArXiv.

[32] Graham Neubig,et al. On Evaluation of Adversarial Perturbations for Sequence-to-Sequence Models , 2019, NAACL.

[33] M. Carl,et al. Post-editing neural machine translation versus phrase-based machine translation for English–Chinese , 2019, Machine Translation.

[34] Sunita Sarawagi,et al. Calibration of Encoder Decoder Models for Neural Machine Translation , 2019, ArXiv.

[35] Orhan Firat,et al. Massively Multilingual Neural Machine Translation , 2019, NAACL.

[36] George F. Foster,et al. Reinforcement Learning based Curriculum Optimization for Neural Machine Translation , 2019, NAACL.

[37] Byron C. Wallace,et al. Attention is not Explanation , 2019, NAACL.

[38] Graham Neubig,et al. Improving Robustness of Machine Translation with Synthetic Noise , 2019, NAACL.

[39] Zheng Zhang,et al. Star-Transformer , 2019, NAACL.

[40] Paul Buitelaar,et al. Augmenting Neural Machine Translation with Knowledge Graphs , 2019, ArXiv.

[41] Tie-Yan Liu,et al. Non-Autoregressive Machine Translation with Auxiliary Regularization , 2019, AAAI.

[42] Gabriel Synnaeve,et al. A Fully Differentiable Beam Search Decoder , 2019, ICML.

[43] Jakob Uszkoreit,et al. Insertion Transformer: Flexible Sequence Generation via Insertion Operations , 2019, ICML.

[44] Kyunghyun Cho,et al. Non-Monotonic Sequential Text Generation , 2019, ICML.

[45] Omer Levy,et al. Training on Synthetic Noise Improves Robustness to Natural Noise in Machine Translation , 2019, EMNLP.

[46] Qi Liu,et al. Insertion-based Decoding with Automatically Inferred Generation Order , 2019, Transactions of the Association for Computational Linguistics.

[47] John DeNero,et al. Adding Interpretable Attention to Neural Translation Models Improves Word Alignment , 2019, ArXiv.

[48] Dan Liu,et al. Learning Efficient Lexically-Constrained Neural Machine Translation with External Memory , 2019, ArXiv.

[49] Quoc V. Le,et al. The Evolved Transformer , 2019, ICML.

[50] Douwe Kiela,et al. No Training Required: Exploring Random Encoders for Sentence Classification , 2019, ICLR.

[51] Felix Wu,et al. Pay Less Attention with Lightweight and Dynamic Convolutions , 2019, ICLR.

[52] Andrei Popescu-Belis,et al. Context in Neural Machine Translation: A Review of Models and Evaluations , 2019, ArXiv.

[53] Valentin Malykh,et al. Self-Attentive Model for Headline Generation , 2019, ECIR.

[54] Quan Z. Sheng,et al. Generating Textual Adversarial Examples for Deep Learning Models: A Survey , 2019, ArXiv.

[55] Khalil Sima'an,et al. Modeling Latent Sentence Structure in Neural Machine Translation , 2019, ArXiv.

[56] Yoav Goldberg,et al. Assessing BERT's Syntactic Abilities , 2019, ArXiv.

[57] Ming Zhou,et al. Unsupervised Neural Machine Translation with SMT as Posterior Regularization , 2019, AAAI.

[58] Yiming Yang,et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context , 2019, ACL.

[59] Qiang Zhang,et al. Variational Self-attention Model for Sentence Representation , 2018, ArXiv.

[60] Di He,et al. Non-Autoregressive Neural Machine Translation with Enhanced Decoder Input , 2018, AAAI.

[61] Yonatan Belinkov,et al. NeuroX: A Toolkit for Analyzing Individual Neurons in Neural Networks , 2018, AAAI.

[62] Yonatan Belinkov,et al. What Is One Grain of Sand in the Desert? Analyzing Individual Neurons in Deep NLP Models , 2018, AAAI.

[63] Francisco Casacuberta,et al. How Much Does Tokenization Affect Neural Machine Translation? , 2018, CICLing.

[64] Dario Amodei,et al. An Empirical Model of Large-Batch Training , 2018, ArXiv.

[65] Dipankar Das,et al. SMT vs NMT: A Comparison over Hindi and Bengali Simple Sentences , 2018, ICON.

[66] Chao-Hong Liu,et al. The RGNLP Machine Translation Systems for WAT 2018 , 2018, PACLIC.

[67] Min Zhang,et al. Incorporating Statistical Machine Translation Word Knowledge Into Neural Machine Translation , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[68] Shuming Shi,et al. Neural Machine Translation with Adequacy-Oriented Learning , 2018, AAAI.

[69] Hua Wu,et al. Modeling Coherence for Discourse Neural Machine Translation , 2018, AAAI.

[70] Jindrich Libovický,et al. End-to-End Non-Autoregressive Neural Machine Translation with Connectionist Temporal Classification , 2018, EMNLP.

[71] Joelle Pineau,et al. Language GANs Falling Short , 2018, ICLR.

[72] Chong Wang,et al. Neural Phrase-to-Phrase Machine Translation , 2018, ArXiv.

[73] Pushpak Bhattacharyya,et al. Addressing word-order Divergence in Multilingual Neural Machine Translation for extremely Low Resource Languages , 2018, NAACL.

[74] Minh Le Nguyen,et al. Regularizing Forward and Backward Decoding to Improve Neural Machine Translation , 2018, 2018 10th International Conference on Knowledge and Systems Engineering (KSE).

[75] Dylan Cashman,et al. RNNbow: Visualizing Learning Via Backpropagation Gradients in RNNs , 2018, IEEE Computer Graphics and Applications.

[76] Kaitao Song,et al. Hybrid Self-Attention Network for Machine Translation , 2018, ArXiv.

[77] Deyi Xiong,et al. Two Effective Approaches to Data Reduction for Neural Machine Translation: Static and Dynamic Sentence Selection , 2018, 2018 International Conference on Asian Language Processing (IALP).

[78] Jakob Uszkoreit,et al. Blockwise Parallel Decoding for Deep Autoregressive Models , 2018, NeurIPS.

[79] François Yvon,et al. Using Monolingual Data in Neural Machine Translation: a Systematic Study , 2018, WMT.

[80] Desmond Elliott,et al. Findings of the Third Shared Task on Multimodal Machine Translation , 2018, WMT.

[81] Xiaocheng Feng,et al. Adaptive Multi-pass Decoder for Neural Machine Translation , 2018, EMNLP.

[82] Jugal K. Kalita,et al. Parallel Attention Mechanisms in Neural Machine Translation , 2018, 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA).

[83] Leonid Sigal,et al. Middle-Out Decoding , 2018, NeurIPS.

[84] Elizabeth Salesky,et al. Optimizing segmentation granularity for neural machine translation , 2018, Machine Translation.

[85] Xing Li,et al. STACL: Simultaneous Translation with Integrated Anticipation and Controllable Latency , 2018, ArXiv.

[86] Joakim Nivre,et al. An Analysis of Attention Mechanisms: The Case of Word Sense Disambiguation in Neural Machine Translation , 2018, WMT.

[87] Satoshi Nakamura,et al. Multi-Source Neural Machine Translation with Data Augmentation , 2018, IWSLT.

[88] Marta R. Costa-jussà,et al. (Self-Attentive) Autoencoder-based Universal Language Representation for Machine Translation , 2018, ArXiv.

[89] Artem Sokolov,et al. Optimally Segmenting Inputs for NMT Shows Preference for Character-Level Processing , 2018 .

[90] Huanbo Luan,et al. Improving the Transformer Translation Model with Document-Level Context , 2018, EMNLP.

[91] Lemao Liu,et al. Sentence Selection and Weighting for Neural Machine Translation Domain Adaptation , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[92] Lucia Specia,et al. Findings of the WMT 2018 Shared Task on Quality Estimation , 2018, WMT.

[93] Li Gong,et al. Tencent Neural Machine Translation Systems for WMT18 , 2018, WMT.

[94] Hermann Ney,et al. Towards Two-Dimensional Sequence to Sequence Model in Neural Machine Translation , 2018, EMNLP.

[95] Enhong Chen,et al. Bidirectional Generative Adversarial Networks for Neural Machine Translation , 2018, CoNLL.

[96] Gholamreza Haffari,et al. Sequence to Sequence Mixture Model for Diverse Machine Translation , 2018, CoNLL.

[97] Aaron C. Courville,et al. Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks , 2018, ICLR.

[98] Yonatan Belinkov,et al. Identifying and Controlling Important Neurons in Neural Machine Translation , 2018, ICLR.

[99] Preslav Nakov,et al. What Is in a Translation Unit? Comparing Character and Subword Representations Beyond Translation , 2018 .

[100] Chris Dyer,et al. Sentence Encoding with Tree-constrained Relation Networks , 2018, ArXiv.

[101] Alex Wang,et al. Looking for ELMo's friends: Sentence-Level Pretraining Beyond Language Modeling , 2018, ArXiv.

[102] Huda Khayrallah,et al. Freezing Subnetworks to Analyze Domain Adaptation in Neural Machine Translation , 2018, WMT.

[103] Hermann Ney,et al. On The Alignment Problem In Multi-Head Attention-Based Neural Machine Translation , 2018, WMT.

[104] Ahmed Rafea,et al. Enhancing Translation from English to Arabic Using Two-Phase Decoder Translation , 2018, IntelliSys.

[105] James Henderson,et al. Document-Level Neural Machine Translation with Hierarchical Attention Networks , 2018, EMNLP.

[106] Veselin Stoyanov,et al. Simple Fusion: Return of the Language Model , 2018, WMT.

[107] Graham Neubig,et al. MTNT: A Testbed for Machine Translation of Noisy Text , 2018, EMNLP.

[108] Marcin Junczys-Dowmunt,et al. Dual Conditional Cross-Entropy Filtering of Noisy Parallel Corpora , 2018, WMT.

[109] Marcin Junczys-Dowmunt,et al. Microsoft’s Submission to the WMT2018 News Translation Task: How I Learned to Stop Worrying and Love the Data , 2018, WMT.

[110] Taro Watanabe,et al. Denoising Neural Machine Translation Training with Trusted Data and Online Data Selection , 2018, WMT.

[111] David Chiang,et al. Correcting Length Bias in Neural Machine Translation , 2018, WMT.

[112] Ankur Bapna,et al. Revisiting Character-Based Neural Machine Translation with Capacity and Compression , 2018, EMNLP.

[113] Mingbo Ma,et al. Breaking the Beam Search Curse: A Study of (Re-)Scoring Methods and Stopping Criteria for Neural Machine Translation , 2018, EMNLP.

[114] Adrià de Gispert,et al. The University of Cambridge’s Machine Translation Systems for WMT18 , 2018, WMT.

[115] Lijun Wu,et al. A Study of Reinforcement Learning for Neural Machine Translation , 2018, EMNLP.

[116] Rico Sennrich,et al. Why Self-Attention? A Targeted Evaluation of Neural Machine Translation Architectures , 2018, EMNLP.

[117] Jiajun Zhang,et al. A Comparable Study on Model Averaging, Ensembling and Reranking in NMT , 2018, NLPCC.

[118] Ji Zhang,et al. Semi-Autoregressive Neural Machine Translation , 2018, EMNLP.

[119] Hai Zhao,et al. Exploring Recombination for Efficient Decoding of Neural Machine Translation , 2018, EMNLP.

[120] Rico Sennrich,et al. Has Machine Translation Achieved Human Parity? A Case for Document-level Evaluation , 2018, EMNLP.

[121] Taku Kudo,et al. SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing , 2018, EMNLP.

[122] Kevin Knight,et al. Augmenting Statistical Machine Translation with Subword Translation of Out-of-Vocabulary Words , 2018, ArXiv.

[123] M. Zhou,et al. Regularizing Neural Machine Translation by Target-bidirectional Agreement , 2018, AAAI.

[124] Graham Neubig,et al. Rapid Adaptation of Neural Machine Translation to New Languages , 2018, EMNLP.

[125] Matiss Rikters,et al. Debugging Neural Machine Translations , 2018, Doctoral Consortium/Forum@DB&IS.

[126] Graham Neubig,et al. SwitchOut: an Efficient Data Augmentation Algorithm for Neural Machine Translation , 2018, EMNLP.

[127] Bill Byrne,et al. An Operation Sequence Model for Explainable Neural Machine Translation , 2018, BlackboxNLP@EMNLP.

[128] Graham Neubig,et al. A Tree-based Decoder for Neural Machine Translation , 2018, EMNLP.

[129] Xu Sun,et al. Learning When to Concentrate or Divert Attention: Self-Adaptive Attention Temperature for Neural Machine Translation , 2018, EMNLP.

[130] Kenneth Heafield,et al. Multi-Source Syntactic Neural Machine Translation , 2018, EMNLP.

[131] Hai Zhao,et al. Finding Better Subword Segmentation for Neural Machine Translation , 2018, CCL.

[132] Jing Yang,et al. Otem&Utem: Over- and Under-Translation Evaluation Metric for NMT , 2018, NLPCC.

[133] Francisco Casacuberta,et al. NMT-Keras: a Very Flexible Toolkit with a Focus on Interactive NMT and Online Learning , 2018, Prague Bull. Math. Linguistics.

[134] Atsushi Fujita,et al. Enhancement of Encoder and Attention Using Target Monolingual Corpora in Neural Machine Translation , 2018, NMT@ACL.

[135] Tobias Domhan,et al. How Much Attention Do You Need? A Granular Analysis of Neural Machine Translation Architectures , 2018, ACL.

[136] Tiejun Zhao,et al. Forest-Based Neural Machine Translation , 2018, ACL.

[137] Huda Khayrallah,et al. Regularized Training Objective for Continued Training for Domain Adaptation in Neural Machine Translation , 2018, NMT@ACL.

[138] Gholamreza Haffari,et al. Iterative Back-Translation for Neural Machine Translation , 2018, NMT@ACL.

[139] Jingbo Zhu,et al. A Simple and Effective Approach to Coverage-Aware Neural Machine Translation , 2018, ACL.

[140] Mauro Cettolo,et al. A Comparison of Transformer and Recurrent Neural Networks on Multilingual Neural Machine Translation , 2018, COLING.

[141] Ondrej Bojar,et al. Morphological and Language-Agnostic Word Segmentation for NMT , 2018, TSD.

[142] David Barber,et al. Generative Neural Machine Translation , 2018, NeurIPS.

[143] Yong Cheng,et al. Neural Machine Translation with Key-Value Memory-Augmented Attention , 2018, IJCAI.

[144] David Vilar,et al. Learning Hidden Unit Contribution for Adapting Neural Machine Translation Models , 2018, NAACL.

[145] Shuming Ma,et al. Deconvolution-Based Global Decoding for Neural Machine Translation , 2018, COLING.

[146] Rui Wang,et al. A Survey of Domain Adaptation for Neural Machine Translation , 2018, COLING.

[147] Myle Ott,et al. Scaling Neural Machine Translation , 2018, WMT.

[148] A. Hussain,et al. Text Normalization using Memory Augmented Neural Networks , 2018, Speech Commun..

[149] Rico Sennrich,et al. Context-Aware Neural Machine Translation Learns Anaphora Resolution , 2018, ACL.

[150] Boris Ginsburg,et al. OpenSeq2Seq: Extensible Toolkit for Distributed and Mixed Precision Training of Sequence-to-Sequence Models , 2018, ArXiv.

[151] Naren Ramakrishnan,et al. Deep Reinforcement Learning for Sequence-to-Sequence Models , 2018, IEEE Transactions on Neural Networks and Learning Systems.

[152] Kenneth Heafield,et al. Fast Neural Machine Translation Implementation , 2018, NMT@ACL.

[153] André F. T. Martins,et al. Sparse and Constrained Attention for Neural Machine Translation , 2018, ACL.

[154] Wenhu Chen,et al. Triangular Architecture for Rare Language Translation , 2018, ACL.

[155] Massimo Piccardi,et al. English-Basque Statistical and Neural Machine Translation , 2018, LREC.

[156] Marcello Federico,et al. Deep Neural Machine Translation with Weakly-Recurrent Units , 2018, EAMT.

[157] Mark Fishel,et al. Multi-Domain Neural Machine Translation , 2018, EAMT.

[158] Guillaume Lample,et al. What you can cram into a single $&!#* vector: Probing sentence embeddings for linguistic properties , 2018, ACL.

[159] Huda Khayrallah,et al. On the Impact of Various Types of Noise on Neural Machine Translation , 2018, NMT@ACL.

[160] Yang Liu,et al. Towards Robust Neural Machine Translation , 2018, ACL.

[161] Marine Carpuat,et al. Bi-Directional Neural Machine Translation with Synthetic Parallel Data , 2018, NMT@ACL.

[162] Arianna Bisazza,et al. Neural versus phrase-based MT quality: An in-depth analysis on English-German and English-French , 2018, Comput. Speech Lang..

[163] Adrià de Gispert,et al. Multi-representation ensembles and delayed SGD updates improve syntax-based NMT , 2018, ACL.

[164] Gonzalo Iglesias,et al. Neural Machine Translation Decoding with Terminology Constraints , 2018, NAACL.

[165] Taku Kudo,et al. Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates , 2018, ACL.

[166] Ankur Bapna,et al. The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation , 2018, ACL.

[167] Diego Marcheggiani,et al. Exploiting Semantics in Neural Machine Translation with Graph Convolutional Networks , 2018, NAACL.

[168] Yichao Lu,et al. A neural interlingua for multilingual machine translation , 2018, WMT.

[169] Yun Chen,et al. A Stable and Effective Learning Strategy for Trainable Greedy Decoding , 2018, EMNLP.

[170] Guillaume Lample,et al. Phrase-Based & Neural Unsupervised Machine Translation , 2018, EMNLP.

[171] Shi Feng,et al. Pathologies of Neural Models Make Interpretations Difficult , 2018, EMNLP.

[172] Andy Way,et al. Investigating Backtranslation in Neural Machine Translation , 2018, EAMT.

[173] Marcin Junczys-Dowmunt,et al. Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation , 2018, NAACL.

[174] Miguel Ballesteros,et al. Pieces of Eight: 8-bit Neural Machine Translation , 2018, NAACL.

[175] Yidong Chen,et al. Lattice-to-sequence attentional Neural Machine Translation models , 2018, Neurocomputing.

[176] Ondrej Bojar,et al. Training Tips for the Transformer Model , 2018, Prague Bull. Math. Linguistics.

[177] Hong Yu,et al. Sentence Simplification with Memory-Augmented Neural Networks , 2018, NAACL.

[178] Gonzalo Iglesias,et al. Accelerating NMT Batched Beam Decoding with LMBR Posteriors for Deployment , 2018, NAACL.

[179] Yoshua Bengio,et al. Fine-grained attention mechanism for neural machine translation , 2018, Neurocomputing.

[180] Gonzalo Iglesias,et al. Why not be Versatile? Applications of the SGNMT Decoder for Machine Translation , 2018, AMTA.

[181] Samy Bengio,et al. Tensor2Tensor for Neural Machine Translation , 2018, AMTA.

[182] F. Seide,et al. Achieving Human Parity on Automatic Chinese to English News Translation , 2018, ArXiv.

[183] Christof Monz,et al. The Importance of Being Recurrent for Modeling Hierarchical Structure , 2018, EMNLP.

[184] Aurko Roy,et al. Fast Decoding in Sequence Models using Discrete Latent Variables , 2018, ICML.

[185] Ashish Vaswani,et al. Self-Attention with Relative Position Representations , 2018, NAACL.

[186] Andy Way,et al. SMT versus NMT: Preliminary comparisons for Irish , 2018, LoResMT@AMTA.

[187] Ivan Skorokhodov,et al. Semi-Supervised Neural Machine Translation with Language Models , 2018, LoResMT@AMTA.

[188] Atsushi Fujita,et al. A Smorgasbord of Features to Combine Phrase-Based and Neural Machine Translation , 2018, AMTA.

[189] Enhong Chen,et al. Joint Training for Neural Machine Translation Models with Monolingual Data , 2018, AAAI.

[190] Matthias Sperber,et al. XNMT: The eXtensible Neural Machine Translation Toolkit , 2018, AMTA.

[191] Marc'Aurelio Ranzato,et al. Analyzing Uncertainty in Neural Machine Translation , 2018, ICML.

[192] Jason Lee,et al. Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement , 2018, EMNLP.

[193] Luke S. Zettlemoyer,et al. Deep Contextualized Word Representations , 2018, NAACL.

[194] Li Zhao,et al. Dual Transfer Learning for Neural Machine Translation with Marginal Distribution Regularization , 2018, AAAI.

[195] Xiaolin Wang,et al. CytonMT: an Efficient Neural Machine Translation Open-source Toolkit Implemented in C++ , 2018, EMNLP.

[196] Chengqi Zhang,et al. Reinforced Self-Attention Network: a Hybrid of Hard and Soft Attention for Sequence Modeling , 2018, IJCAI.

[197] Lior Wolf,et al. Non-Adversarial Unsupervised Word Translation , 2018, EMNLP.

[198] Rongrong Ji,et al. Asynchronous Bidirectional Decoding for Neural Machine Translation , 2018, AAAI.

[199] Shan Wu,et al. Variational Recurrent Neural Machine Translation , 2018, AAAI.

[200] Michael J. Denkowski,et al. Sockeye: A Toolkit for Neural Machine Translation , 2017, ArXiv.

[201] Vlad Zhukov,et al. Differentiable lower bound for expected BLEU score , 2017, ArXiv.

[202] Hua Wu,et al. Multi-channel Encoder for Neural Machine Translation , 2017, AAAI.

[203] Sungzoon Cho,et al. Distance-based Self-Attention Network for Natural Language Inference , 2017, ArXiv.

[204] Fethi Bougares,et al. Neural Machine Translation by Generating Multiple Linguistic Factors , 2017, SLSP.

[205] Kamel Smaïli,et al. Is statistical machine translation approach dead , 2017 .

[206] Guodong Zhou,et al. Cache-based Document-level Neural Machine Translation , 2017, ArXiv.

[207] Yang Liu,et al. Learning to Remember Translation History with a Continuous Cache , 2017, TACL.

[208] Ole Winther,et al. Recurrent Relational Networks , 2017, NeurIPS.

[209] Jan Niehues,et al. Effective Strategies in Zero-Shot Neural Machine Translation , 2017, IWSLT.

[210] Gholamreza Haffari,et al. Incorporating Syntactic Uncertainty in Neural Machine Translation with a Forest-to-Sequence Model , 2017, COLING.

[211] C. Lee Giles,et al. The Neural Network Pushdown Automaton: Model, Stack and Learning Simulations , 2017, ArXiv.

[212] Angela Fan,et al. Controllable Abstractive Summarization , 2017, NMT@ACL.

[213] Marc'Aurelio Ranzato,et al. Classical Structured Prediction Losses for Sequence to Sequence Learning , 2017, NAACL.

[214] Jonathan G. Fiscus,et al. Overview of the NIST 2016 LoReHLT evaluation , 2017, Machine Translation.

[215] Gholamreza Haffari,et al. Document Context Neural Machine Translation with Memory Networks , 2017, ACL.

[216] Victor O. K. Li,et al. Non-Autoregressive Neural Machine Translation , 2017, ICLR.

[217] Yonatan Belinkov,et al. Synthetic and Natural Noise Both Break Neural Machine Translation , 2017, ICLR.

[218] Di He,et al. Decoding with Value Networks for Neural Machine Translation , 2017, NIPS.

[219] Richard Socher,et al. Weighted Transformer Network for Machine Translation , 2017, ArXiv.

[220] Rico Sennrich,et al. Evaluating Discourse Phenomena in Neural Machine Translation , 2017, NAACL.

[221] M. Utiyama,et al. Syntax-Directed Attention for Neural Machine Translation , 2017, AAAI.

[222] Quoc V. Le,et al. Don't Decay the Learning Rate, Increase the Batch Size , 2017, ICLR.

[223] Takenobu Tokunaga,et al. Key-value Attention Mechanism for Neural Machine Translation , 2017, IJCNLP.

[224] Satoshi Nakamura,et al. Improving Neural Machine Translation through Phrase-based Forced Decoding , 2017, IJCNLP.

[225] Raj Dabre,et al. Neural Machine Translation: Basics, Practical Aspects and Recent Trends , 2017, IJCNLP.

[226] Tetsuji Nakagawa,et al. An Empirical Study of Language Relatedness for Transfer Learning in Neural Machine Translation , 2017, PACLIC.

[227] Huda Khayrallah,et al. Neural Lattice Search for Domain Adaptation in Machine Translation , 2017, IJCNLP.

[228] Guillaume Lample,et al. Unsupervised Machine Translation Using Monolingual Corpora Only , 2017, ICLR.

[229] Eneko Agirre,et al. Unsupervised Neural Machine Translation , 2017, ICLR.

[230] Ondrej Bojar,et al. Paying Attention to Multi-Word Expressions in Neural Machine Translation , 2017, MTSUMMIT.

[231] Guillaume Lample,et al. Word Translation Without Parallel Data , 2017, ICLR.

[232] Mark Fishel,et al. Confidence through Attention , 2017, MTSummit.

[233] Suyog Gupta,et al. To prune, or not to prune: exploring the efficacy of pruning for model compression , 2017, ICLR.

[234] Thomas Fang Zheng,et al. Enhanced neural machine translation by learning from draft , 2017, 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC).

[235] David Chiang,et al. Improving Lexical Choice in Neural Machine Translation , 2017, NAACL.

[236] Desmond Elliott,et al. Findings of the Second Shared Task on Multimodal Machine Translation and Multilingual Image Description , 2017, WMT.

[237] Christof Monz,et al. What does Attention in Neural Machine Translation Pay Attention to? , 2017, IJCNLP.

[238] Yachao Li,et al. Neural Machine Translation with Phrasal Attention , 2017, CWMT.

[239] Jingbo Zhu,et al. Handling Many-To-One UNK Translation for Neural Machine Translation , 2017, CWMT.

[240] Yufeng Chen,et al. An Unknown Word Processing Method in NMT by Integrating Syntactic Structure and Semantic Concept , 2017, CWMT.

[241] Yufeng Chen,et al. A Method of Unknown Words Processing for Neural Machine Translation Using HowNet , 2017, CWMT.

[242] Abraham Ittycheriah,et al. Neural Machine Translation , 2017, International Journal for Research in Applied Science and Engineering Technology.

[243] Pavel Levin,et al. Toward a full-scale neural machine translation in production: the Booking.com use case , 2017, MTSUMMIT.

[244] Tao Shen,et al. DiSAN: Directional Self-Attention Network for RNN/CNN-free Language Understanding , 2017, AAAI.

[245] Marta R. Costa-jussà,et al. Coverage for Character Based Neural Machine Translation , 2017, Proces. del Leng. Natural.

[246] Samuel R. Bowman,et al. Do latent tree learning models identify meaningful structure in sentences? , 2017, TACL.

[247] Andrei Popescu-Belis,et al. Self-Attentive Residual Decoder for Neural Machine Translation , 2017, NAACL.

[248] Young-Kil Kim,et al. Improving a Multi-Source Neural Machine Translation Model with Corpus Extension for Low-Resource Languages , 2017, LREC.

[249] Lemao Liu,et al. Instance Weighting for Neural Machine Translation Domain Adaptation , 2017, EMNLP.

[250] Lemao Liu,et al. Neural Machine Translation with Source Dependency Representation , 2017, EMNLP.

[251] Quoc V. Le,et al. Effective Domain Mixing for Neural Machine Translation , 2017, WMT.

[252] Andy Way,et al. Neural Pre-Translation for Hybrid Machine Translation , 2017, MTSUMMIT.

[253] Richard Socher,et al. Towards Neural Machine Translation with Latent Tree Attention , 2017, SPNLP@EMNLP.

[254] Yoshua Bengio,et al. On integrating a language model into neural machine translation , 2017, Comput. Speech Lang..

[255] Andy Way,et al. A Comparative Quality Evaluation of PBSMT and NMT using Professional Translators , 2017, MTSUMMIT.

[256] Philipp Koehn,et al. Findings of the 2017 Conference on Machine Translation (WMT17) , 2017, WMT.

[257] Mingbo Ma,et al. When to Finish? Optimal Beam Search for Neural Text Generation (modulo beam size) , 2017, EMNLP.

[258] Felix Hieber,et al. Using Target-side Monolingual Data for Neural Machine Translation through Multi-task Learning , 2017, EMNLP.

[259] Marta R. Costa-jussà,et al. Byte-based Neural Machine Translation , 2017, SWCN@EMNLP.

[260] Yoshua Bengio,et al. Multi-way, multilingual neural machine translation , 2017, Comput. Speech Lang..

[261] David Chiang,et al. Transfer Learning across Low-Resource, Related Languages for Neural Machine Translation , 2017, IJCNLP.

[262] Yonatan Belinkov,et al. Neural Machine Translation Training in a Multi-Domain Scenario , 2017, IWSLT.

[263] Marcis Pinnis,et al. Neural Machine Translation for Morphologically Rich Languages with Improved Sub-word Units and Synthetic Data , 2017, TSD.

[264] Marcello Federico,et al. Assessing the Tolerance of Neural Machine Translation Systems Against Speech Recognition Errors , 2017, INTERSPEECH.

[265] Jörg Tiedemann,et al. Neural machine translation for low-resource languages , 2017, ArXiv.

[266] Antonio Valerio Miceli Barone,et al. The University of Edinburgh’s Neural MT Systems for WMT17 , 2017, WMT.

[267] Graham Neubig,et al. A Continuous Relaxation of Beam Search for End-to-end Training of Neural Sequence Models , 2017, AAAI.

[268] George F. Foster,et al. Cost Weighting for Neural Machine Translation Domain Adaptation , 2017, NMT@ACL.

[269] Yang Feng,et al. Memory-augmented Neural Machine Translation , 2017, EMNLP.

[270] Jan Niehues,et al. Analyzing Neural MT Search and Model Performance , 2017, NMT@ACL.

[271] Jörg Tiedemann,et al. Neural Machine Translation with Extended Context , 2017, DiscoMT@EMNLP.

[272] Richard Socher,et al. Learned in Translation: Contextualized Word Vectors , 2017, NIPS.

[273] Shahram Khadivi,et al. Neural Machine Translation Leveraging Phrase-based Models in a Hybrid Search , 2017, EMNLP.

[274] Marine Carpuat,et al. Detecting Cross-Lingual Semantic Divergence for Neural Machine Translation , 2017, NMT@ACL.

[275] Christof Monz,et al. Dynamic Data Selection for Neural Machine Translation , 2017, EMNLP.

[276] Rico Sennrich,et al. Regularization techniques for fine-tuning in neural machine translation , 2017, EMNLP.

[277] Bill Byrne,et al. SGNMT – A Flexible NMT Decoding Platform for Quick Prototyping of New Models and Search Strategies , 2017, EMNLP.

[278] Jingbo Zhu,et al. Towards Bidirectional Hierarchical Representations for Attention-based Neural Machine Translation , 2017, EMNLP.

[279] Tommi S. Jaakkola,et al. A causal framework for explaining the predictions of black-box sequence-to-sequence models , 2017, EMNLP.

[280] Nenghai Yu,et al. Dual Supervised Learning , 2017, ICML.

[281] Yang Liu,et al. Visualizing and Understanding Neural Machine Translation , 2017, ACL.

[282] Eneko Agirre,et al. Learning bilingual word embeddings with (almost) no bilingual data , 2017, ACL.

[283] Ming Zhou,et al. Sequence-to-Dependency Neural Machine Translation , 2017, ACL.

[284] Masao Utiyama,et al. Sentence Embedding for Neural Machine Translation Domain Adaptation , 2017, ACL.

[285] Jingtao Yao,et al. Chunk-based Decoder for Neural Machine Translation , 2017, ACL.

[286] Shujian Huang,et al. Improved Neural Machine Translation with a Syntax-Aware Encoder and Decoder , 2017, ACL.

[287] Alexander M. Fraser,et al. Modeling Target-Side Inflection in Neural Machine Translation , 2017, WMT.

[288] Wojciech Samek,et al. Methods for interpreting and understanding deep neural networks , 2017, Digit. Signal Process..

[289] Satoshi Nakamura,et al. An Empirical Study of Mini-Batch Creation Strategies for Neural Machine Translation , 2017, NMT@ACL.

[290] Chong Wang,et al. Towards Neural Phrase-based Machine Translation , 2017, ICLR.

[291] Yoshua Bengio,et al. Plan, Attend, Generate: Character-Level Neural Machine Translation with Planning , 2017, Rep4NLP@ACL.

[292] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.

[293] Markus Freitag,et al. Attention-based Vocabulary Selection for NMT Decoding , 2017, ArXiv.

[294] Lukasz Kaiser,et al. Depthwise Separable Convolutions for Neural Machine Translation , 2017, ICLR.

[295] Razvan Pascanu,et al. A simple neural network module for relational reasoning , 2017, NIPS.

[296] Philipp Koehn,et al. Six Challenges for Neural Machine Translation , 2017, NMT@ACL.

[297] Marcello Federico,et al. Linguistically Motivated Vocabulary Reduction for Neural Machine Translation from Turkish to English , 2017, Prague Bull. Math. Linguistics.

[298] Andy Way,et al. Is Neural Machine Translation the New State of the Art? , 2017, Prague Bull. Math. Linguistics.

[299] Stephen Clark,et al. Jointly learning sentence embeddings and syntax with unsupervised Tree-LSTMs , 2017, Natural Language Engineering.

[300] Colin Raffel,et al. Learning Hard Alignments with Variational Inference , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[301] Richard Socher,et al. A Deep Reinforced Model for Abstractive Summarization , 2017, ICLR.

[302] Ming Zhou,et al. Reinforced Mnemonic Reader for Machine Reading Comprehension , 2017, IJCAI.

[303] Yann Dauphin,et al. Convolutional Sequence to Sequence Learning , 2017, ICML.

[304] Holger Schwenk,et al. Supervised Learning of Universal Sentence Representations from Natural Language Inference Data , 2017, EMNLP.

[305] Jacob Devlin,et al. Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU , 2017, EMNLP.

[306] Deyi Xiong,et al. A GRU-Gated Attention Model for Neural Machine Translation , 2017, ArXiv.

[307] Pierre Isabelle,et al. A Challenge Set Approach to Evaluating Machine Translation , 2017, EMNLP.

[308] Orhan Firat,et al. Does Neural Machine Translation Benefit from Larger Context? , 2017, ArXiv.

[309] Bo Chen,et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.

[310] Yoav Goldberg,et al. Towards String-To-Tree Neural Machine Translation , 2017, ACL.

[311] Khalil Sima'an,et al. Graph Convolutional Encoders for Syntax-aware Neural Machine Translation , 2017, EMNLP.

[312] Andy Way,et al. Exploiting Cross-Sentence Context for Neural Machine Translation , 2017, EMNLP.

[313] Bill Byrne,et al. Unfolding and Shrinking Neural Machine Translation Ensembles , 2017, EMNLP.

[314] James R. Glass,et al. What do Neural Machine Translation Models Learn about Morphology? , 2017, ACL.

[315] Colin Raffel,et al. Online and Linear-Time Attention by Enforcing Monotonic Alignments , 2017, ICML.

[316] Tie-Yan Liu,et al. Adversarial Neural Machine Translation , 2017, ACML.

[317] Jiajun Zhang,et al. Neural System Combination for Machine Translation , 2017, ACL.

[318] Jindrich Libovický,et al. Neural Monkey: An Open-source Tool for Sequence Learning , 2017, Prague Bull. Math. Linguistics.

[319] Chandra Bhagavatula,et al. Semi-supervised sequence tagging with bidirectional language models , 2017, ACL.

[320] Matthias Sperber,et al. Neural Lattice-to-Sequence Models for Uncertain Inputs , 2017, EMNLP.

[321] Wei Chen,et al. Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets , 2017, NAACL.

[322] Rico Sennrich,et al. Nematus: a Toolkit for Neural Machine Translation , 2017, EACL.

[323] Bowen Zhou,et al. A Structured Self-attentive Sentence Embedding , 2017, ICLR.

[324] Graham Neubig,et al. Neural Machine Translation and Sequence-to-sequence Models: A Tutorial , 2017, ArXiv.

[325] Been Kim,et al. Towards A Rigorous Science of Interpretable Machine Learning , 2017, 1702.08608.

[326] Yoshimasa Tsuruoka,et al. Learning to Parse and Translate Improves Neural Machine Translation , 2017, ACL.

[327] Lemao Liu,et al. Deterministic Attention for Sequence-to-Sequence Constituent Parsing , 2017, AAAI.

[328] Victor O. K. Li,et al. Trainable Greedy Decoding for Neural Machine Translation , 2017, EMNLP.

[329] Markus Freitag,et al. Ensemble Distillation for Neural Machine Translation , 2017, ArXiv.

[330] Markus Freitag,et al. Beam Search Strategies for Neural Machine Translation , 2017, NMT@ACL.

[331] Rico Sennrich,et al. Predicting Target Language CCG Supertags Improves Neural Machine Translation , 2017, WMT.

[332] Anne Marie Macari. train , 2017, The Fairchild Books Dictionary of Fashion.

[333] Alan Ritter,et al. Adversarial Learning for Neural Dialogue Generation , 2017, EMNLP.

[334] Geoffrey E. Hinton,et al. Regularizing Neural Networks by Penalizing Confident Output Distributions , 2017, ICLR.

[335] Nadir Durrani,et al. QCRI Machine Translation Systems for IWSLT 16 , 2017, ArXiv.

[336] Antonio Toral,et al. A Multifaceted Evaluation of Neural versus Phrase-Based Machine Translation for 9 Language Directions , 2017, EACL.

[337] Gholamreza Haffari,et al. Towards Decoding as Continuous Optimisation in Neural Machine Translation , 2017, EMNLP.

[338] Alexander M. Rush,et al. OpenNMT: Open-Source Toolkit for Neural Machine Translation , 2017, ACL.

[339] S. Moss. Listen , 2017 .

[340] Heike Adel,et al. Exploring Different Dimensions of Attention for Uncertainty Detection , 2016, EACL.

[341] Markus Freitag,et al. Fast Domain Adaptation for Neural Machine Translation , 2016, ArXiv.

[342] Jungi Kim,et al. Boosting Neural Machine Translation , 2016, IJCNLP.

[343] Josep Maria Crego,et al. Domain Control for Neural Machine Translation , 2016, RANLP.

[344] Ryan Cotterell,et al. Neural Multi-Source Morphological Reinflection , 2016, EACL.

[345] Rico Sennrich,et al. How Grammatical is Character-level Neural Machine Translation? Assessing MT Quality with Contrastive Translation Pairs , 2016, EACL.

[346] Adrià de Gispert,et al. Neural Machine Translation by Minimising the Bayes-risk with Respect to Syntactic Translation Lattices , 2016, EACL.

[347] Yong Zhang,et al. Attention pooling-based convolutional neural network for sentence modelling , 2016, Inf. Sci..

[348] Navdeep Jaitly,et al. Towards Better Decoding and Language Model Integration in Sequence to Sequence Models , 2016, INTERSPEECH.

[349] R. S. Milton,et al. Improving the Performance of Neural Machine Translation Involving Morphologically Rich Languages , 2016, ArXiv.

[350] Andrei A. Rusu,et al. Overcoming catastrophic forgetting in neural networks , 2016, Proceedings of the National Academy of Sciences.

[351] Wei Chen,et al. A Character-Aware Encoder for Neural Machine Translation , 2016, COLING.

[352] Mikio Yamamoto,et al. Translation of Patent Sentences with a Large Vocabulary of Technical Terms Using Neural Machine Translation , 2016, WAT@COLING.

[353] Toshiaki Nakazawa,et al. Kyoto University Participation to WAT 2016 , 2016, WAT@COLING.

[354] Paris Smaragdis,et al. NoiseOut: A Simple Way to Prune Neural Networks , 2016, ArXiv.

[355] Yang Liu,et al. Joint Training for Pivot-based Neural Machine Translation , 2016, IJCAI.

[356] Jan Niehues,et al. Toward Multilingual Neural Machine Translation with Universal Encoder and Decoder , 2016, IWSLT.

[357] Martin Wattenberg,et al. Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation , 2016, TACL.

[358] Quoc V. Le,et al. Unsupervised Pretraining for Sequence to Sequence Learning , 2016, EMNLP.

[359] Yang Liu,et al. Neural Machine Translation with Reconstruction , 2016, AAAI.

[360] Yann Dauphin,et al. A Convolutional Encoder Model for Neural Machine Translation , 2016, ACL.

[361] Eunsol Choi,et al. Coarse-to-Fine Question Answering for Long Documents , 2016, ACL.

[362] Yoav Goldberg,et al. Morphological Inflection Generation with Hard Monotonic Attention , 2016, ACL.

[363] Lei Yu,et al. The Neural Noisy Channel , 2016, ICLR.

[364] Quoc V. Le,et al. Neural Architecture Search with Reinforcement Learning , 2016, ICLR.

[365] Jiajun Zhang,et al. Exploiting Source-side Monolingual Data in Neural Machine Translation , 2016, EMNLP.

[366] Pushpak Bhattacharyya,et al. Faster Decoding for Subword Level Phrase-based SMT between Related Languages , 2016, VarDial@COLING.

[367] Tie-Yan Liu,et al. Dual Learning for Machine Translation , 2016, NIPS.

[368] Navdeep Jaitly,et al. RNN Approaches to Text Normalization: A Challenge , 2016, ArXiv.

[369] Alex Graves,et al. Neural Machine Translation in Linear Time , 2016, ArXiv.

[370] Samy Bengio,et al. Can Active Memory Replace Attention? , 2016, NIPS.

[371] Iryna Gurevych,et al. Still not there? Comparing Traditional Sequence-to-Sequence Models to Encoder-Decoder Neural Networks on Monotone String Translation Tasks , 2016, COLING.

[372] Jiajun Zhang,et al. Bridging Neural Machine Translation and Bilingual Dictionaries , 2016, ArXiv.

[373] Pushpak Bhattacharyya,et al. Learning variable length units for SMT between related languages via Byte Pair Encoding , 2016, SWCN@EMNLP.

[374] Graham Neubig,et al. Lexicons and Minimum Risk Training for Neural Machine Translation: NAIST-CMU at WAT2016 , 2016, WAT@COLING.

[375] Ole Winther,et al. Neural Machine Translation with Characters and Hierarchical Encoding , 2016, ArXiv.

[376] Bo Wang,et al. SYSTRAN's Pure Neural Machine Translation Systems , 2016, ArXiv.

[377] Min Zhang,et al. Neural Machine Translation Advised by Statistical Machine Translation , 2016, AAAI.

[378] Jan Niehues,et al. Pre-Translation for Neural Machine Translation , 2016, COLING.

[379] Sergio Gomez Colmenarejo,et al. Hybrid computing using a neural network with dynamic external memory , 2016, Nature.

[380] Jason Lee,et al. Fully Character-Level Neural Machine Translation without Explicit Segmentation , 2016, TACL.

[381] François Chollet,et al. Xception: Deep Learning with Depthwise Separable Convolutions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[382] Ashwin K. Vijayakumar,et al. Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models , 2016, ArXiv.

[383] Marcin Junczys-Dowmunt,et al. Is Neural Machine Translation Ready for Deployment? A Case Study on 30 Translation Directions , 2016, IWSLT.

[384] Graham Neubig,et al. Learning to Translate in Real-time with Neural Machine Translation , 2016, EACL.

[385] David Grangier,et al. Vocabulary Selection Strategies for Neural Machine Translation , 2016, ArXiv.

[386] Aaron C. Courville,et al. Professor Forcing: A New Algorithm for Training Recurrent Networks , 2016, NIPS.

[387] Graham Neubig,et al. Controlling Output Length in Neural Encoder-Decoders , 2016, EMNLP.

[388] Quoc V. Le,et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation , 2016, ArXiv.

[389] Lei Yu,et al. Online Segment to Segment Neural Transduction , 2016, EMNLP.

[390] Weinan Zhang,et al. SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient , 2016, AAAI.

[391] Lemao Liu,et al. Neural Machine Translation with Supervised Attention , 2016, COLING.

[392] Rongrong Ji,et al. Lattice-Based Recurrent Neural Network Encoders for Neural Machine Translation , 2016, AAAI.

[393] Arianna Bisazza,et al. Neural versus Phrase-Based Machine Translation Quality: a Case Study , 2016, EMNLP.

[394] Karin M. Verspoor,et al. Findings of the 2016 Conference on Machine Translation , 2016, WMT.

[395] Khalil Sima'an,et al. A Shared Task on Multimodal Machine Translation and Crosslingual Image Description , 2016, WMT.

[396] Jean Oh,et al. Attention-based Multimodal Neural Machine Translation , 2016, WMT.

[397] Philipp Koehn,et al. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , 2016 .

[398] Kyunghyun Cho,et al. Neural Machine Translation , 2016, ACL.

[399] Zhiguo Wang,et al. Supervised Attentions for Neural Machine Translation , 2016, EMNLP.

[400] Geoffrey E. Hinton,et al. Layer Normalization , 2016, ArXiv.

[401] Alexander J. Smola,et al. Neural Machine Translation with Recurrent Attention Modeling , 2016, EACL.

[402] Jiajun Zhang,et al. Towards Zero Unknown Word in Neural Machine Translation , 2016, IJCAI.

[403] Wenhu Chen,et al. Guided Alignment Training for Topic-Aware Neural Machine Translation , 2016, AMTA.

[404] Yoshua Bengio,et al. Context-dependent word representation for neural machine translation , 2016, Comput. Speech Lang..

[405] Alexander M. Rush,et al. Sequence-Level Knowledge Distillation , 2016, EMNLP.

[406] Maosong Sun,et al. Semi-Supervised Learning for Neural Machine Translation , 2016, ACL.

[407] Yaser Al-Onaizan,et al. Zero-Resource Translation with Multi-Lingual Neural Machine Translation , 2016, EMNLP.

[408] Zachary Chase Lipton. The mythos of model interpretability , 2016, ACM Queue.

[409] Sunita Sarawagi,et al. Length bias in Encoder Decoder Models and a Case for Global Conditioning , 2016, EMNLP.

[410] Rico Sennrich,et al. Edinburgh Neural Machine Translation Systems for WMT 16 , 2016, WMT.

[411] Rico Sennrich,et al. Linguistic Input Features Improve Neural Machine Translation , 2016, WMT.

[412] Kyunghyun Cho,et al. Can neural machine translation do simultaneous translation? , 2016, ArXiv.

[413] Satoshi Nakamura,et al. Incorporating Discrete Translation Lexicons into Neural Machine Translation , 2016, EMNLP.

[414] Bill Byrne,et al. The Edit Distance Transducer in Action: The University of Cambridge English-German System at WMT16 , 2016, WMT.

[415] Yaohua Tang,et al. Neural Machine Translation with External Phrase Memory , 2016, ArXiv.

[416] Jakob Uszkoreit,et al. A Decomposable Attention Model for Natural Language Inference , 2016, EMNLP.

[417] Alexander M. Rush,et al. Sequence-to-Sequence Learning as Beam-Search Optimization , 2016, EMNLP.

[418] Lemao Liu,et al. Agreement on Target-bidirectional Neural Machine Translation , 2016, NAACL.

[419] Qun Liu,et al. Memory-enhanced Decoder for Neural Machine Translation , 2016, EMNLP.

[420] Christopher D. Manning,et al. Compression of Neural Machine Translation Models via Pruning , 2016, CoNLL.

[421] Ted Briscoe,et al. Grammatical error correction using neural machine translation , 2016, NAACL.

[422] Matthias Sperber,et al. Lecture Translator - Speech translation framework for simultaneous lecture translation , 2016, NAACL.

[423] David Chiang,et al. An Attentional Model for Speech Translation Without Transcription , 2016, NAACL.

[424] Yang Liu,et al. Learning Natural Language Inference using Bidirectional LSTM model and Inner-Attention , 2016, ArXiv.

[425] Min Zhang,et al. Variational Neural Machine Translation , 2016, EMNLP.

[426] Pascal Vincent,et al. Hierarchical Memory Networks , 2016, ArXiv.

[427] Rico Sennrich,et al. The AMU-UEDIN Submission to the WMT16 News Translation Task: Attention-based NMT Models as Feature Functions in Phrase-based SMT , 2016, WMT.

[428] Kyunghyun Cho,et al. Noisy Parallel Approximate Decoding for Conditional Recurrent Language Model , 2016, ArXiv.

[429] Zhiguo Wang,et al. Vocabulary Manipulation for Neural Machine Translation , 2016, ACL.

[430] Zhiguo Wang,et al. Coverage Embedding Models for Neural Machine Translation , 2016, EMNLP.

[431] Bill Byrne,et al. Syntactically Guided Neural Machine Translation , 2016, ACL.

[432] Deniz Yuret,et al. Transfer Learning for Low-Resource Neural Machine Translation , 2016, EMNLP.

[433] Christopher D. Manning,et al. Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models , 2016, ACL.

[434] Manaal Faruqui,et al. Cross-lingual Models of Word Embeddings: An Empirical Comparison , 2016, ACL.

[435] Bowen Zhou,et al. Pointing the Unknown Words , 2016, ACL.

[436] Tara N. Sainath,et al. Learning compact recurrent neural networks , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[437] Ian McGraw,et al. On the compression of recurrent neural networks with an application to LVCSR acoustic modeling for embedded speech recognition , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[438] Yoshua Bengio,et al. A Character-level Decoder without Explicit Segmentation for Neural Machine Translation , 2016, ACL.

[439] Simon Osindero,et al. Recursive Recurrent Nets with Attention Modeling for OCR in the Wild , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[440] José A. R. Fonollosa,et al. Character-based Neural Machine Translation , 2016, ACL.

[441] Yoshimasa Tsuruoka,et al. Tree-to-Sequence Attentional Neural Machine Translation , 2016, ACL.

[442] Marco Tulio Ribeiro,et al. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier , 2016, NAACL.

[443] Hua Wu,et al. Improved Neural Machine Translation with SMT Features , 2016, AAAI.

[444] Noah A. Smith,et al. Recurrent Neural Network Grammars , 2016, NAACL.

[445] Koray Kavukcuoglu,et al. Pixel Recurrent Neural Networks , 2016, ICML.

[446] Mirella Lapata,et al. Long Short-Term Memory-Networks for Machine Reading , 2016, EMNLP.

[447] Yang Liu,et al. Modeling Coverage for Neural Machine Translation , 2016, ACL.

[448] Stefan Riezler,et al. Multimodal Pivots for Image Caption Translation , 2016, ACL.

[449] Shi Feng,et al. Implicit Distortion and Fertility Models for Attention-based Encoder-Decoder NMT Model , 2016, ArXiv.

[450] Gholamreza Haffari,et al. Incorporating Structural Alignment Biases into an Attentional Neural Translation Model , 2016, NAACL.

[451] Mirella Lapata,et al. Language to Logical Form with Neural Attention , 2016, ACL.

[452] Yoshua Bengio,et al. Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism , 2016, NAACL.

[453] Kevin Knight,et al. Multi-Source Neural Translation , 2016, NAACL.

[454] Daniel Jurafsky,et al. Mutual Information and Diverse Decoding Improve Neural Machine Translation , 2016, ArXiv.

[455] Rui Yan,et al. Natural Language Inference by Tree-Based Convolution and Heuristic Matching , 2015, ACL.

[456] Jian Cheng,et al. Quantized Convolutional Neural Networks for Mobile Devices , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[457] Yang Liu,et al. Agreement-Based Joint Training for Bidirectional Attention-Based Neural Machine Translation , 2015, IJCAI.

[458] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[459] Yang Liu,et al. Minimum Risk Training for Neural Machine Translation , 2015, ACL.

[460] Sergey Ioffe,et al. Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[461] William Lewis,et al. Skype Translator: Breaking down language and hearing barriers. A behind the scenes look at near real-time speech translation , 2015, TC.

[462] Rico Sennrich,et al. Improving Neural Machine Translation Models with Monolingual Data , 2015, ACL.

[463] S. Chopra,et al. Sequence Level Training with Recurrent Neural Networks , 2015, ICLR.

[464] Marcin Andrychowicz,et al. Neural Random Access Machines , 2015, ERCIM News.

[465] Quoc V. Le,et al. Multi-task Sequence to Sequence Learning , 2015, ICLR.

[466] Alexander J. Smola,et al. Stacked Attention Networks for Image Question Answering , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[467] 奥里奥尔·温亚尔斯,et al. Neural machine translation systems with rare word processing , 2015 .

[468] Satoshi Nakamura,et al. Neural Reranking Improves Subjective Quality of Machine Translation: NAIST at WAT2015 , 2015, WAT.

[469] Desmond Elliott,et al. Multilingual Image Description with Neural Sequence Models , 2015, 1510.04709.

[470] Jianfeng Gao,et al. A Diversity-Promoting Objective Function for Neural Conversation Models , 2015, NAACL.

[471] Yoav Goldberg,et al. A Primer on Neural Network Models for Natural Language Processing , 2015, J. Artif. Intell. Res..

[472] Alexander M. Rush,et al. A Neural Attention Model for Abstractive Sentence Summarization , 2015, EMNLP.

[473] Yoshua Bengio,et al. Montreal Neural Machine Translation Systems for WMT’15 , 2015, WMT@EMNLP.

[474] John DeNero,et al. Variable-Length Word Encodings for Neural Translation Models , 2015, EMNLP.

[475] Alexandra Birch,et al. Neural Machine Translation of Rare Words with Subword Units , 2015, ACL.

[476] Alexander M. Rush,et al. Character-Aware Neural Language Models , 2015, AAAI.

[477] Christopher D. Manning,et al. Effective Approaches to Attention-based Neural Machine Translation , 2015, EMNLP.

[478] Quoc V. Le,et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[479] R. Venkatesh Babu,et al. Data-free Parameter Pruning for Deep Neural Networks , 2015, BMVC.

[480] Alexander Binder,et al. On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation , 2015, PloS one.

[481] Dianhai Yu,et al. Multi-Task Learning for Multiple Language Translation , 2015, ACL.

[482] Phil Blunsom,et al. Teaching Machines to Read and Comprehend , 2015, NIPS.

[483] Samy Bengio,et al. Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks , 2015, NIPS.

[484] Song Han,et al. Learning both Weights and Connections for Efficient Neural Network , 2015, NIPS.

[485] Phil Blunsom,et al. Learning to Transduce with Unbounded Memory , 2015, NIPS.

[486] Fei-Fei Li,et al. Visualizing and Understanding Recurrent Networks , 2015, ArXiv.

[487] Xinlei Chen,et al. Visualizing and Understanding Neural Models in NLP , 2015, NAACL.

[488] Jason Weston,et al. End-To-End Memory Networks , 2015, NIPS.

[489] Yoshua Bengio,et al. On Using Monolingual Corpora in Neural Machine Translation , 2015, ArXiv.

[490] Geoffrey E. Hinton,et al. Distilling the Knowledge in a Neural Network , 2015, ArXiv.

[491] Hang Li,et al. Neural Responding Machine for Short-Text Conversation , 2015, ACL.

[492] Ole Winther,et al. Convolutional LSTM Networks for Subcellular Localization of Proteins , 2015, AlCoB.

[493] Tomas Mikolov,et al. Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets , 2015, NIPS.

[494] Christopher Joseph Pal,et al. Describing Videos by Exploiting Temporal Structure , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[495] Christopher D. Manning,et al. Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks , 2015, ACL.

[496] Christian Szegedy,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[497] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.

[498] Geoffrey E. Hinton,et al. Grammar as a Foreign Language , 2014, NIPS.

[499] Yoshua Bengio,et al. On Using Very Large Target Vocabulary for Neural Machine Translation , 2014, ACL.

[500] Yoshua Bengio,et al. End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results , 2014, ArXiv.

[501] Steve Renals,et al. Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models , 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).

[502] Quoc V. Le,et al. Addressing the Rare Word Problem in Neural Machine Translation , 2014, ACL.

[503] Chris Dyer. Notes on Noise Contrastive Estimation and Negative Sampling , 2014, ArXiv.

[504] Alex Graves,et al. Neural Turing Machines , 2014, ArXiv.

[505] Roi Livni,et al. On the Computational Efficiency of Training Neural Networks , 2014, NIPS.

[506] Jeffrey Pennington,et al. GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[507] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.

[508] Yoshua Bengio,et al. Overcoming the Curse of Sentence Length for Neural Machine Translation using Automatic Segmentation , 2014, SSST@EMNLP.

[509] Yoshua Bengio,et al. On the Properties of Neural Machine Translation: Encoder–Decoder Approaches , 2014, SSST@EMNLP.

[510] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.

[511] Yoon Kim,et al. Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.

[512] Cícero Nogueira dos Santos,et al. Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts , 2014, COLING.

[513] Alex Graves,et al. Recurrent Models of Visual Attention , 2014, NIPS.

[514] Aaron C. Courville,et al. Generative adversarial networks , 2014, Commun. ACM.

[515] Yoshua Bengio,et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.

[516] Richard M. Schwartz,et al. Fast and Robust Neural Network Joint Models for Statistical Machine Translation , 2014, ACL.

[517] Yongqiang Wang,et al. Efficient lattice rescoring using recurrent neural network language models , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[518] Phil Blunsom,et al. A Convolutional Neural Network for Modelling Sentences , 2014, ACL.

[519] Joan Bruna,et al. Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation , 2014, NIPS.

[520] Yoshua Bengio,et al. An Empirical Investigation of Catastrophic Forgeting in Gradient-Based Neural Networks , 2013, ICLR.

[521] Koray Kavukcuoglu,et al. Learning word embeddings efficiently with noise-contrastive estimation , 2013, NIPS.

[522] Phil Blunsom,et al. Recurrent Continuous Translation Models , 2013, EMNLP.

[523] Peng Li,et al. Recursive Autoencoders for ITG-Based Translation , 2013, EMNLP.

[524] Gregory Shakhnarovich,et al. A Systematic Exploration of Diversity in Machine Translation , 2013, EMNLP.

[525] T. Kathirvalavakumar,et al. Pruning algorithms of neural networks — a comparative study , 2013, Central European Journal of Computer Science.

[526] Quoc V. Le,et al. Exploiting Similarities among Languages for Machine Translation , 2013, ArXiv.

[527] Ming Zhou,et al. Machine Translation Detection from Monolingual Web-Text , 2013, ACL.

[528] Ming Zhou,et al. Bilingual Data Cleaning for SMT using Graph-based Random Walk , 2013, ACL.

[529] Philipp Koehn,et al. Scalable Modified Kneser-Ney Language Model Estimation , 2013, ACL.

[530] Nando de Freitas,et al. Predicting Parameters in Deep Learning , 2013, NIPS.

[531] Graeme W. Blackwood,et al. N-gram posterior probability confidence measures for statistical machine translation: an empirical study , 2013, Machine Translation.

[532] Yoshua Bengio,et al. Maxout Networks , 2013, ICML.

[533] Alexander H. Waibel,et al. Training speech translation from audio recordings of interpreter-mediated communication , 2013, Comput. Speech Lang..

[534] J. Dean,et al. Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[535] Matthew D. Zeiler. ADADELTA: An Adaptive Learning Rate Method , 2012, ArXiv.

[536] Holger Schwenk,et al. Continuous Space Translation Models for Phrase-Based Statistical Machine Translation , 2012, COLING.

[537] Razvan Pascanu,et al. Theano: new features and speed improvements , 2012, ArXiv.

[538] Razvan Pascanu,et al. On the difficulty of training recurrent neural networks , 2012, ICML.

[539] Petr Motlícek,et al. Conversion of Recurrent Neural Network Language Models to Weighted Finite State Transducers for Automatic Speech Recognition , 2012, INTERSPEECH.

[540] Yoshua Bengio,et al. Better Mixing via Deep Representations , 2012, ICML.

[541] Yee Whye Teh,et al. A fast and simple algorithm for training neural probabilistic language models , 2012, ICML.

[542] Alexandre Allauzen,et al. Continuous Space Translation Models with Neural Networks , 2012, NAACL.

[543] Mike Schuster,et al. Japanese and Korean voice search , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[544] Chris Quirk,et al. MT Detection in Web-Scraped Parallel Corpora , 2011, MTSUMMIT.

[545] Jianfeng Gao,et al. Domain Adaptation via Pseudo In-Domain Data Selection , 2011, EMNLP.

[546] Jeffrey Pennington,et al. Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions , 2011, EMNLP.

[547] Yaser Al-Onaizan,et al. Goodness: A Method for Measuring Machine Translation Confidence , 2011, ACL.

[548] Jason Weston,et al. Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res..

[549] Geoffrey E. Hinton,et al. Learning to combine foveal glimpses with a third-order Boltzmann machine , 2010, NIPS.

[550] Roland Kuhn,et al. Discriminative Instance Weighting for Domain Adaptation in Statistical Machine Translation , 2010, EMNLP.

[551] Oleksandr Makeyev,et al. Neural network with ensembles , 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).

[552] Jimmy J. Lin,et al. Book Reviews: Data-Intensive Text Processing with MapReduce by Jimmy Lin and Chris Dyer , 2010, CL.

[553] Aapo Hyvärinen,et al. Noise-contrastive estimation: A new estimation principle for unnormalized statistical models , 2010, AISTATS.

[554] Lior Rokach,et al. Ensemble-based classifiers , 2010, Artificial Intelligence Review.

[555] Alexander H. Waibel,et al. Automatic translation from parallel speech: Simultaneous interpretation as MT training data , 2009, 2009 IEEE Workshop on Automatic Speech Recognition & Understanding.

[556] Jason Weston,et al. Curriculum learning , 2009, ICML '09.

[557] Shankar Kumar,et al. Lattice Minimum Bayes-Risk Decoding for Statistical Machine Translation , 2008, EMNLP.

[558] Jason Weston,et al. A unified architecture for natural language processing: deep neural networks with multitask learning , 2008, ICML '08.

[559] Bashô Matsuo. Basho: The Complete Haiku , 2008 .

[560] Alexander H. Waibel,et al. Simultaneous translation of lectures and speeches , 2007, Machine Translation.

[561] David Chiang,et al. Hierarchical Phrase-Based Translation , 2007, CL.

[562] Philipp Koehn,et al. Factored Translation Models , 2007, EMNLP.

[563] Rich Caruana,et al. Model compression , 2006, KDD '06.

[564] Holger Schwenk,et al. Continuous Space Language Models for Statistical Machine Translation , 2006, ACL.

[565] Hermann Ney,et al. Word-Level Confidence Estimation for Machine Translation using Phrase-Based Translation Models , 2005, HLT.

[566] Philipp Koehn,et al. Pharaoh: A Beam Search Decoder for Phrase-Based Statistical Machine Translation Models , 2004, AMTA.

[567] Shankar Kumar,et al. Minimum Bayes-Risk Decoding for Statistical Machine Translation , 2004, NAACL.

[568] Noah A. Smith,et al. The Web as a Parallel Corpus , 2003, CL.

[569] Franz Josef Och,et al. Minimum Error Rate Training in Statistical Machine Translation , 2003, ACL.

[570] Yoshua Bengio,et al. A Neural Probabilistic Language Model , 2003, J. Mach. Learn. Res..

[571] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.

[572] Vaibhava Goel,et al. Segmental minimum Bayes-risk ASR voting strategies , 2000, INTERSPEECH.

[573] Thomas G. Dietterich. Multiple Classifier Systems , 2000, Lecture Notes in Computer Science.

[574] Philip Resnik,et al. Mining the Web for Bilingual Text , 1999, ACL.

[575] R. French. Catastrophic forgetting in connectionist networks , 1999, Trends in Cognitive Sciences.

[576] S. Hochreiter,et al. Long Short-Term Memory , 1997, Neural Computation.

[577] Kuldip K. Paliwal,et al. Bidirectional recurrent neural networks , 1997, IEEE Trans. Signal Process..

[578] Jerome R. Bellegarda,et al. A latent semantic analysis framework for large-Span language modeling , 1997, EUROSPEECH.

[579] Dekai Wu,et al. Stochastic Inversion Transduction Grammars and Bilingual Parsing of Parallel Corpora , 1997, CL.

[580] Mikel L. Forcada,et al. Asynchronous translations with recurrent neural nets , 1997, Proceedings of International Conference on Neural Networks (ICNN'97).

[581] Hermann Ney,et al. HMM-Based Word Alignment in Statistical Translation , 1996, COLING.

[582] Philip Gage,et al. A new algorithm for data compression , 1994 .

[583] W. Byrne,et al. Generalization and maximum likelihood from small data sets , 1993, Neural Networks for Signal Processing III - Proceedings of the 1993 IEEE-SP Workshop.

[584] Robert L. Mercer,et al. The Mathematics of Statistical Machine Translation: Parameter Estimation , 1993, CL.

[585] Babak Hassibi,et al. Second Order Derivatives for Network Pruning: Optimal Brain Surgeon , 1992, NIPS.

[586] Hava T. Siegelmann,et al. On the computational power of neural nets , 1992, COLT '92.

[587] Mark Jurik,et al. Neurocomputing: Foundations of research , 1992 .

[588] Jordan B. Pollack,et al. Recursive Distributed Representations , 1990, Artif. Intell..

[589] Lawrence D. Jackel,et al. Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.

[590] Ronald J. Williams,et al. A Learning Algorithm for Continually Running Fully Recurrent Neural Networks , 1989, Neural Computation.

[591] Geoffrey E. Hinton,et al. Phoneme recognition using time-delay neural networks , 1989, IEEE Trans. Acoust. Speech Signal Process..

[592] Geoffrey E. Hinton,et al. Learning representations by back-propagating errors , 1986, Nature.

[593] G. Seth. Psychology of Language , 1968, Nature.

[594] Ilya Sutskever,et al. Language Models are Unsupervised Multitask Learners , 2019 .

[595] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.