Neural Machine Translation: A Review

The field of machine translation (MT), the automatic translation of written text from one natural language into another, has experienced a major paradigm shift in recent years. Statistical MT, which relies mainly on count-based models and dominated MT research for decades, has largely been superseded by neural machine translation (NMT), which tackles translation with a single neural network. In this work we trace the origins of modern NMT architectures back to word and sentence embeddings and to earlier members of the encoder-decoder network family, and we conclude with a short survey of recent trends in the field.
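The encoder-decoder idea mentioned above can be illustrated with a deliberately tiny, untrained sketch: an encoder RNN compresses the source sentence into a single vector, and a decoder RNN emits target tokens conditioned on that vector. All dimensions, weights, and names here are hypothetical toy choices for illustration, not the architecture of any particular system.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, EMB, HID = 10, 4, 8

# Randomly initialized (untrained) parameters: embeddings, recurrent
# weights, and an output projection.
E_src = rng.normal(size=(VOCAB, EMB))
E_tgt = rng.normal(size=(VOCAB, EMB))
W_enc = rng.normal(size=(HID, HID + EMB)) * 0.1
W_dec = rng.normal(size=(HID, HID + EMB)) * 0.1
W_out = rng.normal(size=(VOCAB, HID)) * 0.1

def rnn_step(W, h, x):
    # One vanilla-RNN update: new state from old state and current input.
    return np.tanh(W @ np.concatenate([h, x]))

def encode(src_ids):
    """Fold the source token sequence into a single fixed-size vector."""
    h = np.zeros(HID)
    for i in src_ids:
        h = rnn_step(W_enc, h, E_src[i])
    return h

def decode(h, max_len=5, bos=0):
    """Greedily emit target tokens, feeding each prediction back in."""
    out, prev = [], bos
    for _ in range(max_len):
        h = rnn_step(W_dec, h, E_tgt[prev])
        prev = int(np.argmax(W_out @ h))  # greedy token choice
        out.append(prev)
    return out

translation = decode(encode([1, 2, 3]))
```

With trained parameters, the same loop structure (plus attention over all encoder states rather than a single summary vector) underlies the NMT systems surveyed in this work.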


[290]  Chong Wang,et al.  Towards Neural Phrase-based Machine Translation , 2017, ICLR.

[291]  Yoshua Bengio,et al.  Plan, Attend, Generate: Character-Level Neural Machine Translation with Planning , 2017, Rep4NLP@ACL.

[292]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[293]  Markus Freitag,et al.  Attention-based Vocabulary Selection for NMT Decoding , 2017, ArXiv.

[294]  Lukasz Kaiser,et al.  Depthwise Separable Convolutions for Neural Machine Translation , 2017, ICLR.

[295]  Razvan Pascanu,et al.  A simple neural network module for relational reasoning , 2017, NIPS.

[296]  Philipp Koehn,et al.  Six Challenges for Neural Machine Translation , 2017, NMT@ACL.

[297]  Marcello Federico,et al.  Linguistically Motivated Vocabulary Reduction for Neural Machine Translation from Turkish to English , 2017, Prague Bull. Math. Linguistics.

[298]  Andy Way,et al.  Is Neural Machine Translation the New State of the Art? , 2017, Prague Bull. Math. Linguistics.

[299]  Stephen Clark,et al.  Jointly learning sentence embeddings and syntax with unsupervised Tree-LSTMs , 2017, Natural Language Engineering.

[300]  Colin Raffel,et al.  Learning Hard Alignments with Variational Inference , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[301]  Richard Socher,et al.  A Deep Reinforced Model for Abstractive Summarization , 2017, ICLR.

[302]  Ming Zhou,et al.  Reinforced Mnemonic Reader for Machine Reading Comprehension , 2017, IJCAI.

[303]  Yann Dauphin,et al.  Convolutional Sequence to Sequence Learning , 2017, ICML.

[304]  Holger Schwenk,et al.  Supervised Learning of Universal Sentence Representations from Natural Language Inference Data , 2017, EMNLP.

[305]  Jacob Devlin,et al.  Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU , 2017, EMNLP.

[306]  Deyi Xiong,et al.  A GRU-Gated Attention Model for Neural Machine Translation , 2017, ArXiv.

[307]  Pierre Isabelle,et al.  A Challenge Set Approach to Evaluating Machine Translation , 2017, EMNLP.

[308]  Orhan Firat,et al.  Does Neural Machine Translation Benefit from Larger Context? , 2017, ArXiv.

[309]  Bo Chen,et al.  MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.

[310]  Yoav Goldberg,et al.  Towards String-To-Tree Neural Machine Translation , 2017, ACL.

[311]  Khalil Sima'an,et al.  Graph Convolutional Encoders for Syntax-aware Neural Machine Translation , 2017, EMNLP.

[312]  Andy Way,et al.  Exploiting Cross-Sentence Context for Neural Machine Translation , 2017, EMNLP.

[313]  Bill Byrne,et al.  Unfolding and Shrinking Neural Machine Translation Ensembles , 2017, EMNLP.

[314]  James R. Glass,et al.  What do Neural Machine Translation Models Learn about Morphology? , 2017, ACL.

[315]  Colin Raffel,et al.  Online and Linear-Time Attention by Enforcing Monotonic Alignments , 2017, ICML.

[316]  Tie-Yan Liu,et al.  Adversarial Neural Machine Translation , 2017, ACML.

[317]  Jiajun Zhang,et al.  Neural System Combination for Machine Translation , 2017, ACL.

[318]  Jindrich Libovický,et al.  Neural Monkey: An Open-source Tool for Sequence Learning , 2017, Prague Bull. Math. Linguistics.

[319]  Chandra Bhagavatula,et al.  Semi-supervised sequence tagging with bidirectional language models , 2017, ACL.

[320]  Matthias Sperber,et al.  Neural Lattice-to-Sequence Models for Uncertain Inputs , 2017, EMNLP.

[321]  Wei Chen,et al.  Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets , 2017, NAACL.

[322]  Rico Sennrich,et al.  Nematus: a Toolkit for Neural Machine Translation , 2017, EACL.

[323]  Bowen Zhou,et al.  A Structured Self-attentive Sentence Embedding , 2017, ICLR.

[324]  Graham Neubig,et al.  Neural Machine Translation and Sequence-to-sequence Models: A Tutorial , 2017, ArXiv.

[325]  Been Kim,et al.  Towards A Rigorous Science of Interpretable Machine Learning , 2017, ArXiv.

[326]  Yoshimasa Tsuruoka,et al.  Learning to Parse and Translate Improves Neural Machine Translation , 2017, ACL.

[327]  Lemao Liu,et al.  Deterministic Attention for Sequence-to-Sequence Constituent Parsing , 2017, AAAI.

[328]  Victor O. K. Li,et al.  Trainable Greedy Decoding for Neural Machine Translation , 2017, EMNLP.

[329]  Markus Freitag,et al.  Ensemble Distillation for Neural Machine Translation , 2017, ArXiv.

[330]  Markus Freitag,et al.  Beam Search Strategies for Neural Machine Translation , 2017, NMT@ACL.

[331]  Rico Sennrich,et al.  Predicting Target Language CCG Supertags Improves Neural Machine Translation , 2017, WMT.

[333]  Alan Ritter,et al.  Adversarial Learning for Neural Dialogue Generation , 2017, EMNLP.

[334]  Geoffrey E. Hinton,et al.  Regularizing Neural Networks by Penalizing Confident Output Distributions , 2017, ICLR.

[335]  Nadir Durrani,et al.  QCRI Machine Translation Systems for IWSLT 16 , 2017, ArXiv.

[336]  Antonio Toral,et al.  A Multifaceted Evaluation of Neural versus Phrase-Based Machine Translation for 9 Language Directions , 2017, EACL.

[337]  Gholamreza Haffari,et al.  Towards Decoding as Continuous Optimisation in Neural Machine Translation , 2017, EMNLP.

[338]  Alexander M. Rush,et al.  OpenNMT: Open-Source Toolkit for Neural Machine Translation , 2017, ACL.

[340]  Heike Adel,et al.  Exploring Different Dimensions of Attention for Uncertainty Detection , 2016, EACL.

[341]  Markus Freitag,et al.  Fast Domain Adaptation for Neural Machine Translation , 2016, ArXiv.

[342]  Jungi Kim,et al.  Boosting Neural Machine Translation , 2016, IJCNLP.

[343]  Josep Maria Crego,et al.  Domain Control for Neural Machine Translation , 2016, RANLP.

[344]  Ryan Cotterell,et al.  Neural Multi-Source Morphological Reinflection , 2016, EACL.

[345]  Rico Sennrich,et al.  How Grammatical is Character-level Neural Machine Translation? Assessing MT Quality with Contrastive Translation Pairs , 2016, EACL.

[346]  Adrià de Gispert,et al.  Neural Machine Translation by Minimising the Bayes-risk with Respect to Syntactic Translation Lattices , 2016, EACL.

[347]  Yong Zhang,et al.  Attention pooling-based convolutional neural network for sentence modelling , 2016, Inf. Sci..

[348]  Navdeep Jaitly,et al.  Towards Better Decoding and Language Model Integration in Sequence to Sequence Models , 2016, INTERSPEECH.

[349]  R. S. Milton,et al.  Improving the Performance of Neural Machine Translation Involving Morphologically Rich Languages , 2016, ArXiv.

[350]  Andrei A. Rusu,et al.  Overcoming catastrophic forgetting in neural networks , 2016, Proceedings of the National Academy of Sciences.

[351]  Wei Chen,et al.  A Character-Aware Encoder for Neural Machine Translation , 2016, COLING.

[352]  Mikio Yamamoto,et al.  Translation of Patent Sentences with a Large Vocabulary of Technical Terms Using Neural Machine Translation , 2016, WAT@COLING.

[353]  Toshiaki Nakazawa,et al.  Kyoto University Participation to WAT 2016 , 2016, WAT@COLING.

[354]  Paris Smaragdis,et al.  NoiseOut: A Simple Way to Prune Neural Networks , 2016, ArXiv.

[355]  Yang Liu,et al.  Joint Training for Pivot-based Neural Machine Translation , 2016, IJCAI.

[356]  Jan Niehues,et al.  Toward Multilingual Neural Machine Translation with Universal Encoder and Decoder , 2016, IWSLT.

[357]  Martin Wattenberg,et al.  Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation , 2016, TACL.

[358]  Quoc V. Le,et al.  Unsupervised Pretraining for Sequence to Sequence Learning , 2016, EMNLP.

[359]  Yang Liu,et al.  Neural Machine Translation with Reconstruction , 2016, AAAI.

[360]  Yann Dauphin,et al.  A Convolutional Encoder Model for Neural Machine Translation , 2016, ACL.

[361]  Eunsol Choi,et al.  Coarse-to-Fine Question Answering for Long Documents , 2016, ACL.

[362]  Yoav Goldberg,et al.  Morphological Inflection Generation with Hard Monotonic Attention , 2016, ACL.

[363]  Lei Yu,et al.  The Neural Noisy Channel , 2016, ICLR.

[364]  Quoc V. Le,et al.  Neural Architecture Search with Reinforcement Learning , 2016, ICLR.

[365]  Jiajun Zhang,et al.  Exploiting Source-side Monolingual Data in Neural Machine Translation , 2016, EMNLP.

[366]  Pushpak Bhattacharyya,et al.  Faster Decoding for Subword Level Phrase-based SMT between Related Languages , 2016, VarDial@COLING.

[367]  Tie-Yan Liu,et al.  Dual Learning for Machine Translation , 2016, NIPS.

[368]  Navdeep Jaitly,et al.  RNN Approaches to Text Normalization: A Challenge , 2016, ArXiv.

[369]  Alex Graves,et al.  Neural Machine Translation in Linear Time , 2016, ArXiv.

[370]  Samy Bengio,et al.  Can Active Memory Replace Attention? , 2016, NIPS.

[371]  Iryna Gurevych,et al.  Still not there? Comparing Traditional Sequence-to-Sequence Models to Encoder-Decoder Neural Networks on Monotone String Translation Tasks , 2016, COLING.

[372]  Jiajun Zhang,et al.  Bridging Neural Machine Translation and Bilingual Dictionaries , 2016, ArXiv.

[373]  Pushpak Bhattacharyya,et al.  Learning variable length units for SMT between related languages via Byte Pair Encoding , 2016, SWCN@EMNLP.

[374]  Graham Neubig,et al.  Lexicons and Minimum Risk Training for Neural Machine Translation: NAIST-CMU at WAT2016 , 2016, WAT@COLING.

[375]  Ole Winther,et al.  Neural Machine Translation with Characters and Hierarchical Encoding , 2016, ArXiv.

[376]  Bo Wang,et al.  SYSTRAN's Pure Neural Machine Translation Systems , 2016, ArXiv.

[377]  Min Zhang,et al.  Neural Machine Translation Advised by Statistical Machine Translation , 2016, AAAI.

[378]  Jan Niehues,et al.  Pre-Translation for Neural Machine Translation , 2016, COLING.

[379]  Sergio Gomez Colmenarejo,et al.  Hybrid computing using a neural network with dynamic external memory , 2016, Nature.

[380]  Jason Lee,et al.  Fully Character-Level Neural Machine Translation without Explicit Segmentation , 2016, TACL.

[381]  François Chollet,et al.  Xception: Deep Learning with Depthwise Separable Convolutions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[382]  Ashwin K. Vijayakumar,et al.  Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models , 2016, ArXiv.

[383]  Marcin Junczys-Dowmunt,et al.  Is Neural Machine Translation Ready for Deployment? A Case Study on 30 Translation Directions , 2016, IWSLT.

[384]  Graham Neubig,et al.  Learning to Translate in Real-time with Neural Machine Translation , 2016, EACL.

[385]  David Grangier,et al.  Vocabulary Selection Strategies for Neural Machine Translation , 2016, ArXiv.

[386]  Aaron C. Courville,et al.  Professor Forcing: A New Algorithm for Training Recurrent Networks , 2016, NIPS.

[387]  Graham Neubig,et al.  Controlling Output Length in Neural Encoder-Decoders , 2016, EMNLP.

[388]  Quoc V. Le,et al.  Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation , 2016, ArXiv.

[389]  Lei Yu,et al.  Online Segment to Segment Neural Transduction , 2016, EMNLP.

[390]  Weinan Zhang,et al.  SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient , 2016, AAAI.

[391]  Lemao Liu,et al.  Neural Machine Translation with Supervised Attention , 2016, COLING.

[392]  Rongrong Ji,et al.  Lattice-Based Recurrent Neural Network Encoders for Neural Machine Translation , 2016, AAAI.

[393]  Arianna Bisazza,et al.  Neural versus Phrase-Based Machine Translation Quality: a Case Study , 2016, EMNLP.

[394]  Karin M. Verspoor,et al.  Findings of the 2016 Conference on Machine Translation , 2016, WMT.

[395]  Khalil Sima'an,et al.  A Shared Task on Multimodal Machine Translation and Crosslingual Image Description , 2016, WMT.

[396]  Jean Oh,et al.  Attention-based Multimodal Neural Machine Translation , 2016, WMT.

[397]  Philipp Koehn,et al.  Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , 2016.

[398]  Kyunghyun Cho,et al.  Neural Machine Translation , 2016, ACL.

[399]  Zhiguo Wang,et al.  Supervised Attentions for Neural Machine Translation , 2016, EMNLP.

[400]  Geoffrey E. Hinton,et al.  Layer Normalization , 2016, ArXiv.

[401]  Alexander J. Smola,et al.  Neural Machine Translation with Recurrent Attention Modeling , 2016, EACL.

[402]  Jiajun Zhang,et al.  Towards Zero Unknown Word in Neural Machine Translation , 2016, IJCAI.

[403]  Wenhu Chen,et al.  Guided Alignment Training for Topic-Aware Neural Machine Translation , 2016, AMTA.

[404]  Yoshua Bengio,et al.  Context-dependent word representation for neural machine translation , 2016, Comput. Speech Lang..

[405]  Alexander M. Rush,et al.  Sequence-Level Knowledge Distillation , 2016, EMNLP.

[406]  Maosong Sun,et al.  Semi-Supervised Learning for Neural Machine Translation , 2016, ACL.

[407]  Yaser Al-Onaizan,et al.  Zero-Resource Translation with Multi-Lingual Neural Machine Translation , 2016, EMNLP.

[408]  Zachary Chase Lipton The mythos of model interpretability , 2016, ACM Queue.

[409]  Sunita Sarawagi,et al.  Length bias in Encoder Decoder Models and a Case for Global Conditioning , 2016, EMNLP.

[410]  Rico Sennrich,et al.  Edinburgh Neural Machine Translation Systems for WMT 16 , 2016, WMT.

[411]  Rico Sennrich,et al.  Linguistic Input Features Improve Neural Machine Translation , 2016, WMT.

[412]  Kyunghyun Cho,et al.  Can neural machine translation do simultaneous translation? , 2016, ArXiv.

[413]  Satoshi Nakamura,et al.  Incorporating Discrete Translation Lexicons into Neural Machine Translation , 2016, EMNLP.

[414]  Bill Byrne,et al.  The Edit Distance Transducer in Action: The University of Cambridge English-German System at WMT16 , 2016, WMT.

[415]  Yaohua Tang,et al.  Neural Machine Translation with External Phrase Memory , 2016, ArXiv.

[416]  Jakob Uszkoreit,et al.  A Decomposable Attention Model for Natural Language Inference , 2016, EMNLP.

[417]  Alexander M. Rush,et al.  Sequence-to-Sequence Learning as Beam-Search Optimization , 2016, EMNLP.

[418]  Lemao Liu,et al.  Agreement on Target-bidirectional Neural Machine Translation , 2016, NAACL.

[419]  Qun Liu,et al.  Memory-enhanced Decoder for Neural Machine Translation , 2016, EMNLP.

[420]  Christopher D. Manning,et al.  Compression of Neural Machine Translation Models via Pruning , 2016, CoNLL.

[421]  Ted Briscoe,et al.  Grammatical error correction using neural machine translation , 2016, NAACL.

[422]  Matthias Sperber,et al.  Lecture Translator - Speech translation framework for simultaneous lecture translation , 2016, NAACL.

[423]  David Chiang,et al.  An Attentional Model for Speech Translation Without Transcription , 2016, NAACL.

[424]  Yang Liu,et al.  Learning Natural Language Inference using Bidirectional LSTM model and Inner-Attention , 2016, ArXiv.

[425]  Min Zhang,et al.  Variational Neural Machine Translation , 2016, EMNLP.

[426]  Pascal Vincent,et al.  Hierarchical Memory Networks , 2016, ArXiv.

[427]  Rico Sennrich,et al.  The AMU-UEDIN Submission to the WMT16 News Translation Task: Attention-based NMT Models as Feature Functions in Phrase-based SMT , 2016, WMT.

[428]  Kyunghyun Cho,et al.  Noisy Parallel Approximate Decoding for Conditional Recurrent Language Model , 2016, ArXiv.

[429]  Zhiguo Wang,et al.  Vocabulary Manipulation for Neural Machine Translation , 2016, ACL.

[430]  Zhiguo Wang,et al.  Coverage Embedding Models for Neural Machine Translation , 2016, EMNLP.

[431]  Bill Byrne,et al.  Syntactically Guided Neural Machine Translation , 2016, ACL.

[432]  Deniz Yuret,et al.  Transfer Learning for Low-Resource Neural Machine Translation , 2016, EMNLP.

[433]  Christopher D. Manning,et al.  Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models , 2016, ACL.

[434]  Manaal Faruqui,et al.  Cross-lingual Models of Word Embeddings: An Empirical Comparison , 2016, ACL.

[435]  Bowen Zhou,et al.  Pointing the Unknown Words , 2016, ACL.

[436]  Tara N. Sainath,et al.  Learning compact recurrent neural networks , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[437]  Ian McGraw,et al.  On the compression of recurrent neural networks with an application to LVCSR acoustic modeling for embedded speech recognition , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[438]  Yoshua Bengio,et al.  A Character-level Decoder without Explicit Segmentation for Neural Machine Translation , 2016, ACL.

[439]  Simon Osindero,et al.  Recursive Recurrent Nets with Attention Modeling for OCR in the Wild , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[440]  José A. R. Fonollosa,et al.  Character-based Neural Machine Translation , 2016, ACL.

[441]  Yoshimasa Tsuruoka,et al.  Tree-to-Sequence Attentional Neural Machine Translation , 2016, ACL.

[442]  Marco Tulio Ribeiro,et al.  “Why Should I Trust You?”: Explaining the Predictions of Any Classifier , 2016, NAACL.

[443]  Hua Wu,et al.  Improved Neural Machine Translation with SMT Features , 2016, AAAI.

[444]  Noah A. Smith,et al.  Recurrent Neural Network Grammars , 2016, NAACL.

[445]  Koray Kavukcuoglu,et al.  Pixel Recurrent Neural Networks , 2016, ICML.

[446]  Mirella Lapata,et al.  Long Short-Term Memory-Networks for Machine Reading , 2016, EMNLP.

[447]  Yang Liu,et al.  Modeling Coverage for Neural Machine Translation , 2016, ACL.

[448]  Stefan Riezler,et al.  Multimodal Pivots for Image Caption Translation , 2016, ACL.

[449]  Shi Feng,et al.  Implicit Distortion and Fertility Models for Attention-based Encoder-Decoder NMT Model , 2016, ArXiv.

[450]  Gholamreza Haffari,et al.  Incorporating Structural Alignment Biases into an Attentional Neural Translation Model , 2016, NAACL.

[451]  Mirella Lapata,et al.  Language to Logical Form with Neural Attention , 2016, ACL.

[452]  Yoshua Bengio,et al.  Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism , 2016, NAACL.

[453]  Kevin Knight,et al.  Multi-Source Neural Translation , 2016, NAACL.

[454]  Daniel Jurafsky,et al.  Mutual Information and Diverse Decoding Improve Neural Machine Translation , 2016, ArXiv.

[455]  Rui Yan,et al.  Natural Language Inference by Tree-Based Convolution and Heuristic Matching , 2015, ACL.

[456]  Jian Cheng,et al.  Quantized Convolutional Neural Networks for Mobile Devices , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[457]  Yang Liu,et al.  Agreement-Based Joint Training for Bidirectional Attention-Based Neural Machine Translation , 2015, IJCAI.

[458]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[459]  Yang Liu,et al.  Minimum Risk Training for Neural Machine Translation , 2015, ACL.

[460]  Sergey Ioffe,et al.  Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[461]  William Lewis,et al.  Skype Translator: Breaking down language and hearing barriers. A behind the scenes look at near real-time speech translation , 2015, TC.

[462]  Rico Sennrich,et al.  Improving Neural Machine Translation Models with Monolingual Data , 2015, ACL.

[463]  S. Chopra,et al.  Sequence Level Training with Recurrent Neural Networks , 2015, ICLR.

[464]  Marcin Andrychowicz,et al.  Neural Random Access Machines , 2015, ERCIM News.

[465]  Quoc V. Le,et al.  Multi-task Sequence to Sequence Learning , 2015, ICLR.

[466]  Alexander J. Smola,et al.  Stacked Attention Networks for Image Question Answering , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[467]  Oriol Vinyals,et al.  Neural machine translation systems with rare word processing , 2015.

[468]  Satoshi Nakamura,et al.  Neural Reranking Improves Subjective Quality of Machine Translation: NAIST at WAT2015 , 2015, WAT.

[469]  Desmond Elliott,et al.  Multilingual Image Description with Neural Sequence Models , 2015, ArXiv.

[470]  Jianfeng Gao,et al.  A Diversity-Promoting Objective Function for Neural Conversation Models , 2015, NAACL.

[471]  Yoav Goldberg,et al.  A Primer on Neural Network Models for Natural Language Processing , 2015, J. Artif. Intell. Res..

[472]  Alexander M. Rush,et al.  A Neural Attention Model for Abstractive Sentence Summarization , 2015, EMNLP.

[473]  Yoshua Bengio,et al.  Montreal Neural Machine Translation Systems for WMT’15 , 2015, WMT@EMNLP.

[474]  John DeNero,et al.  Variable-Length Word Encodings for Neural Translation Models , 2015, EMNLP.

[475]  Alexandra Birch,et al.  Neural Machine Translation of Rare Words with Subword Units , 2015, ACL.

[476]  Alexander M. Rush,et al.  Character-Aware Neural Language Models , 2015, AAAI.

[477]  Christopher D. Manning,et al.  Effective Approaches to Attention-based Neural Machine Translation , 2015, EMNLP.

[478]  Quoc V. Le,et al.  Listen, attend and spell: A neural network for large vocabulary conversational speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[479]  R. Venkatesh Babu,et al.  Data-free Parameter Pruning for Deep Neural Networks , 2015, BMVC.

[480]  Alexander Binder,et al.  On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation , 2015, PloS one.

[481]  Dianhai Yu,et al.  Multi-Task Learning for Multiple Language Translation , 2015, ACL.

[482]  Phil Blunsom,et al.  Teaching Machines to Read and Comprehend , 2015, NIPS.

[483]  Samy Bengio,et al.  Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks , 2015, NIPS.

[484]  Song Han,et al.  Learning both Weights and Connections for Efficient Neural Network , 2015, NIPS.

[485]  Phil Blunsom,et al.  Learning to Transduce with Unbounded Memory , 2015, NIPS.

[486]  Fei-Fei Li,et al.  Visualizing and Understanding Recurrent Networks , 2015, ArXiv.

[487]  Xinlei Chen,et al.  Visualizing and Understanding Neural Models in NLP , 2015, NAACL.

[488]  Jason Weston,et al.  End-To-End Memory Networks , 2015, NIPS.

[489]  Yoshua Bengio,et al.  On Using Monolingual Corpora in Neural Machine Translation , 2015, ArXiv.

[490]  Geoffrey E. Hinton,et al.  Distilling the Knowledge in a Neural Network , 2015, ArXiv.

[491]  Hang Li,et al.  Neural Responding Machine for Short-Text Conversation , 2015, ACL.

[492]  Ole Winther,et al.  Convolutional LSTM Networks for Subcellular Localization of Proteins , 2015, AlCoB.

[493]  Tomas Mikolov,et al.  Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets , 2015, NIPS.

[494]  Christopher Joseph Pal,et al.  Describing Videos by Exploiting Temporal Structure , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[495]  Christopher D. Manning,et al.  Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks , 2015, ACL.

[496]  Christian Szegedy,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[497]  Yoshua Bengio,et al.  Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.

[498]  Geoffrey E. Hinton,et al.  Grammar as a Foreign Language , 2014, NIPS.

[499]  Yoshua Bengio,et al.  On Using Very Large Target Vocabulary for Neural Machine Translation , 2014, ACL.

[500]  Yoshua Bengio,et al.  End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results , 2014, ArXiv.

[501]  Steve Renals,et al.  Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models , 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).

[502]  Quoc V. Le,et al.  Addressing the Rare Word Problem in Neural Machine Translation , 2014, ACL.

[503]  Chris Dyer Notes on Noise Contrastive Estimation and Negative Sampling , 2014, ArXiv.

[504]  Alex Graves,et al.  Neural Turing Machines , 2014, ArXiv.

[505]  Roi Livni,et al.  On the Computational Efficiency of Training Neural Networks , 2014, NIPS.

[506]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[507]  Quoc V. Le,et al.  Sequence to Sequence Learning with Neural Networks , 2014, NIPS.

[508]  Yoshua Bengio,et al.  Overcoming the Curse of Sentence Length for Neural Machine Translation using Automatic Segmentation , 2014, SSST@EMNLP.

[509]  Yoshua Bengio,et al.  On the Properties of Neural Machine Translation: Encoder–Decoder Approaches , 2014, SSST@EMNLP.

[510]  Yoshua Bengio,et al.  Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.

[511]  Yoon Kim,et al.  Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.

[512]  Cícero Nogueira dos Santos,et al.  Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts , 2014, COLING.

[513]  Alex Graves,et al.  Recurrent Models of Visual Attention , 2014, NIPS.

[514]  Aaron C. Courville,et al.  Generative adversarial networks , 2014, Commun. ACM.

[515]  Yoshua Bengio,et al.  Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.

[516]  Richard M. Schwartz,et al.  Fast and Robust Neural Network Joint Models for Statistical Machine Translation , 2014, ACL.

[517]  Yongqiang Wang,et al.  Efficient lattice rescoring using recurrent neural network language models , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[518]  Phil Blunsom,et al.  A Convolutional Neural Network for Modelling Sentences , 2014, ACL.

[519]  Joan Bruna,et al.  Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation , 2014, NIPS.

[520]  Yoshua Bengio,et al.  An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks , 2013, ICLR.

[521]  Koray Kavukcuoglu,et al.  Learning word embeddings efficiently with noise-contrastive estimation , 2013, NIPS.

[522]  Phil Blunsom,et al.  Recurrent Continuous Translation Models , 2013, EMNLP.

[523]  Peng Li,et al.  Recursive Autoencoders for ITG-Based Translation , 2013, EMNLP.

[524]  Gregory Shakhnarovich,et al.  A Systematic Exploration of Diversity in Machine Translation , 2013, EMNLP.

[525]  T. Kathirvalavakumar,et al.  Pruning algorithms of neural networks — a comparative study , 2013, Central European Journal of Computer Science.

[526]  Quoc V. Le,et al.  Exploiting Similarities among Languages for Machine Translation , 2013, ArXiv.

[527]  Ming Zhou,et al.  Machine Translation Detection from Monolingual Web-Text , 2013, ACL.

[528]  Ming Zhou,et al.  Bilingual Data Cleaning for SMT using Graph-based Random Walk , 2013, ACL.

[529]  Philipp Koehn,et al.  Scalable Modified Kneser-Ney Language Model Estimation , 2013, ACL.

[530]  Nando de Freitas,et al.  Predicting Parameters in Deep Learning , 2013, NIPS.

[531]  Graeme W. Blackwood,et al.  N-gram posterior probability confidence measures for statistical machine translation: an empirical study , 2013, Machine Translation.

[532]  Yoshua Bengio,et al.  Maxout Networks , 2013, ICML.

[533]  Alexander H. Waibel,et al.  Training speech translation from audio recordings of interpreter-mediated communication , 2013, Comput. Speech Lang..

[534]  J. Dean,et al.  Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[535]  Matthew D. Zeiler ADADELTA: An Adaptive Learning Rate Method , 2012, ArXiv.

[536]  Holger Schwenk,et al.  Continuous Space Translation Models for Phrase-Based Statistical Machine Translation , 2012, COLING.

[537]  Razvan Pascanu,et al.  Theano: new features and speed improvements , 2012, ArXiv.

[538]  Razvan Pascanu,et al.  On the difficulty of training recurrent neural networks , 2012, ICML.

[539]  Petr Motlícek,et al.  Conversion of Recurrent Neural Network Language Models to Weighted Finite State Transducers for Automatic Speech Recognition , 2012, INTERSPEECH.

[540]  Yoshua Bengio,et al.  Better Mixing via Deep Representations , 2012, ICML.

[541]  Yee Whye Teh,et al.  A fast and simple algorithm for training neural probabilistic language models , 2012, ICML.

[542]  Alexandre Allauzen,et al.  Continuous Space Translation Models with Neural Networks , 2012, NAACL.

[543]  Mike Schuster,et al.  Japanese and Korean voice search , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[544]  Chris Quirk,et al.  MT Detection in Web-Scraped Parallel Corpora , 2011, MTSUMMIT.

[545]  Jianfeng Gao,et al.  Domain Adaptation via Pseudo In-Domain Data Selection , 2011, EMNLP.

[546]  Jeffrey Pennington,et al.  Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions , 2011, EMNLP.

[547]  Yaser Al-Onaizan,et al.  Goodness: A Method for Measuring Machine Translation Confidence , 2011, ACL.

[548]  Jason Weston,et al.  Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res..

[549]  Geoffrey E. Hinton,et al.  Learning to combine foveal glimpses with a third-order Boltzmann machine , 2010, NIPS.

[550]  Roland Kuhn,et al.  Discriminative Instance Weighting for Domain Adaptation in Statistical Machine Translation , 2010, EMNLP.

[551]  Oleksandr Makeyev,et al.  Neural network with ensembles , 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).

[552]  Jimmy J. Lin,et al.  Data-Intensive Text Processing with MapReduce , 2010, Morgan & Claypool.

[553]  Aapo Hyvärinen,et al.  Noise-contrastive estimation: A new estimation principle for unnormalized statistical models , 2010, AISTATS.

[554]  Lior Rokach,et al.  Ensemble-based classifiers , 2010, Artificial Intelligence Review.

[555]  Alexander H. Waibel,et al.  Automatic translation from parallel speech: Simultaneous interpretation as MT training data , 2009, 2009 IEEE Workshop on Automatic Speech Recognition & Understanding.

[556]  Jason Weston,et al.  Curriculum learning , 2009, ICML '09.

[557]  Shankar Kumar,et al.  Lattice Minimum Bayes-Risk Decoding for Statistical Machine Translation , 2008, EMNLP.

[558]  Jason Weston,et al.  A unified architecture for natural language processing: deep neural networks with multitask learning , 2008, ICML '08.

[559]  Matsuo Bashō  Basho: The Complete Haiku , 2008 .

[560]  Alexander H. Waibel,et al.  Simultaneous translation of lectures and speeches , 2007, Machine Translation.

[561]  David Chiang,et al.  Hierarchical Phrase-Based Translation , 2007, CL.

[562]  Philipp Koehn,et al.  Factored Translation Models , 2007, EMNLP.

[563]  Rich Caruana,et al.  Model compression , 2006, KDD '06.

[564]  Holger Schwenk,et al.  Continuous Space Language Models for Statistical Machine Translation , 2006, ACL.

[565]  Hermann Ney,et al.  Word-Level Confidence Estimation for Machine Translation using Phrase-Based Translation Models , 2005, HLT.

[566]  Philipp Koehn,et al.  Pharaoh: A Beam Search Decoder for Phrase-Based Statistical Machine Translation Models , 2004, AMTA.

[567]  Shankar Kumar,et al.  Minimum Bayes-Risk Decoding for Statistical Machine Translation , 2004, NAACL.

[568]  Noah A. Smith,et al.  The Web as a Parallel Corpus , 2003, CL.

[569]  Franz Josef Och,et al.  Minimum Error Rate Training in Statistical Machine Translation , 2003, ACL.

[570]  Yoshua Bengio,et al.  A Neural Probabilistic Language Model , 2003, J. Mach. Learn. Res..

[571]  Salim Roukos,et al.  Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.

[572]  Vaibhava Goel,et al.  Segmental minimum Bayes-risk ASR voting strategies , 2000, INTERSPEECH.

[573]  Thomas G. Dietterich Multiple Classifier Systems , 2000, Lecture Notes in Computer Science.

[574]  Philip Resnik,et al.  Mining the Web for Bilingual Text , 1999, ACL.

[575]  R. French Catastrophic forgetting in connectionist networks , 1999, Trends in Cognitive Sciences.

[576]  S. Hochreiter,et al.  Long Short-Term Memory , 1997, Neural Computation.

[577]  Kuldip K. Paliwal,et al.  Bidirectional recurrent neural networks , 1997, IEEE Trans. Signal Process..

[578]  Jerome R. Bellegarda,et al.  A latent semantic analysis framework for large-Span language modeling , 1997, EUROSPEECH.

[579]  Dekai Wu,et al.  Stochastic Inversion Transduction Grammars and Bilingual Parsing of Parallel Corpora , 1997, CL.

[580]  Mikel L. Forcada,et al.  Asynchronous translations with recurrent neural nets , 1997, Proceedings of International Conference on Neural Networks (ICNN'97).

[581]  Hermann Ney,et al.  HMM-Based Word Alignment in Statistical Translation , 1996, COLING.

[582]  Philip Gage,et al.  A new algorithm for data compression , 1994 .

[583]  W. Byrne,et al.  Generalization and maximum likelihood from small data sets , 1993, Neural Networks for Signal Processing III - Proceedings of the 1993 IEEE-SP Workshop.

[584]  Robert L. Mercer,et al.  The Mathematics of Statistical Machine Translation: Parameter Estimation , 1993, CL.

[585]  Babak Hassibi,et al.  Second Order Derivatives for Network Pruning: Optimal Brain Surgeon , 1992, NIPS.

[586]  Hava T. Siegelmann,et al.  On the computational power of neural nets , 1992, COLT '92.

[587]  Mark Jurik,et al.  Neurocomputing: Foundations of research , 1992 .

[588]  Jordan B. Pollack,et al.  Recursive Distributed Representations , 1990, Artif. Intell..

[589]  Lawrence D. Jackel,et al.  Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.

[590]  Ronald J. Williams,et al.  A Learning Algorithm for Continually Running Fully Recurrent Neural Networks , 1989, Neural Computation.

[591]  Geoffrey E. Hinton,et al.  Phoneme recognition using time-delay neural networks , 1989, IEEE Trans. Acoust. Speech Signal Process..

[592]  Geoffrey E. Hinton,et al.  Learning representations by back-propagating errors , 1986, Nature.

[593]  G. Seth Psychology of Language , 1968, Nature.

[594]  Ilya Sutskever,et al.  Language Models are Unsupervised Multitask Learners , 2019 .

[595]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[596]  Adrià de Gispert,et al.  CUED@WMT19:EWC&LMs , 2019, WMT.

[597]  Chenhui Chu,et al.  A Comprehensive Empirical Comparison of Domain Adaptation Methods for Neural Machine Translation , 2018, J. Inf. Process..

[598]  Wei Wu,et al.  Phrase-level Self-Attention Networks for Universal Sentence Encoding , 2018, EMNLP.

[599]  Di He,et al.  Layer-Wise Coordination between Encoder and Decoder for Neural Machine Translation , 2018, NeurIPS.

[600]  Philipp Koehn,et al.  An Analysis of Source Context Dependency in Neural Machine Translation , 2018, EAMT.

[601]  Kenny Q. Zhu,et al.  Controlling Length in Abstractive Summarization Using a Convolutional Neural Network , 2018, EMNLP.

[602]  Hermann Ney,et al.  The RWTH Aachen University Filtering System for the WMT 2018 Parallel Corpus Filtering Task , 2018, WMT.

[603]  J. Crego,et al.  Analyzing Knowledge Distillation in Neural Machine Translation , 2018, IWSLT.

[604]  Tanja Schmidt,et al.  How to Move to Neural Machine Translation for Enterprise-Scale Programs - An Early Adoption Case Study , 2018, EAMT.

[605]  Pierrette Bouillon,et al.  Neural Machine Translation: A Comparison of MTH and DeepL at Swiss Post's Language Service , 2018 .

[606]  Praveen Dakwale,et al.  Fine-Tuning for Neural Machine Translation with Limited Degradation across In- and Out-of-Domain Data , 2017, MTSUMMIT.

[607]  Alexander M. Fraser,et al.  Target-side Word Segmentation Strategies for Neural Machine Translation , 2017, WMT.

[608]  Masashi Toyoda,et al.  A Bag of Useful Tricks for Practical Neural Machine Translation: Embedding Layer Initialization and Large Batch Size , 2017, WAT@IJCNLP.

[609]  Wei Chen,et al.  Sogou Neural Machine Translation Systems for WMT17 , 2017, WMT.

[610]  Mauro Cettolo,et al.  Overview of the IWSLT 2017 Evaluation Campaign , 2017, IWSLT.

[611]  Hermann Ney,et al.  Biasing Attention-Based Recurrent Neural Networks Using External Alignment Information , 2017, WMT.

[612]  Kenneth Heafield,et al.  Copied Monolingual Data Improves Low-Resource Neural Machine Translation , 2017, WMT.

[613]  Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data , 2017, Lecture Notes in Computer Science.

[614]  Fethi Bougares,et al.  Factored Neural Machine Translation Architectures , 2016, IWSLT.

[615]  Hans Uszkoreit,et al.  Deeper Machine Translation and Evaluation for German , 2016, DMTW.

[616]  Hermann Ney,et al.  Alignment-Based Neural Machine Translation , 2016, WMT.

[617]  Mark J. F. Gales,et al.  Sequence Student-Teacher Training of Deep Neural Networks , 2016, INTERSPEECH.

[618]  Hwidong Na,et al.  An Effective Diverse Decoding Scheme for Robust Synonymous Sentence Translation , 2016, AMTA.

[619]  He He,et al.  Interpretese vs. Translationese: The Uniqueness of Human Strategies in Simultaneous Interpretation , 2016, NAACL.

[620]  Guigang Zhang,et al.  Deep Learning , 2016, Int. J. Semantic Comput..

[621]  Dan Klein,et al.  When and why are log-linear models self-normalizing? , 2015, NAACL.

[622]  Christopher D. Manning,et al.  Stanford Neural Machine Translation Systems for Spoken Language Domains , 2015, IWSLT.

[623]  Tomoki Toda,et al.  Speed or accuracy? a study in evaluation of simultaneous speech translation , 2015, INTERSPEECH.

[624]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[625]  Jordan L. Boyd-Graber,et al.  Don't Until the Final Verb Wait: Reinforcement Learning for Simultaneous Machine Translation , 2014, EMNLP.

[626]  Yifan Gong,et al.  Restructuring of deep neural network acoustic models with singular value decomposition , 2013, INTERSPEECH.

[627]  Yoshua Bengio,et al.  Audio Chord Recognition with Recurrent Neural Networks , 2013, ISMIR.

[628]  Shahram Khadivi,et al.  Parallel Corpus Refinement as an Outlier Detection Algorithm , 2011, MTSUMMIT.

[629]  Lucia Specia,et al.  Exploiting Objective Annotations for Measuring Translation Post-editing Effort , 2011 .

[630]  Holger Schwenk,et al.  N-gram-based machine translation enhanced with neural networks , 2010, IWSLT.

[631]  Holger Schwenk,et al.  Investigations on large-scale lightly-supervised training for statistical machine translation , 2008, IWSLT.

[632]  Jane Reichhold  Basho: The Complete Haiku , 2008 .

[633]  Yoshua Bengio,et al.  Neural Probabilistic Language Models , 2006 .

[634]  Alex Waibel,et al.  Adaptation of the translation model for statistical machine translation based on information retrieval , 2005, EAMT.

[635]  Hermann Ney,et al.  Statistical multi-source translation , 2001, MTSUMMIT.

[636]  Yoshua Bengio,et al.  Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies , 2001 .

[637]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[638]  Yann LeCun,et al.  Optimal Brain Damage , 1989, NIPS.

[639]  Lawrence D. Jackel,et al.  Handwritten Digit Recognition with a Back-Propagation Network , 1989, NIPS.

[640]  Andy Davis,et al.  TensorFlow: A System for Large-Scale Machine Learning , 2016, OSDI.

[641]  Kenneth Heafield,et al.  Findings of the WMT 2018 Shared Task on Parallel Corpus Filtering , 2018, WMT.

[642]  Ondrej Bojar,et al.  Findings of the 2018 Conference on Machine Translation (WMT18) , 2018, WMT.