Yann Dauphin | Felix Wu | Alexei Baevski | Angela Fan | Michael Auli
[1] Bowen Zhou, et al. Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond, 2016, CoNLL.
[2] Lijun Wu, et al. Achieving Human Parity on Automatic Chinese to English News Translation, 2018, ArXiv.
[3] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.
[4] Jason Lee, et al. Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement, 2018, EMNLP.
[5] Chengqi Zhang, et al. Fast Directional Self-Attention Mechanism, 2018, ArXiv.
[6] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[7] Ming Yang, et al. DeepFace: Closing the Gap to Human-Level Performance in Face Verification, 2014, CVPR.
[8] François Chollet, et al. Xception: Deep Learning with Depthwise Separable Convolutions, 2016, CVPR.
[9] Yann Dauphin, et al. Convolutional Sequence to Sequence Learning, 2017, ICML.
[10] Yann Dauphin, et al. A Convolutional Encoder Model for Neural Machine Translation, 2016, ACL.
[11] Di He, et al. Non-Autoregressive Neural Machine Translation with Enhanced Decoder Input, 2018, AAAI.
[12] Sergey Ioffe, et al. Rethinking the Inception Architecture for Computer Vision, 2015, CVPR.
[13] Frank Hutter, et al. SGDR: Stochastic Gradient Descent with Warm Restarts, 2016, ICLR.
[14] Christopher D. Manning, et al. Get To The Point: Summarization with Pointer-Generator Networks, 2017, ACL.
[15] Lawrence Carin, et al. Learning Context-Sensitive Convolutional Filters for Text Processing, 2017.
[16] Yejin Choi, et al. Deep Communicating Agents for Abstractive Summarization, 2018, NAACL.
[17] Victor O. K. Li, et al. Non-Autoregressive Neural Machine Translation, 2017, ICLR.
[18] Richard Socher, et al. Weighted Transformer Network for Machine Translation, 2017, ArXiv.
[19] Marc'Aurelio Ranzato, et al. Classical Structured Prediction Losses for Sequence to Sequence Learning, 2017, NAACL.
[20] Myle Ott, et al. Scaling Neural Machine Translation, 2018, WMT.
[21] Aurko Roy, et al. Fast Decoding in Sequence Models using Discrete Latent Variables, 2018, ICML.
[22] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.
[23] Deyi Xiong, et al. Accelerating Neural Transformer via an Average Attention Network, 2018, ACL.
[24] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[25] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, CVPR.
[26] Chengqi Zhang, et al. Bi-Directional Block Self-Attention for Fast and Memory-Efficient Sequence Modeling, 2018, ICLR.
[27] Lukasz Kaiser, et al. Generating Wikipedia by Summarizing Long Sequences, 2018, ICLR.
[28] Ashish Vaswani, et al. Self-Attention with Relative Position Representations, 2018, NAACL.
[29] Thorsten Brants, et al. One billion word benchmark for measuring progress in statistical language modeling, 2013, INTERSPEECH.
[30] Laurent Besacier, et al. Pervasive Attention: 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction, 2018, CoNLL.
[31] Yonghui Wu, et al. Exploring the Limits of Language Modeling, 2016, ArXiv.
[32] Tara N. Sainath, et al. Locally-connected and convolutional neural networks for small footprint speaker recognition, 2015, INTERSPEECH.
[33] Lukasz Kaiser, et al. Depthwise Separable Convolutions for Neural Machine Translation, 2017, ICLR.
[34] Xuanjing Huang, et al. Convolutional Interaction Network for Natural Language Inference, 2018, EMNLP.
[35] Quoc V. Le, et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.
[36] Ankur Bapna, et al. The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation, 2018, ACL.
[37] Chin-Yew Lin, et al. ROUGE: A Package for Automatic Evaluation of Summaries, 2004, ACL.
[38] Phil Blunsom, et al. Teaching Machines to Read and Comprehend, 2015, NIPS.
[39] Moustapha Cissé, et al. Efficient softmax approximation for GPUs, 2016, ICML.
[40] Jason Weston, et al. End-To-End Memory Networks, 2015, NIPS.
[41] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[42] Zhengyang Wang, et al. Smoothed dilated convolutions for improved dense prediction, 2018, Data Mining and Knowledge Discovery.
[43] Yoshua Bengio, et al. Attention-Based Models for Speech Recognition, 2015, NIPS.
[44] Razvan Pascanu, et al. On the difficulty of training recurrent neural networks, 2012, ICML.
[45] Geoffrey E. Hinton, et al. Regularizing Neural Networks by Penalizing Confident Output Distributions, 2017, ICLR.
[46] Rico Sennrich, et al. Why Self-Attention? A Targeted Evaluation of Neural Machine Translation Architectures, 2018, EMNLP.
[47] Geoffrey E. Hinton, et al. On the importance of initialization and momentum in deep learning, 2013, ICML.
[48] Geoffrey E. Hinton, et al. Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer, 2017, ICLR.
[49] Di He, et al. Hint-based Training for Non-Autoregressive Translation, 2018.
[50] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[51] Lior Wolf, et al. Using the Output Embedding to Improve Language Models, 2016, EACL.
[52] Yann Dauphin, et al. Language Modeling with Gated Convolutional Networks, 2016, ICML.
[53] Christopher D. Manning, et al. Effective Approaches to Attention-based Neural Machine Translation, 2015, EMNLP.
[54] Richard Socher, et al. A Deep Reinforced Model for Abstractive Summarization, 2017, ICLR.
[55] Yann LeCun, et al. Regularization of Neural Networks using DropConnect, 2013, ICML.
[56] Tao Shen, et al. DiSAN: Directional Self-Attention Network for RNN/CNN-free Language Understanding, 2017, AAAI.
[57] Alexander M. Rush, et al. Latent Alignment and Variational Attention, 2018, NeurIPS.