[1] Yann Dauphin, et al. Convolutional Sequence to Sequence Learning, 2017, ICML.
[2] Alexei Baevski, et al. Adaptive Input Representations for Neural Language Modeling, 2018, ICLR.
[3] Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.
[4] Kilian Q. Weinberger, et al. Multi-Scale Dense Networks for Resource Efficient Image Classification, 2017, ICLR.
[5] Marc'Aurelio Ranzato, et al. Classical Structured Prediction Losses for Sequence to Sequence Learning, 2017, NAACL.
[6] Yejin Choi, et al. Deep Communicating Agents for Abstractive Summarization, 2018, NAACL.
[7] Xin Wang, et al. SkipNet: Learning Dynamic Routing in Convolutional Networks, 2017, ECCV.
[8] Dmitry P. Vetrov, et al. Probabilistic Adaptive Computation Time, 2017, Bulletin of the Polish Academy of Sciences: Technical Sciences.
[9] Alec Radford, et al. Improving Language Understanding by Generative Pre-Training, 2018.
[10] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[11] Myle Ott, et al. Facebook FAIR's WMT19 News Translation Task Submission, 2019, WMT.
[12] Venkatesh Saligrama, et al. Adaptive Neural Networks for Efficient Inference, 2017, ICML.
[13] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[14] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.
[15] Marcello Federico, et al. Report on the 11th IWSLT Evaluation Campaign, 2014, IWSLT.
[16] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[17] H. T. Kung, et al. BranchyNet: Fast Inference via Early Exiting from Deep Neural Networks, 2016, 23rd International Conference on Pattern Recognition (ICPR).
[18] Myle Ott, et al. fairseq: A Fast, Extensible Toolkit for Sequence Modeling, 2019, NAACL.
[19] Alex Graves, et al. Adaptive Computation Time for Recurrent Neural Networks, 2016, ArXiv.
[20] Yann Dauphin, et al. Pay Less Attention with Lightweight and Dynamic Convolutions, 2019, ICLR.
[21] Lukasz Kaiser, et al. Universal Transformers, 2018, ICLR.
[22] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.