[1] J. P. Lewis, et al. Fast Template Matching, 2009.
[2] Yejin Choi, et al. Deep Communicating Agents for Abstractive Summarization, 2018, NAACL.
[3] Alexei Baevski, et al. Adaptive Input Representations for Neural Language Modeling, 2018, ICLR.
[4] Richard Socher, et al. Weighted Transformer Network for Machine Translation, 2017, ArXiv.
[5] Nicolas Usunier, et al. Improving Neural Language Models with a Continuous Cache, 2016, ICLR.
[6] Quoc V. Le, et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.
[7] Vladlen Koltun, et al. Multi-Scale Context Aggregation by Dilated Convolutions, 2015, ICLR.
[8] Alex Graves, et al. Neural Machine Translation in Linear Time, 2016, ArXiv.
[9] Richard Socher, et al. A Deep Reinforced Model for Abstractive Summarization, 2017, ICLR.
[10] Paul A. Viola, et al. Robust Real-Time Object Detection, 2001.
[11] Paul A. Viola, et al. Robust Real-Time Face Detection, 2001, International Journal of Computer Vision.
[12] Deyi Xiong, et al. Accelerating Neural Transformer via an Average Attention Network, 2018, ACL.
[13] Frank Hutter, et al. SGDR: Stochastic Gradient Descent with Warm Restarts, 2016, ICLR.
[14] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[15] Nitish Srivastava, et al. Dropout: A Simple Way to Prevent Neural Networks from Overfitting, 2014, J. Mach. Learn. Res.
[16] Noah A. Smith, et al. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2016, ACL.
[17] Mirella Lapata, et al. Long Short-Term Memory-Networks for Machine Reading, 2016, EMNLP.
[18] Franklin C. Crow, et al. Summed-area tables for texture mapping, 1984, SIGGRAPH.
[19] Thomas Brox, et al. Striving for Simplicity: The All Convolutional Net, 2014, ICLR.
[20] Jianfeng Gao, et al. A Persona-Based Neural Conversation Model, 2016, ACL.
[21] Ankur Bapna, et al. The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation, 2018, ACL.
[22] Alexander M. Rush, et al. Latent Alignment and Variational Attention, 2018, NeurIPS.
[23] Chin-Yew Lin, et al. ROUGE: A Package for Automatic Evaluation of Summaries, 2004, ACL.
[24] Lukasz Kaiser, et al. Reformer: The Efficient Transformer, 2020, ICLR.
[25] Phil Blunsom, et al. Teaching Machines to Read and Comprehend, 2015, NIPS.
[26] Richard Socher, et al. Pointer Sentinel Mixture Models, 2016, ICLR.
[27] Myle Ott, et al. Scaling Neural Machine Translation, 2018, WMT.
[28] Xuanjing Huang, et al. Cached Long Short-Term Memory Neural Networks for Document-Level Sentiment Classification, 2016, EMNLP.
[29] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[30] Mahmoud Nabil, et al. CUFE at SemEval-2016 Task 4: A Gated Recurrent Model for Sentiment Classification, 2016, SemEval.
[31] Hao Wu, et al. Mixed Precision Training, 2017, ICLR.
[32] Amanda Stent, et al. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 3 (Industry Papers), 2018, NAACL.
[33] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[34] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.
[35] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.
[36] Ashish Vaswani, et al. Self-Attention with Relative Position Representations, 2018, NAACL.
[37] Lukasz Kaiser, et al. Attention Is All You Need, 2017, NIPS.
[38] Rico Sennrich, et al. Why Self-Attention? A Targeted Evaluation of Neural Machine Translation Architectures, 2018, EMNLP.
[39] Sanjeev Saxena, et al. On Parallel Prefix Computation, 1994, Parallel Process. Lett.
[40] Christof Monz, et al. Recurrent Memory Networks for Language Modeling, 2016, NAACL.
[41] Peter Dayan, et al. Fast Parametric Learning with Activation Memorization, 2018, ICML.
[42] Szymon Rusinkiewicz, et al. Accelerating Large-Kernel Convolution Using Summed-Area Tables, 2019, ArXiv.
[43] Bowen Zhou, et al. Abstractive Text Summarization using Sequence-to-Sequence RNNs and Beyond, 2016, CoNLL.
[44] Myle Ott, et al. fairseq: A Fast, Extensible Toolkit for Sequence Modeling, 2019, NAACL.
[45] Victor S. Lempitsky, et al. Deep Neural Networks with Box Convolutions, 2018, NeurIPS.
[46] Yann Dauphin, et al. Convolutional Sequence to Sequence Learning, 2017, ICML.
[47] Hermann Ney, et al. LSTM Neural Networks for Language Modeling, 2012, INTERSPEECH.
[48] S. C. Kremer, et al. Gradient Flow in Recurrent Nets: The Difficulty of Learning Long-Term Dependencies, 2001.
[49] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[50] Quan Pan, et al. A Generative Model for Category Text Generation, 2018, Inf. Sci.
[51] Tomohide Shibata. Understand It in 5 Minutes!? Skimming Famous Papers: Jacob Devlin et al.: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2020.
[52] Nitish Srivastava, et al. Improving neural networks by preventing co-adaptation of feature detectors, 2012, ArXiv.
[53] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, CVPR.
[54] Chengqi Zhang, et al. Bi-Directional Block Self-Attention for Fast and Memory-Efficient Sequence Modeling, 2018, ICLR.
[55] Simon Haykin, et al. Gradient-Based Learning Applied to Document Recognition, 2001.
[56] Yann Dauphin, et al. Pay Less Attention with Lightweight and Dynamic Convolutions, 2019, ICLR.
[57] Guillaume Lample, et al. Neural Architectures for Named Entity Recognition, 2016, NAACL.
[58] Yann Dauphin, et al. Language Modeling with Gated Convolutional Networks, 2016, ICML.
[59] Eric P. Xing, et al. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2014, ACL.
[60] Yoon Kim, et al. Convolutional Neural Networks for Sentence Classification, 2014, EMNLP.
[61] Angela Fan, et al. Controllable Abstractive Summarization, 2017, NMT@ACL.
[62] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[63] Ruslan Salakhutdinov, et al. Revisiting LSTM Networks for Semi-Supervised Text Classification via Mixed Objective Function, 2019, AAAI.
[64] Quoc V. Le, et al. Searching for Activation Functions, 2018, ArXiv.
[65] Phil Blunsom, et al. A Convolutional Neural Network for Modelling Sentences, 2014, ACL.
[66] Richard Socher, et al. An Analysis of Neural Language Modeling at Multiple Scales, 2018, ArXiv.
[67] Orhan Firat, et al. Massively Multilingual Neural Machine Translation, 2019, NAACL.
[68] Yiming Yang, et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context, 2019, ACL.
[69] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[70] Quoc V. Le, et al. Massive Exploration of Neural Machine Translation Architectures, 2017, EMNLP.
[71] George Kurian, et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, 2016, ArXiv.
[72] Sergey Ioffe, et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning, 2016, AAAI.