Di He | Tie-Yan Liu | Bin Dong | Liwei Wang | Yiping Lu | Tao Qin | Zhuohan Li | Zhiqing Sun
[1] Christopher Potts, et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, 2013, EMNLP.
[2] Sergey Ioffe, et al. Rethinking the Inception Architecture for Computer Vision, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[3] E Weinan, et al. A Proposal on Machine Learning via Dynamical Systems, 2017, Communications in Mathematics and Statistics.
[4] Yann Dauphin, et al. Pay Less Attention with Lightweight and Dynamic Convolutions, 2019, ICLR.
[5] Chris Callison-Burch, et al. Open Source Toolkit for Statistical Machine Translation: Factored Translation Models and Lattice Decoding, 2006.
[6] Samuel R. Bowman, et al. Neural Network Acceptability Judgments, 2018, Transactions of the Association for Computational Linguistics.
[7] Noah Constant, et al. Character-Level Language Modeling with Deeper Self-Attention, 2018, AAAI.
[8] G. Quispel, et al. Acta Numerica 2002: Splitting methods, 2002.
[9] Samuel R. Bowman, et al. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference, 2017, NAACL.
[10] 知秀 柴田. Understand It in 5 Minutes!? A Skim of Famous Papers: Jacob Devlin et al.: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2020.
[11] Bin Dong, et al. Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations, 2017, ICML.
[12] Hector J. Levesque, et al. The Winograd Schema Challenge, 2011, AAAI Spring Symposium: Logical Formalizations of Commonsense Reasoning.
[13] G. Strang. On the Construction and Comparison of Difference Schemes, 1968.
[14] Uri M. Ascher, et al. Computer methods for ordinary differential equations and differential-algebraic equations, 1998.
[15] M. Thorpe, et al. Deep limits of residual neural networks, 2018, Research in the Mathematical Sciences.
[16] Jian Zhang, et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text, 2016, EMNLP.
[17] Richard Socher, et al. Weighted Transformer Network for Machine Translation, 2017, arXiv.
[18] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[19] Yiming Yang, et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context, 2019, ACL.
[20] Chong Fu, et al. Convolutional neural networks combined with Runge–Kutta methods, 2018, Neural Comput. Appl.
[21] Alexander V. Bobylev, et al. The error of the splitting scheme for solving evolutionary equations, 2001, Appl. Math. Lett.
[22] Omer Levy, et al. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding, 2018, BlackboxNLP@EMNLP.
[23] Alec Radford, et al. Improving Language Understanding by Generative Pre-Training, 2018.
[24] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[25] B. Matthews. Comparison of the predicted and observed secondary structure of T4 phage lysozyme, 1975, Biochimica et Biophysica Acta.
[26] Ido Dagan, et al. The Third PASCAL Recognizing Textual Entailment Challenge, 2007, ACL-PASCAL@ACL.
[27] Myle Ott, et al. Understanding Back-Translation at Scale, 2018, EMNLP.
[28] Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.
[29] Noboru Murata, et al. Transport Analysis of Infinitely Deep Neural Network, 2016, J. Mach. Learn. Res.
[30] David Duvenaud, et al. Neural Ordinary Differential Equations, 2018, NeurIPS.
[31] Jian Sun, et al. Identity Mappings in Deep Residual Networks, 2016, ECCV.
[32] Yann Dauphin, et al. Convolutional Sequence to Sequence Learning, 2017, ICML.
[33] Ed H. Chi, et al. AntisymmetricRNN: A Dynamical System View on Recurrent Neural Networks, 2019, ICLR.
[34] Bin Dong, et al. Dynamically Unfolding Recurrent Restorer: A Moving Endpoint Control Method for Image Restoration, 2018, ICLR.
[35] Tomaso A. Poggio, et al. Bridging the Gaps Between Residual Learning, Recurrent Neural Networks and Visual Cortex, 2016, arXiv.
[36] Wotao Yin, et al. Splitting Methods in Communication, Imaging, Science, and Engineering, 2017.
[37] Peter Clark, et al. The Seventh PASCAL Recognizing Textual Entailment Challenge, 2011, TAC.
[38] Eldad Haber, et al. Stable architectures for deep neural networks, 2017, arXiv.
[39] Eneko Agirre, et al. SemEval-2017 Task 1: Semantic Textual Similarity Multilingual and Crosslingual Focused Evaluation, 2017, *SEMEVAL.
[40] Ashish Vaswani, et al. Self-Attention with Relative Position Representations, 2018, NAACL.
[41] Chris Brockett, et al. Automatically Constructing a Corpus of Sentential Paraphrases, 2005, IJCNLP.
[42] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[43] Philipp Koehn, et al. Moses: Open Source Toolkit for Statistical Machine Translation, 2007, ACL.
[44] J. Geiser. Decomposition Methods for Differential Equations: Theory and Applications, 2009.
[45] Tengyu Ma, et al. Fixup Initialization: Residual Learning Without Normalization, 2019, ICLR.
[46] Lukasz Kaiser, et al. Universal Transformers, 2018, ICLR.
[47] Ido Dagan, et al. The Sixth PASCAL Recognizing Textual Entailment Challenge, 2009, TAC.
[48] Myle Ott, et al. Scaling Neural Machine Translation, 2018, WMT.
[49] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.
[50] Luke S. Zettlemoyer, et al. Deep Contextualized Word Representations, 2018, NAACL.
[51] Sanja Fidler, et al. Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[52] Geoffrey E. Hinton, et al. Layer Normalization, 2016, arXiv.
[53] Wei Liu, et al. Nonlocal Neural Networks, Nonlocal Diffusion and Nonlocal Modeling, 2018, NeurIPS.
[54] H. P. An Introduction to Celestial Mechanics, 1914, Nature.