Shengfeng Pan | Yunfeng Liu | Jianlin Su | Yu Lu | Bo Wen
[1] Christopher Potts, et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, 2013, EMNLP.
[2] Alec Radford, et al. Improving Language Understanding by Generative Pre-Training, 2018.
[3] Jianfeng Gao, et al. DeBERTa: Decoding-enhanced BERT with Disentangled Attention, 2020, ICLR.
[4] Fabrice Muhlenbach, et al. UdL at SemEval-2017 Task 1: Semantic Textual Similarity Estimation of English Sentence Pairs Using Regression Model over Pairwise Features, 2017, SemEval@ACL.
[5] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.
[6] Quoc V. Le, et al. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators, 2020, ICLR.
[7] Xianpei Han, et al. CAIL2019-SCM: A Dataset of Similar Case Matching in Legal Domain, 2019, ArXiv.
[8] David Duvenaud, et al. Neural Ordinary Differential Equations, 2018, NeurIPS.
[9] Lukasz Kaiser, et al. Rethinking Attention with Performers, 2020, ArXiv.
[10] Cho-Jui Hsieh, et al. Learning to Encode Position for Transformer with Continuous Dynamical Model, 2020, ICML.
[11] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[12] Nikolaos Pappas, et al. Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention, 2020, ICML.
[13] Shuai Yi, et al. Efficient Attention: Attention with Linear Complexities, 2021, IEEE Winter Conference on Applications of Computer Vision (WACV).
[14] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[15] Max Horn, et al. Translational Equivariance in Kernelizable Attention, 2021, ArXiv.
[16] Jakob Grue Simonsen, et al. Encoding word order in complex embeddings, 2019, ICLR.
[17] Antoine Liutkus, et al. Relative Positional Encoding for Transformers with Linear Complexity, 2021, ICML.
[18] Yiming Yang, et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context, 2019, ACL.
[19] Kevin Gimpel, et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations, 2019, ICLR.
[20] Xiaozhe Ren, et al. NEZHA: Neural Contextualized Representation for Chinese Language Understanding, 2019, ArXiv.
[21] Samuel R. Bowman, et al. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference, 2017, NAACL.
[22] Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.
[23] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[24] Myle Ott, et al. fairseq: A Fast, Extensible Toolkit for Sequence Modeling, 2019, NAACL.
[25] Sanja Fidler, et al. Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books, 2015, IEEE International Conference on Computer Vision (ICCV).
[26] Ankit Singh Rawat, et al. Are Transformers universal approximators of sequence-to-sequence functions?, 2020, ICLR.
[27] Colin Raffel, et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, 2019, J. Mach. Learn. Res.
[28] Jian Zhang, et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text, 2016, EMNLP.
[29] Tie-Yan Liu, et al. Rethinking Positional Encoding in Language Pre-training, 2020, ICLR.
[30] Yiming Yang, et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding, 2019, NeurIPS.
[31] Yann Dauphin, et al. Convolutional Sequence to Sequence Learning, 2017, ICML.
[32] Chris Brockett, et al. Automatically Constructing a Corpus of Sentential Paraphrases, 2005, IJCNLP.
[33] Frank Hutter, et al. Decoupled Weight Decay Regularization, 2017, ICLR.
[34] Davis Liang, et al. Improve Transformer Models with Better Relative Position Embeddings, 2020, Findings of EMNLP.
[35] Sen Jia, et al. How Much Position Information Do Convolutional Neural Networks Encode?, 2020, ICLR.
[36] Jakob Uszkoreit, et al. A Decomposable Attention Model for Natural Language Inference, 2016, EMNLP.
[37] Omer Levy, et al. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding, 2018, BlackboxNLP@EMNLP.
[38] Ashish Vaswani, et al. Self-Attention with Relative Position Representations, 2018, NAACL.