[1] Taku Kudo, et al. SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing, 2018, EMNLP.
[2] Moustapha Cissé, et al. Efficient softmax approximation for GPUs, 2016, ICML.
[3] Yiming Yang, et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding, 2019, NeurIPS.
[4] Changhan Wang, et al. Levenshtein Transformer, 2019, NeurIPS.
[5] David Chiang, et al. Improving Lexical Choice in Neural Machine Translation, 2017, NAACL.
[6] Marc'Aurelio Ranzato, et al. Classical Structured Prediction Losses for Sequence to Sequence Learning, 2017, NAACL.
[7] Yiming Yang, et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context, 2019, ACL.
[8] Dong Wang, et al. Normalized Word Embedding and Orthogonal Transform for Bilingual Word Translation, 2015, NAACL.
[9] Vikas Raunak. Simple and Effective Dimensionality Reduction for Word Embeddings, 2017.
[10] Colin Raffel, et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, 2019, J. Mach. Learn. Res.
[11] Kevin Gimpel, et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations, 2019, ICLR.
[12] Yann Dauphin, et al. Pay Less Attention with Lightweight and Dynamic Convolutions, 2019, ICLR.
[13] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[14] Omer Levy, et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach, 2019, ArXiv.
[15] Myle Ott, et al. Understanding Back-Translation at Scale, 2018, EMNLP.
[16] Alexei Baevski, et al. Adaptive Input Representations for Neural Language Modeling, 2018, ICLR.
[17] Lior Wolf, et al. Using the Output Embedding to Improve Language Models, 2016, EACL.
[18] Di He, et al. Tied Transformers: Neural Machine Translation with Shared Encoder and Decoder, 2019, AAAI.
[19] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[20] Julian Salazar, et al. Transformers without Tears: Improving the Normalization of Self-Attention, 2019, ArXiv.
[21] Myle Ott, et al. Scaling Neural Machine Translation, 2018, WMT.
[22] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.