Leveraging Pre-trained Checkpoints for Sequence Generation Tasks
[1] Samuel R. Bowman, et al. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference, 2017, NAACL.
[2] Yann Dauphin, et al. Convolutional Sequence to Sequence Learning, 2017, ICML.
[3] Hinrich Schütze, et al. AutoExtend: Extending Word Embeddings to Embeddings for Synsets and Lexemes, 2015, ACL.
[4] Idan Szpektor, et al. DiscoFuse: A Large-Scale Dataset for Discourse-Based Sentence Fusion, 2019, NAACL.
[6] Di He, et al. Layer-Wise Coordination between Encoder and Decoder for Neural Machine Translation, 2018, NeurIPS.
[7] Jakob Uszkoreit, et al. KERMIT: Generative Insertion-Based Modeling for Sequences, 2019, ArXiv.
[8] Guillaume Lample, et al. Cross-lingual Language Model Pretraining, 2019, NeurIPS.
[9] George Kurian, et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, 2016, ArXiv.
[10] Yejin Choi, et al. SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference, 2018, EMNLP.
[11] Sebastian Ruder, et al. Fine-tuned Language Models for Text Classification, 2018, ArXiv.
[12] Andrew Y. Ng, et al. Improving Word Representations via Global Context and Multiple Word Prototypes, 2012, ACL.
[13] Benoît Sagot, et al. What Does BERT Learn about the Structure of Language?, 2019, ACL.
[14] Jeffrey Dean, et al. Efficient Estimation of Word Representations in Vector Space, 2013, ICLR.
[15] Percy Liang, et al. Know What You Don’t Know: Unanswerable Questions for SQuAD, 2018, ACL.
[16] Nikolaus Hansen, et al. The CMA Evolution Strategy: A Tutorial, 2016, ArXiv.
[17] Ashish Vaswani, et al. Self-Attention with Relative Position Representations, 2018, NAACL.
[18] Omer Levy, et al. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension, 2019, ACL.
[19] Hang Li, et al. Incorporating Copying Mechanism in Sequence-to-Sequence Learning, 2016, ACL.
[20] Alex Wang, et al. Looking for ELMo's friends: Sentence-Level Pretraining Beyond Language Modeling, 2018, ArXiv.
[21] Myle Ott, et al. Understanding Back-Translation at Scale, 2018, EMNLP.
[22] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.
[23] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[24] Alexander M. Rush, et al. Bottom-Up Abstractive Summarization, 2018, EMNLP.
[25] Yang Liu, et al. Fine-tune BERT for Extractive Summarization, 2019, ArXiv.
[26] Kevin Gimpel, et al. Bridging Nonlinearities and Stochastic Regularizers with Gaussian Error Linear Units, 2016, ArXiv.
[27] Taku Kudo, et al. SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing, 2018, EMNLP.
[28] Yiming Yang, et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding, 2019, NeurIPS.
[29] Alex Wang, et al. Can You Tell Me How to Get Past Sesame Street? Sentence-Level Pretraining Beyond Language Modeling, 2018, ACL.
[30] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[31] Lukasz Kaiser, et al. Sample Efficient Text Summarization Using a Single Pre-Trained Transformer, 2019, ArXiv.
[32] Xiaodong Liu, et al. Unified Language Model Pre-training for Natural Language Understanding and Generation, 2019, NeurIPS.
[33] Sebastian Ruder, et al. Universal Language Model Fine-tuning for Text Classification, 2018, ACL.
[34] Bart Baesens, et al. To tune or not to tune, 2015.
[35] Luke S. Zettlemoyer, et al. Deep Contextualized Word Representations, 2018, NAACL.
[36] Dipanjan Das, et al. BERT Rediscovers the Classical NLP Pipeline, 2019, ACL.
[37] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.
[38] Christopher D. Manning, et al. Get To The Point: Summarization with Pointer-Generator Networks, 2017, ACL.
[39] Mirella Lapata, et al. Ranking Sentences for Extractive Summarization with Reinforcement Learning, 2018, NAACL.
[40] Manaal Faruqui, et al. Learning To Split and Rephrase From Wikipedia Edit History, 2018, EMNLP.
[41] Shashi Narayan, et al. Split and Rephrase, 2017, EMNLP.
[42] Xu Tan, et al. MASS: Masked Sequence to Sequence Pre-training for Language Generation, 2019, ICML.
[43] Yoshua Bengio, et al. Why Does Unsupervised Pre-training Help Deep Learning?, 2010, AISTATS.
[44] Yoav Goldberg, et al. Split and Rephrase: Better Evaluation and a Stronger Baseline, 2018, ACL.
[45] Yiming Yang, et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context, 2019, ACL.
[46] Alec Radford, et al. Improving Language Understanding by Generative Pre-Training, 2018.
[47] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[48] Gunhee Kim, et al. Abstractive Summarization of Reddit Posts with Multi-level Memory Networks, 2018, NAACL.
[49] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[50] Chris Callison-Burch, et al. Optimizing Statistical Machine Translation for Text Simplification, 2016, TACL.
[51] Quoc V. Le, et al. Unsupervised Pretraining for Sequence to Sequence Learning, 2016, EMNLP.
[52] Mirella Lapata, et al. Don’t Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization, 2018, EMNLP.
[53] Jordan J. Louviere, et al. Best-Worst Scaling: Theory, Methods and Applications, 2015.
[54] Quoc V. Le, et al. Semi-supervised Sequence Learning, 2015, NIPS.
[55] Benjamin Van Durme, et al. Annotated Gigaword, 2012, AKBC-WEKEX@NAACL-HLT.
[56] Omer Levy, et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach, 2019, ArXiv.
[57] Ran Wang, et al. To Tune or Not To Tune? How About the Best of Both Worlds?, 2019, ArXiv.
[58] Eduard H. Hovy, et al. Automatic Evaluation of Summaries Using N-gram Co-occurrence Statistics, 2003, NAACL.
[59] Noah A. Smith, et al. To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks, 2019, RepL4NLP@ACL.
[60] Phil Blunsom, et al. Teaching Machines to Read and Comprehend, 2015, NIPS.