Ilya Sutskever | Alec Radford | Harrison Edwards | Arvind Neelakantan | Stanislas Polu | Aditya Ramesh | Pranav Shyam | Igor Babuschkin | Jesse Michael Han | Tao Xu | Alex Ray
[1] Yoshua Bengio, et al. Extracting and composing robust features with denoising autoencoders, 2008, ICML '08.
[2] Doug Downey, et al. G-DAug: Generative Data Augmentation for Commonsense Reasoning, 2020, FINDINGS.
[3] Quoc V. Le, et al. Self-Training With Noisy Student Improves ImageNet Classification, 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Regina Barzilay, et al. Style Transfer from Non-Parallel Text by Cross-Alignment, 2017, NIPS.
[5] Guillaume Lample, et al. DOBF: A Deobfuscation Pre-Training Objective for Programming Languages, 2021, NeurIPS.
[6] Timo Schick, et al. Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP, 2021, Transactions of the Association for Computational Linguistics.
[7] Vishrav Chaudhary, et al. CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data, 2019, LREC.
[8] Alec Radford, et al. Scaling Laws for Neural Language Models, 2020, ArXiv.
[9] Colin Raffel, et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, 2019, J. Mach. Learn. Res.
[10] Eunah Cho, et al. Data Augmentation using Pre-trained Transformer Models, 2020, LIFELONGNLP.
[11] Wei Zhao, et al. Denoising based Sequence-to-Sequence Pre-training for Text Generation, 2019, EMNLP.
[12] Kenneth Heafield, et al. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016, August 7-12, 2016, Berlin, Germany, Volume 1: Long Papers, 2016, Annual Meeting of the Association for Computational Linguistics.
[13] Yannis Papanikolaou, et al. DARE: Data Augmented Relation Extraction with GPT-2, 2020, ArXiv.
[14] Christian Szegedy, et al. Mathematical Reasoning via Self-supervised Skip-tree Training, 2020, ICLR.
[15] Ondrej Bojar, et al. Improving Translation Model by Monolingual Data, 2011, WMT@EMNLP.
[16] Markus Freitag, et al. Scaling Laws for Neural Machine Translation, 2021, ICLR.
[17] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[18] Yichao Lu, et al. Unsupervised Bitext Mining and Translation via Self-Trained Contextual Embeddings, 2020, Transactions of the Association for Computational Linguistics.
[19] Yoshua Bengio, et al. Professor Forcing: A New Algorithm for Training Recurrent Networks, 2016, NIPS.
[20] Boi Faltings, et al. Self-training Improves Pre-training for Few-shot Learning in Task-oriented Dialog Systems, 2021, EMNLP.
[21] Matt Post, et al. A Call for Clarity in Reporting BLEU Scores, 2018, WMT.
[22] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[23] Shafiq Joty, et al. Cross-model Back-translated Distillation for Unsupervised Machine Translation, 2020.
[24] Eric P. Xing, et al. Controllable Text Generation, 2017, ArXiv.
[25] Eneko Agirre, et al. Learning bilingual word embeddings with (almost) no bilingual data, 2017, ACL.
[26] Veselin Stoyanov, et al. Unsupervised Cross-lingual Representation Learning at Scale, 2019, ACL.
[27] Xu Tan, et al. MASS: Masked Sequence to Sequence Pre-training for Language Generation, 2019, ICML.
[28] Kevin Knight, et al. Deciphering Foreign Language, 2011, ACL.
[29] Ryan Cotterell, et al. Explaining and Generalizing Back-Translation through Wake-Sleep, 2018, ArXiv.
[30] Ivan Titov, et al. Inducing Crosslingual Distributed Representations of Words, 2012, COLING.
[31] Timo Schick, et al. Generating Datasets with Pretrained Language Models, 2021, EMNLP.
[32] Hai Zhao, et al. Cross-lingual Supervision Improves Unsupervised Neural Machine Translation, 2020, NAACL.
[33] Myle Ott, et al. Understanding Back-Translation at Scale, 2018, EMNLP.
[34] Mark Chen, et al. Scaling Laws for Autoregressive Generative Modeling, 2020, ArXiv.
[35] Guillaume Lample, et al. Cross-lingual Language Model Pretraining, 2019, NeurIPS.
[36] Furu Wei, et al. XLM-T: Scaling up Multilingual Machine Translation with Pretrained Cross-lingual Transformer Encoders, 2020, ArXiv.
[37] Guillaume Lample, et al. Unsupervised Machine Translation Using Monolingual Corpora Only, 2017, ICLR.
[38] Colin Raffel, et al. mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer, 2021, NAACL.
[39] Eneko Agirre, et al. Unsupervised Neural Machine Translation, 2017, ICLR.
[40] Rico Sennrich, et al. Improving Neural Machine Translation Models with Monolingual Data, 2015, ACL.
[41] Quoc V. Le, et al. STraTA: Self-Training with Task Augmentation for Better Few-shot Learning, 2021, EMNLP.
[42] Ateret Anaby-Tavor, et al. Do Not Have Enough Data? Deep Learning to the Rescue!, 2020, AAAI.
[43] Roger B. Grosse, et al. LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning, 2021, ICML.
[44] Marie-Francine Moens, et al. Bilingual Word Embeddings from Non-Parallel Document-Aligned Data Applied to Bilingual Lexicon Induction, 2015, ACL.
[45] Mark Chen, et al. Language Models are Few-Shot Learners, 2020, NeurIPS.
[46] Andy Way, et al. Investigating Backtranslation in Neural Machine Translation, 2018, EAMT.
[47] James T. Kwok, et al. Generalizing from a Few Examples, 2019, ACM Comput. Surv.
[48] Marjan Ghazvininejad, et al. Multilingual Denoising Pre-training for Neural Machine Translation, 2020, Transactions of the Association for Computational Linguistics.