MASS: Masked Sequence to Sequence Pre-training for Language Generation
Kaitao Song | Xu Tan | Tao Qin | Jianfeng Lu | Tie-Yan Liu
[1] Tong Zhang, et al. A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data, 2005, J. Mach. Learn. Res.
[2] Guillaume Lample, et al. Cross-lingual Language Model Pretraining, 2019, NeurIPS.
[3] Sanja Fidler, et al. Skip-Thought Vectors, 2015, NIPS.
[4] Lijun Wu, et al. Achieving Human Parity on Automatic Chinese to English News Translation, 2018, ArXiv.
[5] George Kurian, et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, 2016, ArXiv.
[6] Guillaume Lample, et al. Unsupervised Machine Translation Using Monolingual Corpora Only, 2017, ICLR.
[7] Quoc V. Le, et al. Semi-supervised Sequence Learning, 2015, NIPS.
[8] Lukás Burget, et al. Recurrent neural network based language model, 2010, INTERSPEECH.
[9] Eneko Agirre, et al. Unsupervised Statistical Machine Translation, 2018, EMNLP.
[10] Christopher Potts, et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, 2013, EMNLP.
[11] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[12] Alec Radford, et al. Improving Language Understanding by Generative Pre-Training, 2018.
[13] Dumitru Erhan, et al. Going deeper with convolutions, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.
[15] Di He, et al. Layer-Wise Coordination between Encoder and Decoder for Neural Machine Translation, 2018, NeurIPS.
[16] Cristian Danescu-Niculescu-Mizil, et al. Chameleons in Imagined Conversations: A New Approach to Understanding Coordination of Linguistic Style in Dialogs, 2011, CMCL@ACL.
[17] Zhiyuan Liu, et al. Neural Headline Generation with Minimum Risk Training, 2016, ArXiv.
[18] Honglak Lee, et al. An efficient framework for learning sentence representations, 2018, ICLR.
[19] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[20] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[21] Masaaki Nagata, et al. Cutting-off Redundant Repeating Generations for Neural Abstractive Summarization, 2016, EACL.
[22] Di He, et al. Double Path Networks for Sequence to Sequence Learning, 2018, COLING.
[23] Quoc V. Le, et al. Distributed Representations of Sentences and Documents, 2014, ICML.
[24] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[25] Trevor Darrell, et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[26] Eneko Agirre, et al. Unsupervised Neural Machine Translation, 2017, ICLR.
[27] Wei Chen, et al. Unsupervised Neural Machine Translation with Weight Sharing, 2018.
[28] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[29] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.
[30] Christopher Potts, et al. A large annotated corpus for learning natural language inference, 2015, EMNLP.
[31] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.
[32] Yoshua Bengio, et al. A Neural Probabilistic Language Model, 2003, J. Mach. Learn. Res.
[33] Di He, et al. Multilingual Neural Machine Translation with Knowledge Distillation, 2019, ICLR.
[34] Luke S. Zettlemoyer, et al. Deep Contextualized Word Representations, 2018, NAACL.
[35] Philip Bachman, et al. Machine Comprehension by Text-to-Text Neural Question Generation, 2017, Rep4NLP@ACL.
[36] Richard Socher, et al. Learned in Translation: Contextualized Word Vectors, 2017, NIPS.
[37] Xu Tan, et al. Unsupervised Pivot Translation for Distant Languages, 2019, ACL.
[38] 知秀 柴田. Understand It in 5 Minutes!? Skim-Reading Famous Papers: Jacob Devlin et al.: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2020.
[39] Hang Li, et al. Neural Responding Machine for Short-Text Conversation, 2015, ACL.
[40] Yann Dauphin, et al. Convolutional Sequence to Sequence Learning, 2017, ICML.
[41] Jeffrey Dean, et al. Distributed Representations of Words and Phrases and their Compositionality, 2013, NIPS.
[42] Quoc V. Le, et al. Unsupervised Pretraining for Sequence to Sequence Learning, 2016, EMNLP.
[43] Di He, et al. Dense Information Flow for Neural Machine Translation, 2018, NAACL.
[44] Andrew M. Dai, et al. MaskGAN: Better Text Generation via Filling in the ______, 2018, ICLR.
[45] Sebastian Ruder, et al. Universal Language Model Fine-tuning for Text Classification, 2018, ACL.
[46] Jiajun Zhang, et al. Exploiting Source-side Monolingual Data in Neural Machine Translation, 2016, EMNLP.
[47] John Blitzer, et al. Domain Adaptation with Structural Correspondence Learning, 2006, EMNLP.
[48] Yaser Al-Onaizan, et al. Zero-Resource Translation with Multi-Lingual Neural Machine Translation, 2016, EMNLP.
[49] Deniz Yuret, et al. Transfer Learning for Low-Resource Neural Machine Translation, 2016, EMNLP.
[50] Guillaume Lample, et al. Phrase-Based & Neural Unsupervised Machine Translation, 2018, EMNLP.
[51] Robert L. Mercer, et al. Class-Based n-gram Models of Natural Language, 1992, CL.
[52] Erik F. Tjong Kim Sang, et al. Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition, 2003, CoNLL.
[53] Xiaogang Wang, et al. Learning Deep Representation with Large-Scale Attributes, 2015 IEEE International Conference on Computer Vision (ICCV).
[54] Omer Levy, et al. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding, 2018, BlackboxNLP@EMNLP.
[55] Xu Tan, et al. Almost Unsupervised Text to Speech and Automatic Speech Recognition, 2019, ICML.
[56] Yoshua Bengio, et al. Extracting and composing robust features with denoising autoencoders, 2008, ICML.
[57] Jason Weston, et al. A unified architecture for natural language processing: deep neural networks with multitask learning, 2008, ICML.
[58] Jian Zhang, et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text, 2016, EMNLP.