Furu Wei | Tao Ge | Wangchunshu Zhou | Ke Xu
[1] Mirella Lapata, et al. Don’t Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization, 2018, EMNLP.
[2] Graham Neubig, et al. Understanding Knowledge Distillation in Non-autoregressive Machine Translation, 2019, ICLR.
[3] Luke S. Zettlemoyer, et al. Deep Contextualized Word Representations, 2018, NAACL.
[4] Sanja Fidler, et al. Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books, 2015, ICCV.
[5] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[6] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, CVPR.
[7] Yejin Choi, et al. The Curious Case of Neural Text Degeneration, 2019, ICLR.
[8] Hwee Tou Ng, et al. Building a Large Annotated Corpus of Learner English: The NUS Corpus of Learner English, 2013, BEA@NAACL-HLT.
[9] Omer Levy, et al. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension, 2019, ACL.
[10] Ke Xu, et al. Improving Grammatical Error Correction with Machine Translation Pairs, 2020, EMNLP.
[11] Xu Tan, et al. MASS: Masked Sequence to Sequence Pre-training for Language Generation, 2019, ICML.
[12] Mohit Bansal, et al. Addressing Semantic Drift in Question Generation for Semi-Supervised Question Answering, 2019, EMNLP.
[13] Ming Zhou, et al. Improving the Efficiency of Grammatical Error Correction with Erroneous Span Detection and Correction, 2020, EMNLP.
[14] Ke Xu, et al. BERT Loses Patience: Fast and Robust Inference with Early Exit, 2020, NeurIPS.
[15] Omer Levy, et al. Are Sixteen Heads Really Better than One?, 2019, NeurIPS.
[16] Furu Wei, et al. BERT-of-Theseus: Compressing BERT by Progressive Module Replacing, 2020, EMNLP.
[17] Ali Farhadi, et al. Defending Against Neural Fake News, 2019, NeurIPS.
[18] Yiming Yang, et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding, 2019, NeurIPS.
[19] Ted Briscoe, et al. The BEA-2019 Shared Task on Grammatical Error Correction, 2019, BEA@ACL.
[20] Helen Yannakoudakis, et al. A New Dataset and Method for Automatically Grading ESOL Texts, 2011, ACL.
[21] Sylviane Granger. The Computer Learner Corpus: A Versatile New Source of Data for SLA Research, 2014.
[22] Marcin Junczys-Dowmunt, et al. Neural Grammatical Error Correction Systems with Unsupervised Pre-training on Synthetic Data, 2019, BEA@ACL.
[23] Jian Zhang, et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text, 2016, EMNLP.
[25] Robert E. Schapire, et al. The Boosting Approach to Machine Learning: An Overview, 2003.
[26] Yoshua Bengio, et al. FitNets: Hints for Thin Deep Nets, 2014, ICLR.
[27] Thomas Wolf, et al. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter, 2019, arXiv.
[28] Xiaodong Liu, et al. Unified Language Model Pre-training for Natural Language Understanding and Generation, 2019, NeurIPS.
[29] Mirella Lapata, et al. Text Summarization with Pretrained Encoders, 2019, EMNLP.
[30] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[31] Mark Chen, et al. Language Models are Few-Shot Learners, 2020, NeurIPS.
[32] Colin Raffel, et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, 2019, J. Mach. Learn. Res.
[33] Kevin Gimpel, et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations, 2019, ICLR.
[34] Bill Yuchen Lin, et al. Pre-training Text-to-Text Transformers for Concept-centric Common Sense, 2020, arXiv.
[35] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, arXiv.
[36] Jianfeng Gao, et al. UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training, 2020, ICML.
[37] Alec Radford, et al. Improving Language Understanding by Generative Pre-Training, 2018.
[38] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[39] Omer Levy, et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach, 2019, arXiv.
[40] Roy Schwartz, et al. The Right Tool for the Job: Matching Model and Instance Complexities, 2020, ACL.
[41] Kentaro Inui, et al. An Empirical Study of Incorporating Pseudo Data into Grammatical Error Correction, 2019, EMNLP.
[42] Christopher D. Manning, et al. Get To The Point: Summarization with Pointer-Generator Networks, 2017, ACL.
[43] Chin-Yew Lin, et al. ROUGE: A Package for Automatic Evaluation of Summaries, 2004, ACL 2004.
[44] Phil Blunsom, et al. Teaching Machines to Read and Comprehend, 2015, NIPS.
[45] Yuji Matsumoto, et al. Mining Revision Log of Language Learning SNS for Automated Japanese Error Correction of Second Language Learners, 2011, IJCNLP.
[46] Quoc V. Le, et al. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators, 2020, ICLR.
[47] Kurt Keutzer, et al. Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT, 2020, AAAI.
[48] Xinya Du, et al. Harvesting Paragraph-level Question-Answer Pairs from Wikipedia, 2018, ACL.