[1] Mirella Lapata, et al. Don’t Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization, 2018, EMNLP.
[2] Dan Klein, et al. Learning-Based Single-Document Summarization with Compression and Anaphoricity Constraints, 2016, ACL.
[3] Omer Levy, et al. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding, 2018, BlackboxNLP@EMNLP.
[4] Qun Liu, et al. TinyBERT: Distilling BERT for Natural Language Understanding, 2020, EMNLP.
[5] Jianfeng Gao, et al. UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training, 2020, ICML.
[6] Xu Tan, et al. MASS: Masked Sequence to Sequence Pre-training for Language Generation, 2019, ICML.
[7] Myle Ott, et al. fairseq: A Fast, Extensible Toolkit for Sequence Modeling, 2019, NAACL.
[8] Lukasz Kaiser, et al. Attention Is All You Need, 2017, NIPS.
[9] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.
[10] Omer Levy, et al. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension, 2019, ACL.
[11] Yoshua Bengio, et al. FitNets: Hints for Thin Deep Nets, 2014, ICLR.
[12] Yiming Yang, et al. MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices, 2020, ACL.
[13] Yuta Kikuchi, et al. The New York Times Annotated Corpus as a Large-Scale Summarization Resource, 2015.
[14] Quoc V. Le, et al. Self-Training With Noisy Student Improves ImageNet Classification, 2019, CVPR 2020.
[15] Mirella Lapata, et al. Noisy Self-Knowledge Distillation for Text Summarization, 2020, NAACL.
[16] Christopher D. Manning, et al. Get To The Point: Summarization with Pointer-Generator Networks, 2017, ACL.
[17] Geoffrey E. Hinton, et al. Regularizing Neural Networks by Penalizing Confident Output Distributions, 2017, ICLR.
[18] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[19] Zachary Chase Lipton, et al. Born Again Neural Networks, 2018, ICML.
[20] Jiajun Shen, et al. Revisiting Self-Training for Neural Sequence Generation, 2020, ICLR.
[21] M. Maybury, et al. Automatic Summarization, 2002, Computational Linguistics.
[22] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[23] Jian Zhang, et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text, 2016, EMNLP.
[24] Thomas Wolf, et al. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter, 2019, ArXiv.
[25] Alexander M. Rush, et al. Sequence-Level Knowledge Distillation, 2016, EMNLP.
[26] Fei-Fei Li, et al. ImageNet: A large-scale hierarchical image database, 2009, CVPR.
[27] Razvan Pascanu, et al. Sobolev Training for Neural Networks, 2017, NIPS.
[28] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.
[29] Alexander M. Rush, et al. Pre-trained Summarization Distillation, 2020, ArXiv.
[30] Furu Wei, et al. MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers, 2020, NeurIPS.
[31] Furu Wei, et al. BERT-of-Theseus: Compressing BERT by Progressive Module Replacing, 2020, EMNLP.
[32] Rich Caruana, et al. Do Deep Nets Really Need to be Deep?, 2013, NIPS.
[33] Noah A. Smith, et al. Deep Encoder, Shallow Decoder: Reevaluating the Speed-Quality Tradeoff in Machine Translation, 2020, ArXiv.
[34] Colin Raffel, et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, 2019, J. Mach. Learn. Res.
[35] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[36] Yao Zhao, et al. PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization, 2020, ICML.
[37] Phil Blunsom, et al. Teaching Machines to Read and Comprehend, 2015, NIPS.
[38] Nikos Komodakis, et al. Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer, 2016, ICLR.
[39] Chin-Yew Lin, et al. ROUGE: A Package for Automatic Evaluation of Summaries, 2004, ACL.
[40] Graham Neubig, et al. Stronger Baselines for Trustable Results in Neural Machine Translation, 2017, NMT@ACL.
[41] Mirella Lapata, et al. Text Summarization with Pretrained Encoders, 2019, EMNLP.