Huanqi Cao | Maosong Sun | Minlie Huang | Juanzi Li | Zhiyuan Liu | Hao Zhou | Wentao Han | Jian Guan | Xiaozhi Wang | Jie Tang | Xu Han | Zhengyan Zhang | Fanchao Qi | Pei Ke | Daixuan Li | Haozhe Ji | Yanan Zheng | Yujia Qin | Deming Ye | Xiaoyan Zhu | Zhenbo Sun | Yuxian Gu | Yusheng Su | Guoyang Zeng | Shengqi Chen
[1] Wanxiang Che, et al. Pre-Training with Whole Word Masking for Chinese BERT, 2019, ArXiv.
[2] Omer Levy, et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach, 2019, ArXiv.
[3] Minlie Huang, et al. ChID: A Large-scale Chinese IDiom Dataset for Cloze Test, 2019, ACL.
[4] Wenhan Xiong, et al. Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model, 2019, ICLR.
[5] Kyunghyun Cho, et al. Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models, 2020, ICLR.
[6] Wei-Ying Ma, et al. Topic Aware Neural Response Generation, 2016, AAAI.
[7] Xinyan Xiao, et al. DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications, 2017, QA@ACL.
[8] Lawrence S. Moss, et al. OCNLI: Original Chinese Natural Language Inference, 2020, Findings of EMNLP.
[9] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[10] Jian Zhang, et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text, 2016, EMNLP.
[11] Lukasz Kaiser, et al. Attention Is All You Need, 2017, NIPS.
[12] Ali Farhadi, et al. Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping, 2020, ArXiv.
[13] Kevin Gimpel, et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations, 2019, ICLR.
[14] Qun Liu, et al. NEZHA: Neural Contextualized Representation for Chinese Language Understanding, 2019, ArXiv.
[15] Alec Radford, et al. Improving Language Understanding by Generative Pre-Training, 2018.
[16] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[17] Yu Sun, et al. ERNIE: Enhanced Representation through Knowledge Integration, 2019, ArXiv.
[18] Wanxiang Che, et al. Revisiting Pre-Trained Models for Chinese Natural Language Processing, 2020, Findings of EMNLP.
[19] Thomas Wolf, et al. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter, 2019, ArXiv.
[20] Maosong Sun, et al. ERNIE: Enhanced Language Representation with Informative Entities, 2019, ACL.
[21] Yong Jiang, et al. A Large-Scale Chinese Short-Text Conversation Dataset, 2020, NLPCC.
[22] Mark Chen, et al. Language Models are Few-Shot Learners, 2020, NeurIPS.
[23] Qun Liu, et al. TinyBERT: Distilling BERT for Natural Language Understanding, 2020, EMNLP.
[24] Minlie Huang, et al. SentiLARE: Sentiment-Aware Language Representation Learning with Linguistic Knowledge, 2020, EMNLP.
[25] Yejin Choi, et al. The Curious Case of Neural Text Degeneration, 2019, ICLR.
[26] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[27] Luke S. Zettlemoyer, et al. Deep Contextualized Word Representations, 2018, NAACL.
[28] Wentao Ma, et al. A Span-Extraction Dataset for Chinese Machine Reading Comprehension, 2019, EMNLP-IJCNLP.
[29] Jie Zhou, et al. Contextual Knowledge Selection and Embedding towards Enhanced Pre-Trained Language Models, 2020, AI Open.
[30] Jianfeng Gao, et al. A Diversity-Promoting Objective Function for Neural Conversation Models, 2015, NAACL.
[31] Joelle Pineau, et al. How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation, 2016, EMNLP.
[32] Cong Yu, et al. CLUE: A Chinese Language Understanding Evaluation Benchmark, 2020, COLING.
[33] James Demmel, et al. Large Batch Optimization for Deep Learning: Training BERT in 76 minutes, 2019, ICLR.
[34] Qun Liu, et al. Know What You Don't Need: Single-Shot Meta-Pruning for Attention Heads, 2020, AI Open.
[35] Tianyu Gao, et al. KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation, 2019, ArXiv.
[36] Marius Mosbach, et al. On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines, 2020, ArXiv.
[37] Taku Kudo, et al. SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing, 2018, EMNLP.
[38] Olatunji Ruwase, et al. DeepSpeed: System Optimizations Enable Training Deep Learning Models with Over 100 Billion Parameters, 2020, KDD.
[39] Xiaoyan Zhu, et al. Generating Informative Responses with Controlled Sentence Function, 2018, ACL.
[40] Roy Schwartz, et al. Knowledge Enhanced Contextual Word Representations, 2019, EMNLP-IJCNLP.
[41] Hang Li, et al. Neural Responding Machine for Short-Text Conversation, 2015, ACL.