Dict-BERT: Enhancing Language Model Pre-training with Dictionary
Wenhao Yu, Chenguang Zhu, Yuwei Fang, Donghan Yu, Shuohang Wang, Yichong Xu, Michael Zeng, Meng Jiang
[1] Lianhui Qin, et al. Diversifying Content Generation for Commonsense Reasoning with Mixture of Knowledge Graph Experts, 2022, Findings.
[2] Shuohang Wang, et al. KG-FiD: Infusing Knowledge Graph in Fusion-in-Decoder for Open-Domain Question Answering, 2021, ACL.
[3] Elena Sofia Ruzzetti, et al. Lacking the Embedding of a Word? Look it up into a Traditional Dictionary, 2021, Findings.
[4] Leyang Cui, et al. Knowledge Enhanced Fine-Tuning for Better Handling Unseen Entities in Dialogue Generation, 2021, EMNLP.
[5] Michael Zeng, et al. Does Knowledge Help General NLU? An Empirical Study, 2021, arXiv.
[6] Bill Yuchen Lin, et al. Pre-training Text-to-Text Transformers for Concept-centric Common Sense, 2020, ICLR.
[7] Zhiting Hu, et al. A Survey of Knowledge-enhanced Text Generation, 2020, ACM Comput. Surv.
[8] Donghan Yu, et al. JAKET: Joint Pre-training of Knowledge Graph and Language Understanding, 2020, AAAI.
[9] Chenguang Zhu, et al. Injecting Entity Types into Entity-Guided Text Generation, 2020, EMNLP.
[10] Philip S. Yu, et al. KG-BART: Knowledge Graph-Augmented BART for Generative Commonsense Reasoning, 2020, AAAI.
[11] Tie-Yan Liu, et al. Taking Notes on the Fly Helps BERT Pre-training, 2020, arXiv.
[12] Chenguang Zhu, et al. Mind The Facts: Knowledge-Boosted Coherent Abstractive Text Summarization, 2020, arXiv.
[13] Doug Downey, et al. Don't Stop Pretraining: Adapt Language Models to Domains and Tasks, 2020, ACL.
[14] Xipeng Qiu, et al. Pre-trained Models for Natural Language Processing: A Survey, 2020, Science China Technological Sciences.
[15] Geoffrey E. Hinton, et al. A Simple Framework for Contrastive Learning of Visual Representations, 2020, ICML.
[16] Xuanjing Huang, et al. K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters, 2020, Findings.
[17] Minlie Huang, et al. A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation, 2020, TACL.
[18] Wenhan Xiong, et al. Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model, 2019, ICLR.
[19] Zhiyuan Liu, et al. KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation, 2019, TACL.
[20] Peter J. Liu, et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, 2019, J. Mach. Learn. Res.
[21] Teven Le Scao, et al. Transformers: State-of-the-Art Natural Language Processing, 2019, EMNLP.
[22] Zhe Zhao, et al. K-BERT: Enabling Language Representation with Knowledge Graph, 2019, AAAI.
[23] Michael Tschannen, et al. On Mutual Information Maximization for Representation Learning, 2019, ICLR.
[24] Di He, et al. Representation Degeneration Problem in Training Natural Language Generation Models, 2019, ICLR.
[25] Omer Levy, et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach, 2019, arXiv.
[26] Yiming Yang, et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding, 2019, NeurIPS.
[27] Maosong Sun, et al. ERNIE: Enhanced Language Representation with Informative Entities, 2019, ACL.
[28] Alexander A. Alemi, et al. On Variational Bounds of Mutual Information, 2019, ICML.
[29] Hinrich Schütze, et al. Rare Words: A Major Problem for Contextualized Embeddings and How to Fix It by Attentive Mimicking, 2019, AAAI.
[30] Jaewoo Kang, et al. BioBERT: A Pre-trained Biomedical Language Representation Model for Biomedical Text Mining, 2019, Bioinformatics.
[31] Di He, et al. FRAGE: Frequency-Agnostic Word Representation, 2018, NeurIPS.
[32] Oriol Vinyals, et al. Representation Learning with Contrastive Predictive Coding, 2018, arXiv.
[33] Samuel R. Bowman, et al. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding, 2018, BlackboxNLP@EMNLP.
[34] Lukasz Kaiser, et al. Attention Is All You Need, 2017, NIPS.
[35] Doug Downey, et al. Definition Modeling: Learning to Define Word Embeddings in Natural Language, 2016, AAAI.
[36] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[37] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.
[38] Jason Weston, et al. Translating Embeddings for Modeling Multi-relational Data, 2013, NIPS.
[39] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[40] Pascal Vincent, et al. Auto-Encoding Dictionary Definitions into Consistent Word Embeddings, 2018, EMNLP.
[41] Michael McCloskey, et al. Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem, 1989.