Locating and Editing Factual Knowledge in GPT
[1] Li Dong, et al. Knowledge Neurons in Pretrained Transformers, 2021, ACL.
[2] Nicola De Cao, et al. Editing Factual Knowledge in Language Models, 2021, EMNLP.
[3] Danqi Chen, et al. Factual Probing Is [MASK]: Learning vs. Learning to Recall, 2021, NAACL.
[4] Yonatan Belinkov, et al. Probing Classifiers: Promises, Shortcomings, and Advances, 2021, CL.
[5] E. Hovy, et al. Measuring and Improving Consistency in Pretrained Language Models, 2021, TACL.
[6] Roger Wattenhofer, et al. Of Non-Linearity and Commutativity in BERT, 2021, IJCNN.
[7] Omer Levy, et al. Transformer Feed-Forward Layers Are Key-Value Memories, 2020, EMNLP.
[8] Yoav Goldberg, et al. Amnesic Probing: Behavioral Explanation with Amnesic Counterfactuals, 2020, TACL.
[9] David Bau, et al. Rewriting a Deep Generative Model, 2020, ECCV.
[10] Mark Chen, et al. Language Models are Few-Shot Learners, 2020, NeurIPS.
[11] Uri Shalit, et al. CausaLM: Causal Model Explanation Through Counterfactual Language Models, 2020, CL.
[12] Tomohide Shibata. Understand in 5 Minutes!? Skimming Famous Papers: Jacob Devlin et al.: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2020.
[13] Fabio Petroni, et al. How Context Affects Language Models' Factual Predictions, 2020, AKBC.
[14] Colin Raffel, et al. How Much Knowledge Can You Pack into the Parameters of a Language Model?, 2020, EMNLP.
[15] Omer Levy, et al. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension, 2019, ACL.
[16] Peter J. Liu, et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, 2019, JMLR.
[17] Sebastian Riedel, et al. Language Models as Knowledge Bases?, 2019, EMNLP.
[18] Zhe Gan, et al. Generating Informative and Diverse Conversational Responses via Adversarial Information Maximization, 2018, NeurIPS.
[19] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[20] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[21] Judea Pearl, et al. Direct and Indirect Effects, 2001, UAI.
[22] James A. Anderson, et al. A simple neural network generating an interactive memory, 1972.
[23] Teuvo Kohonen, et al. Correlation Matrix Memories, 1972, IEEE Transactions on Computers.
[24] Huteng Dai, et al. Learning nonlocal phonotactics in Strictly Piecewise phonotactic model, 2021, SCIL.
[25] Yonatan Belinkov, et al. Investigating Gender Bias in Language Models Using Causal Mediation Analysis, 2020, NeurIPS.
[26] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[27] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.