Progress measures for grokking via mechanistic interpretability
[1] J. Steinhardt, et al. Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small, 2022, ArXiv.
[2] Tom B. Brown, et al. In-context Learning and Induction Heads, 2022, ArXiv.
[3] S. Kakade, et al. Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit, 2022, NeurIPS.
[4] J. Dean, et al. Emergent Abilities of Large Language Models, 2022, Trans. Mach. Learn. Res.
[5] J. Susskind, et al. The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the Grokking Phenomenon, 2022, ArXiv.
[6] Max Tegmark, et al. Towards Understanding Grokking: An Effective Theory of Representation Learning, 2022, NeurIPS.
[7] Tom B. Brown, et al. Predictability and Surprise in Large Generative Models, 2022, FAccT.
[8] Dale Schuurmans, et al. Chain of Thought Prompting Elicits Reasoning in Large Language Models, 2022, NeurIPS.
[9] J. Steinhardt, et al. The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models, 2022, ICLR.
[10] Yuri Burda, et al. Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets, 2022, ArXiv.
[11] D. Hassabis, et al. Acquisition of chess knowledge in AlphaZero, 2021, Proceedings of the National Academy of Sciences of the United States of America.
[12] A. Rogozhnikov. Einops: Clear and Reliable Tensor Manipulations with Einstein-like Notation, 2022, ICLR.
[13] Jaime Fernández del Río, et al. Array programming with NumPy, 2020, Nature.
[14] Mark Chen, et al. Language Models are Few-Shot Learners, 2020, NeurIPS.
[15] Natalia Gimelshein, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library, 2019, NeurIPS.
[16] Michael Carbin, et al. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks, 2018, ICLR.
[17] Frank Hutter, et al. Decoupled Weight Decay Regularization, 2017, ICLR.
[18] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[19] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.
[20] Wes McKinney, et al. Data Structures for Statistical Computing in Python, 2010, SciPy.
[21] Scott T. Rickard, et al. Comparing Measures of Sparsity, 2008, IEEE Transactions on Information Theory.