Emergent Abilities of Large Language Models
Jason Wei | Yi Tay | Rishi Bommasani | Colin Raffel | Barret Zoph | Sebastian Borgeaud | Dani Yogatama | Maarten Bosma | Denny Zhou | Donald Metzler | Ed Chi | Tatsunori Hashimoto | Oriol Vinyals | P. Liang | J. Dean | W. Fedus
[1] Tom B. Brown, et al. In-context Learning and Induction Heads, 2022, ArXiv.
[2] Tom B. Brown, et al. Language Models (Mostly) Know What They Know, 2022, ArXiv.
[3] Gerard de Melo, et al. Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models, 2022, ArXiv.
[4] S. Gu, et al. Large Language Models are Zero-Shot Reasoners, 2022, ArXiv.
[5] D. Schuurmans, et al. Least-to-Most Prompting Enables Complex Reasoning in Large Language Models, 2022, ICLR.
[6] Christopher D. Manning. Human Language Understanding & Reasoning, 2022, Daedalus.
[7] J. Dean, et al. Designing Effective Sparse Expert Models, 2022, IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).
[8] Oriol Vinyals, et al. Flamingo: a Visual Language Model for Few-Shot Learning, 2022, ArXiv.
[9] Prafulla Dhariwal, et al. Hierarchical Text-Conditional Image Generation with CLIP Latents, 2022, ArXiv.
[10] Hyung Won Chung, et al. What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?, 2022, ICML.
[11] Andrew M. Dai, et al. PaLM: Scaling Language Modeling with Pathways, 2022, J. Mach. Learn. Res.
[12] James L. McClelland, et al. Can language models learn from explanations in context?, 2022, EMNLP.
[13] S. Levine, et al. Do As I Can, Not As I Say: Grounding Language in Robotic Affordances, 2022, CoRL.
[14] Adrian S. Wong, et al. Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language, 2022, ICLR.
[15] Lisa Anne Hendricks, et al. Training Compute-Optimal Large Language Models, 2022, ArXiv.
[16] D. Schuurmans, et al. Self-Consistency Improves Chain of Thought Reasoning in Language Models, 2022, ArXiv.
[17] Carrie J. Cai, et al. PromptChainer: Chaining Large Language Model Prompts through Visual Programming, 2022, CHI Extended Abstracts.
[18] Ryan J. Lowe, et al. Training language models to follow instructions with human feedback, 2022, NeurIPS.
[19] M. Lewis, et al. Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?, 2022, EMNLP.
[20] Tom B. Brown, et al. Predictability and Surprise in Large Generative Models, 2022, FAccT.
[21] Quantifying Memorization Across Neural Language Models, 2022, ArXiv.
[22] Matt Gardner, et al. Impact of Pretraining Term Frequencies on Few-Shot Reasoning, 2022, ArXiv.
[23] Transformer Memory as a Differentiable Search Index, 2022, ArXiv.
[24] Deduplicating Training Data Mitigates Privacy Risks in Language Models, 2022, ArXiv.
[25] Geoffrey Irving, et al. Red Teaming Language Models with Language Models, 2022, EMNLP.
[26] Dale Schuurmans, et al. Chain of Thought Prompting Elicits Reasoning in Large Language Models, 2022, ArXiv.
[27] Renelito Delos Santos, et al. LaMDA: Language Models for Dialog Applications, 2022, ArXiv.
[28] P. Abbeel, et al. Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents, 2022, ICML.
[29] Percy Liang, et al. CoAuthor: Designing a Human-AI Collaborative Writing Dataset for Exploring Language Model Capabilities, 2022, CHI.
[30] Quoc V. Le, et al. GLaM: Efficient Scaling of Language Models with Mixture-of-Experts, 2021, ICML.
[31] Diego de Las Casas, et al. Improving language models by retrieving from trillions of tokens, 2021, ICML.
[32] Sang Michael Xie, et al. An Explanation of In-context Learning as Implicit Bayesian Inference, 2021, ICLR.
[33] Alexander M. Rush, et al. Multitask Prompted Training Enables Zero-Shot Task Generalization, 2021, ICLR.
[34] Phu Mon Htut, et al. BBQ: A hand-built bias benchmark for question answering, 2021, Findings of ACL.
[35] Carrie J. Cai, et al. AI Chains: Transparent and Controllable Human-AI Interaction by Chaining Large Language Model Prompts, 2021, CHI.
[36] Owain Evans, et al. TruthfulQA: Measuring How Models Mimic Human Falsehoods, 2021, ACL.
[37] Quoc V. Le, et al. Finetuned Language Models Are Zero-Shot Learners, 2021, ICLR.
[38] Luke Zettlemoyer, et al. Noisy Channel Language Model Prompting for Few-Shot Text Classification, 2021, ACL.
[39] Nicholas Carlini, et al. Deduplicating Training Data Makes Language Models Better, 2021, ACL.
[40] Shachar Mirkin, et al. Emergent Structures and Training Dynamics in Large Language Models, 2022, BigScience Workshop.
[41] Vinh Q. Tran, et al. Unifying Language Learning Paradigms, 2022, ArXiv.
[42] Ellie Pavlick, et al. Mapping Language Models to Grounded Conceptual Spaces, 2022, ICLR.
[43] Zornitsa Kozareva, et al. Efficient Large Scale Language Modeling with Mixtures of Experts, 2021, ArXiv.
[44] Po-Sen Huang, et al. Scaling Language Models: Methods, Analysis & Insights from Training Gopher, 2021, ArXiv.
[45] Po-Sen Huang, et al. Ethical and social risks of harm from Language Models, 2021, ArXiv.
[46] Dario Amodei, et al. A General Language Assistant as a Laboratory for Alignment, 2021, ArXiv.
[47] David Bieber, et al. Show Your Work: Scratchpads for Intermediate Computation with Language Models, 2021, ArXiv.
[48] Mohammad Bavarian, et al. Training Verifiers to Solve Math Word Problems, 2021, ArXiv.
[49] Nicholas Carlini, et al. Unsolved Problems in ML Safety, 2021, ArXiv.
[50] Ellie Pavlick, et al. Frequency Effects on Syntactic Rule Learning in Transformers, 2021, EMNLP.
[51] Michael S. Bernstein, et al. On the Opportunities and Risks of Foundation Models, 2021, ArXiv.
[52] Daphne Ippolito, et al. Wordcraft: a Human-AI Collaborative Editor for Story Writing, 2021, ArXiv.
[53] Sang Michael Xie, et al. Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning, 2021, NeurIPS.
[54] Luke Zettlemoyer, et al. Surface Form Competition: Why the Highest Probability Answer Isn’t Always Right, 2021, EMNLP.
[55] Emily M. Bender, et al. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜, 2021, FAccT.
[56] D. Klein, et al. Calibrate Before Use: Improving Few-Shot Performance of Language Models, 2021, ICML.
[57] Laria Reynolds, et al. Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm, 2021, CHI Extended Abstracts.
[58] Noam M. Shazeer, et al. Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity, 2021, J. Mach. Learn. Res.
[59] Danqi Chen, et al. Making Pre-trained Language Models Better Few-shot Learners, 2021, ACL.
[60] Colin Raffel, et al. Extracting Training Data from Large Language Models, 2020, USENIX Security Symposium.
[61] Samuel R. Bowman, et al. When Do You Need Billions of Words of Pretraining Data?, 2020, ACL.
[62] Sanjeev Arora, et al. A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks, 2020, ICLR.
[63] Hinrich Schütze, et al. It’s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners, 2020, NAACL.
[64] Dawn Song, et al. Measuring Massive Multitask Language Understanding, 2020, ICLR.
[65] Orhan Firat, et al. GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding, 2020, ICLR.
[66] Yejin Choi, et al. RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models, 2020, Findings of EMNLP.
[67] Omer Levy, et al. Emergent linguistic structure in artificial neural networks trained by self-supervision, 2020, Proceedings of the National Academy of Sciences.
[68] Mark Chen, et al. Language Models are Few-Shot Learners, 2020, NeurIPS.
[69] Ming-Wei Chang, et al. REALM: Retrieval-Augmented Language Model Pre-Training, 2020, ICML.
[70] Alec Radford, et al. Scaling Laws for Neural Language Models, 2020, ArXiv.
[71] Colin Raffel, et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, 2019, J. Mach. Learn. Res.
[72] José Camacho-Collados, et al. WiC: the Word-in-Context Dataset for Evaluating Context-Sensitive Meaning Representations, 2018, NAACL.
[73] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[74] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[75] Richard Socher, et al. The Natural Language Decathlon: Multitask Learning as Question Answering, 2018, ArXiv.
[76] Quoc V. Le, et al. A Simple Method for Commonsense Reasoning, 2018, ArXiv.
[77] Nathaniel J. Smith, et al. Bootstrapping language acquisition, 2017, Cognition.
[78] Chandler May, et al. Social Bias in Elicited Natural Language Inferences, 2017, EthNLP@EACL.
[79] Richard Socher, et al. Pointer Sentinel Mixture Models, 2016, ICLR.
[80] Alex Graves, et al. Decoupled Neural Interfaces using Synthetic Gradients, 2016, ICML.
[81] Alex Graves, et al. Adaptive Computation Time for Recurrent Neural Networks, 2016, ArXiv.
[82] Paul A. Lewis, et al. New Perspectives on Emergence in Economics, 2012.
[83] H. Hwang, et al. Basic Notions, 2022.
[84] Timothy O'Connor, et al. Emergence in Science and Philosophy, 2010.
[85] Percy Liang, et al. Semi-Supervised Learning for Natural Language, 2005.
[86] Scott Miller, et al. Name Tagging with Word Clusters and Discriminative Training, 2004, NAACL.
[87] Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.
[88] James H. Martin, et al. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, 2000.
[89] Stephanie Forrest, et al. Emergent computation: self-organizing, collective, and cooperative phenomena in natural and artificial computing networks, 1990.
[90] Tad Hogg, et al. Phase Transitions in Artificial Intelligence Systems, 1987, Artif. Intell.
[91] Philip W. Anderson. More Is Different: Broken symmetry and the nature of the hierarchical structure of science, 1972, Science.