Emergent analogical reasoning in large language models

The recent advent of large language models has reinvigorated debate over whether human cognitive capacities might emerge in such generic models given sufficient training data. Of particular interest is the ability of these models to reason about novel problems zero-shot, without any direct training. In human cognition, this capacity is closely tied to an ability to reason by analogy. Here we performed a direct comparison between human reasoners and a large language model (the text-davinci-003 variant of Generative Pre-trained Transformer (GPT)-3) on a range of analogical tasks, including a non-visual matrix reasoning task based on the rule structure of Raven's Standard Progressive Matrices. We found that GPT-3 displayed a surprisingly strong capacity for abstract pattern induction, matching or even surpassing human capabilities in most settings; preliminary tests of GPT-4 indicated even better performance. Our results indicate that large language models such as GPT-3 have acquired an emergent ability to find zero-shot solutions to a broad range of analogy problems.
