Neurocompositional computing: From the Central Paradox of Cognition to a new generation of AI systems

What explains the dramatic progress from 20th-century to 21st-century AI, and how can the remaining limitations of current AI be overcome? The widely accepted narrative attributes this progress to massive increases in the quantity of computational and data resources available to support statistical learning in deep artificial neural networks. We show that an additional crucial factor is the development of a new type of computation. Neurocompositional computing adopts two principles that must be simultaneously respected to enable human-level cognition: the principles of Compositionality and Continuity. These have seemed irreconcilable until the recent mathematical discovery that compositionality can be realized not only through discrete methods of symbolic computing, but also through novel forms of continuous neural computing. The revolutionary recent progress in AI has resulted from the use of limited forms of neurocompositional computing. New, deeper forms of neurocompositional computing create AI systems that are more robust, accurate, and comprehensible.
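The key mathematical device behind this reconciliation is the tensor product representation (TPR): a discrete symbol structure is encoded as the sum of filler-role bindings, each binding being the outer product of a continuous filler vector (a symbol) with a role vector (its structural position). The NumPy sketch below is a minimal illustration of this idea only; the dimensions, the random filler vectors, and the orthonormal role vectors are assumptions chosen for clarity, not a specification taken from the paper.

```python
import numpy as np

# Minimal tensor product representation (TPR) sketch.
# The two-symbol string "AB" is encoded by binding each filler
# (symbol vector) to a positional role vector via the outer
# product, then superposing the bindings into a single tensor.

rng = np.random.default_rng(0)

# Continuous, distributed filler vectors for symbols A and B.
fillers = {"A": rng.standard_normal(8), "B": rng.standard_normal(8)}

# Orthonormal role vectors for positions 1 and 2 (identity columns
# for clarity; any orthonormal set behaves the same way).
roles = {"pos1": np.eye(4)[0], "pos2": np.eye(4)[1]}

# Bind and superpose: T = f_A (outer) r_pos1 + f_B (outer) r_pos2
T = (np.outer(fillers["A"], roles["pos1"])
     + np.outer(fillers["B"], roles["pos2"]))

# Unbind: because the roles are orthonormal, contracting T with a
# role vector exactly recovers the filler bound to that role.
recovered_A = T @ roles["pos1"]
print("filler at pos1 recovered:", np.allclose(recovered_A, fillers["A"]))
```

Because T is an ordinary real-valued tensor, it can be transformed by standard neural network operations, yet with orthonormal roles the discrete constituents remain exactly recoverable by contraction. This is the sense in which Compositionality and Continuity are respected simultaneously.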
