Model-free, Model-based, and General Intelligence

During the 1960s and 1970s, AI researchers explored intuitions about intelligence by writing programs that displayed intelligent behavior. Many good ideas came out of this work, but programs written by hand were not robust or general. After the 1980s, research increasingly shifted to the development of learners capable of inferring behavior and functions from experience and data, and solvers capable of tackling well-defined but intractable models like SAT, classical planning, Bayesian networks, and POMDPs. The learning approach has achieved considerable success but results in black boxes that lack the flexibility, transparency, and generality of their model-based counterparts. Model-based approaches, on the other hand, require models and scalable algorithms. Model-free learners and model-based solvers have close parallels with Systems 1 and 2 in current theories of the human mind: the first, a fast, opaque, and inflexible intuitive mind; the second, a slow, transparent, and flexible analytical mind. In this paper, I review developments in AI and draw on these theories to discuss the gap between model-free learners and model-based solvers, a gap that needs to be bridged in order to have intelligent systems that are robust and general.
