Deep active inference

This work combines the free energy principle, and the active inference dynamics it entails, with recent advances in variational inference for deep generative models and with evolution strategies, to introduce the “deep active inference” agent. This agent minimises a variational free energy bound on the average surprise of its sensations, an objective motivated by a homeostatic argument. It does so by optimising the parameters of a generative latent variable model of its sensory inputs, together with a variational density that approximates the posterior distribution over the latent variables given its observations, and by acting on its environment to actively sample inputs that are likely under this generative model. The agent’s internal dynamics are implemented using deep and recurrent neural networks, as used in machine learning, which makes deep active inference agents a scalable and flexible class of active inference agents. Using the mountain car problem, we show how goal-directed behaviour can be implemented by defining appropriate priors on the latent states of the agent’s model. Furthermore, we show that the deep active inference agent can learn a generative model of the environment, from which samples can be drawn to inspect the agent’s beliefs about the environment and its interaction with it.
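To make the objective concrete, the following is a minimal sketch of the variational free energy for a single observation under a Gaussian latent variable model, written in PyTorch in the style of a variational autoencoder. This is an illustrative assumption, not the paper’s implementation: the network sizes, the unit-variance Gaussian likelihood, and the standard normal prior are all placeholder choices.

```python
import torch
import torch.nn as nn

class FreeEnergyModel(nn.Module):
    """Sketch: generative density p(o|s) with recognition density q(s|o).

    All dimensions and architectures here are hypothetical placeholders.
    """
    def __init__(self, obs_dim=4, latent_dim=8, hidden=64):
        super().__init__()
        # Recognition (variational) density q(s|o): outputs mean and log-variance.
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 2 * latent_dim))
        # Generative density p(o|s): predicts the observation mean.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, obs_dim))

    def free_energy(self, obs):
        mu, logvar = self.encoder(obs).chunk(2, dim=-1)
        # Reparameterisation trick: s = mu + sigma * eps, eps ~ N(0, I).
        s = mu + (0.5 * logvar).exp() * torch.randn_like(mu)
        recon = self.decoder(s)
        # Accuracy term: expected negative log-likelihood of the observation
        # under a unit-variance Gaussian likelihood (additive constants dropped).
        nll = 0.5 * ((obs - recon) ** 2).sum(-1)
        # Complexity term: KL[q(s|o) || p(s)] against a standard normal prior.
        kl = 0.5 * (mu ** 2 + logvar.exp() - logvar - 1).sum(-1)
        # Variational free energy = complexity minus accuracy (in nats).
        return (nll + kl).mean()
```

In this framing, goal-directed behaviour would correspond to replacing the standard normal prior p(s) with one concentrated on desired latent states, and action parameters would be optimised to minimise the same bound, e.g. via an evolution strategy when the environment is not differentiable.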
