On the Relationship Between Active Inference and Control as Inference

Active Inference (AIF) is an emerging framework in the brain sciences which suggests that biological agents act to minimise a variational bound on model evidence. Control-as-Inference (CAI) is a framework within reinforcement learning which casts decision making as a variational inference problem. While both frameworks consider action selection through the lens of variational inference, their relationship remains unclear. Here, we provide a formal comparison between them and demonstrate that the primary difference arises from how value is incorporated into their respective generative models. In the context of this comparison, we highlight several ways in which these frameworks can inform one another.
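To make the contrast concrete, the sketch below writes out the conventional forms under which value enters each framework's generative model. It is a minimal sketch following common conventions in the CAI and AIF literatures: the reward function r, preference function C, and expected free energy G are symbols from those literatures, not quantities defined in this abstract.

```latex
% Control as Inference: value enters as an exogenous binary
% "optimality" variable O_t attached to each timestep, with a
% likelihood exponential in reward; planning is then inference
% over the trajectory tau = (s_1, a_1, ..., s_T, a_T)
% conditioned on O_{1:T} = 1.
\begin{align*}
  p(\mathcal{O}_t = 1 \mid s_t, a_t)
    &= \exp\big(r(s_t, a_t)\big) \\
  p(\tau, \mathcal{O}_{1:T})
    &= p(s_1) \prod_{t=1}^{T}
       p(\mathcal{O}_t \mid s_t, a_t)\,
       p(s_{t+1} \mid s_t, a_t)\, p(a_t \mid s_t)
\end{align*}

% Active Inference: value is instead absorbed into the
% generative model itself as a biased ("preferred") prior over
% observations, and the posterior over policies is scored by
% the expected free energy G:
\begin{align*}
  \tilde{p}(o_t) &\propto \exp\big(C(o_t)\big) \\
  q(\pi)         &\propto \exp\big(-G(\pi)\big)
\end{align*}
```

On this reading, CAI conditions an otherwise unbiased model on an external evidence variable, whereas AIF biases the model's own priors, which is one way to phrase the difference in how value is incorporated.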
