Deriving time-averaged active inference from control principles

Active inference offers a principled account of behavior as the minimization of average sensory surprise over time. Applications of active inference to control problems have so far tended to focus on finite-horizon or discounted-surprise formulations, even though the theory derives from the infinite-horizon, average-surprise imperative of the free-energy principle. Here we derive an infinite-horizon, average-surprise formulation of active inference from optimal control principles. This formulation returns to the roots of active inference in neuroanatomy and neurophysiology, formally reconnecting active inference to optimal feedback control. It provides a unified objective functional for sensorimotor control and allows reference states to vary over time.
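To make the contrast concrete, here is a minimal sketch of the objective at issue, in our own notation rather than necessarily the paper's. Writing $o_t$ for the sensory observations at time $t$ and $p(o_t \mid m)$ for their likelihood under a generative model $m$, the infinite-horizon, time-averaged surprise is

$$J = \lim_{T \to \infty} \frac{1}{T} \int_0^T -\ln p(o_t \mid m) \, dt,$$

whereas the finite-horizon and discounted-surprise objectives the abstract contrasts against take the forms $\int_0^T -\ln p(o_t \mid m) \, dt$ and $\int_0^\infty e^{-\gamma t} \left( -\ln p(o_t \mid m) \right) dt$ for some horizon $T < \infty$ or discount rate $\gamma > 0$.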
