Using subgoals to reduce the descriptive complexity of probabilistic inference and control programs

Humans and other animals can flexibly select among internally generated goals and form plans to achieve them, yet the neuronal and computational principles governing these abilities remain incompletely understood. In computational neuroscience, goal-directed decision-making has been linked to model-based reinforcement learning, which uses a model of the task to predict the outcomes of possible courses of action and to select flexibly among them. In principle, this approach permits planning optimal action sequences. However, model-based computations become prohibitive for large state spaces, and several methods have been proposed to simplify them. In hierarchical reinforcement learning, temporal abstraction methods such as the Options framework split the search space by learning reusable macro-actions that achieve subgoals. In this article we offer a normative perspective on the role of subgoals and temporal abstraction in model-based computations. We hypothesize that the main role of subgoals is to reduce the complexity of learning, inference, and control by guiding the selection of more compact control programs. To explore this idea, we adopt a Bayesian formulation of model-based search, planning-as-inference, in which subgoals and their associated policies are selected via probabilistic inference guided by principles of descriptive complexity. We present preliminary results that demonstrate the suitability of the proposed method and discuss links to brain circuits for goal and subgoal processing in the prefrontal cortex.
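To make the combination of planning-as-inference and descriptive complexity concrete, the following is a minimal illustrative sketch, not the authors' implementation: candidate subgoal-structured control programs for a toy stochastic grid world are scored by combining a Monte Carlo estimate of the probability of reaching the goal (the inference term) with a 2^(-description length) prior that favors more compact programs. The environment, program encoding, and all parameters (grid size, slip probability, bit costs) are illustrative assumptions.

# Illustrative sketch only: scoring subgoal-structured programs with a
# planning-as-inference likelihood and a descriptive-complexity prior.
import random
import math

WIDTH, HEIGHT = 6, 6
START, GOAL = (0, 0), (5, 5)
ACTIONS = {"R": (1, 0), "U": (0, 1), "L": (-1, 0), "D": (0, -1)}
SLIP = 0.1          # assumed probability that an action has no effect
N_ROLLOUTS = 2000   # Monte Carlo samples used to estimate success probability
BITS_PER_SYMBOL = math.log2(len(ACTIONS) + 1)  # actions plus a "stop at subgoal" marker

def step(state, action):
    """Apply one (possibly slipping) action, clipping at the grid boundary."""
    if random.random() < SLIP:
        return state
    dx, dy = ACTIONS[action]
    x = min(max(state[0] + dx, 0), WIDTH - 1)
    y = min(max(state[1] + dy, 0), HEIGHT - 1)
    return (x, y)

def rollout(program, horizon=40):
    """Run a program given as (macro_action, target) pairs: repeat each
    macro-action until its target (subgoal) is reached, then move on.
    Return True if the final goal is reached."""
    state = START
    for macro_action, target in program:
        for _ in range(horizon):
            if state == target:
                break
            state = step(state, macro_action)
    return state == GOAL

def description_length(program):
    """Crude code length in bits: one symbol per program step plus the bits
    needed to name each intermediate subgoal."""
    subgoal_bits = math.log2(WIDTH * HEIGHT) * (len(program) - 1)
    return len(program) * BITS_PER_SYMBOL + subgoal_bits

def log_posterior(program):
    """Planning-as-inference score: log P(goal reached | program) plus the log
    of a 2^(-description length) prior over programs."""
    successes = sum(rollout(program) for _ in range(N_ROLLOUTS))
    p_goal = max(successes / N_ROLLOUTS, 1e-9)
    return math.log2(p_goal) - description_length(program)

if __name__ == "__main__":
    candidates = {
        "right-then-up via subgoal (5,0)": [("R", (5, 0)), ("U", GOAL)],
        "up-then-right via subgoal (0,5)": [("U", (0, 5)), ("R", GOAL)],
        "three-legged detour": [("R", (3, 0)), ("U", (3, 5)), ("R", GOAL)],
    }
    for name, program in candidates.items():
        print(f"{name}: log posterior = {log_posterior(program):.2f}")

In this toy setting all three candidate programs reach the goal with high probability, so the descriptive-complexity prior breaks the tie in favor of the two-legged subgoal decompositions over the longer detour, which is the intuition the abstract attributes to subgoals: among programs that achieve the goal, inference prefers the more compact ones.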
