Representational efficiency outweighs action efficiency in human program induction

The importance of hierarchically structured representations for tractable planning has long been acknowledged. However, the questions of how people discover such abstractions and how to define a set of optimal abstractions remain open. This problem has been explored in cognitive science in the problem-solving literature and in computer science in hierarchical reinforcement learning. Here, we emphasize an algorithmic perspective on learning hierarchical representations in which the objective is to efficiently encode the structure of the problem or, equivalently, to learn an algorithm of minimal description length. We introduce a novel problem-solving paradigm that links problem solving and program induction under the Markov Decision Process (MDP) framework. Using this task, we ask whether humans discover hierarchical solutions by maximizing efficiency in the number of actions they generate or by minimizing the complexity of the resulting representation, and we find evidence for the primacy of representational efficiency.
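The contrast between the two objectives can be made concrete with a toy comparison. The sketch below is illustrative only (it is not the authors' task or model): it assumes a hypothetical move alphabet, two hypothetical candidate solutions, and a simple symbol-count measure of description length, and it shows how a solution can be worse in the number of actions executed yet better in how compactly it can be represented as a program with a reusable subroutine.

```python
# Minimal sketch (assumed toy example, not the paper's paradigm):
# a "program" is a pair (subroutines, main), where subroutines maps a
# symbol to a list of primitive moves and main is the top-level sequence.

def description_length(program):
    """Symbols needed to write the program: subroutine bodies plus the main sequence."""
    subroutines, main = program
    return sum(len(body) for body in subroutines.values()) + len(main)

def actions_executed(program):
    """Primitive moves produced when the program is run."""
    subroutines, main = program
    return sum(len(subroutines.get(sym, [sym])) for sym in main)

# Solution A: fewest primitive moves, but no reusable structure.
flat = ({}, list("ULURULURDR"))                        # 10 moves, written out verbatim

# Solution B: two extra moves, but built from one reused subroutine.
hierarchical = ({"S": list("ULUR")}, ["S", "S", "S"])  # 12 moves, 7 symbols to describe

for name, prog in [("flat", flat), ("hierarchical", hierarchical)]:
    print(name, "actions:", actions_executed(prog),
          "description length:", description_length(prog))

# Action efficiency favors the flat solution (10 < 12 moves); representational
# efficiency favors the hierarchical one (7 < 10 symbols). The abstract's claim
# is that human hierarchical solutions track the second quantity.
```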
