Learning Causal Overhypotheses through Exploration in Children and Computational Models

Despite recent progress in reinforcement learning (RL), exploration remains an active area of research. Existing methods often focus on state-based metrics, which do not consider the underlying causal structure of the environment. Recent research has begun to explore RL environments for causal learning, but these environments primarily leverage causal information through causal inference or induction rather than exploration. In contrast, human children, some of the most proficient explorers, have been shown to use causal information to great benefit. In this work, we introduce a novel RL environment with a controllable causal structure, which allows us to evaluate the exploration strategies of both agents and children in a unified environment. In addition, through experiments on both computational models and children, we demonstrate that there are significant differences between information-gain optimal RL exploration in causal environments and the exploration of children in the same environments. We conclude with a discussion of how these findings may inspire new directions of research into efficient exploration and disambiguation of causal structures for RL algorithms.
