HALMA: Humanlike Abstraction Learning Meets Affordance in Rapid Problem Solving

Humans learn compositional and causal abstraction, i.e., knowledge, in response to the structure of naturalistic tasks. When presented with a problem-solving task involving some objects, toddlers would first interact with these objects to reckon what they are and what can be done with them. Leveraging these concepts, they could understand the internal structure of this task, without seeing all of the problem instances. Remarkably, they further build cognitively executable strategies to rapidly solve novel problems. To empower a learning agent with similar capability, we argue there shall be three levels of generalization in how an agent represents its knowledge: perceptual, conceptual, and algorithmic. In this paper, we devise the very first systematic benchmark that offers joint evaluation covering all three levels. This benchmark is centered around a novel task domain, HALMA, for visual concept development and rapid problem solving. Uniquely, HALMA has a minimum yet complete concept space, upon which we introduce a novel paradigm to rigorously diagnose and dissect learning agents’ capability in understanding and generalizing complex and structural concepts. We conduct extensive experiments on reinforcement learning agents with various inductive biases and carefully report their proficiency and weakness.

Song-Chun Zhu | Ying Nian Wu | Sirui Xie | Yixin Zhu | Xiaojian Ma | Peiyu Yu
