Bongard-LOGO: A New Benchmark for Human-Level Concept Learning and Reasoning

Humans have an inherent ability to learn novel concepts from only a few samples and to generalize these concepts to different situations. Even though today's machine learning models excel at standard recognition tasks when given a plethora of training data, a considerable gap remains between machine-level pattern recognition and human-level concept learning. To narrow this gap, the Bongard Problems (BPs) were introduced as an inspirational challenge for visual cognition in intelligent systems. Despite new advances in representation learning and learning-to-learn, BPs remain a daunting challenge for modern AI. Inspired by the original one hundred BPs, we propose a new benchmark, Bongard-LOGO, for human-level concept learning and reasoning. We develop a program-guided generation technique to produce a large set of human-interpretable visual cognition problems in an action-oriented LOGO language. Our benchmark captures three core properties of human cognition: 1) context-dependent perception, in which the same object may have disparate interpretations given different contexts; 2) analogy-making perception, in which some meaningful concepts are traded off for other meaningful concepts; and 3) perception with a few samples but an infinite vocabulary. In experiments, we show that state-of-the-art deep learning methods perform substantially worse than human subjects, implying that they fail to capture core properties of human cognition. Finally, we discuss research directions towards a general architecture for visual reasoning that can tackle this benchmark.
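To make the problem format concrete, the sketch below illustrates one way a Bongard-style instance could be assembled from action programs: a latent concept is expressed as a sequence of stroke types, positive examples are different instantiations of that sequence, and negative examples perturb it. This is a minimal illustration only; the two-stroke vocabulary ("line", "arc"), the parameter ranges, and the 6+6 support split are assumptions made for exposition, not the benchmark's actual generation code or API, and the rendering of each program into an image by a turtle-style LOGO interpreter is omitted.

```python
# Illustrative sketch of a Bongard-style problem built from action programs.
# All names and parameters here are hypothetical, not the Bongard-LOGO API.
import random

ACTIONS = ("line", "arc")  # hypothetical two-stroke vocabulary

def sample_program(action_types):
    """Instantiate a concrete program: each abstract action type gets a
    random stroke length and turning angle, so positive examples vary in
    appearance while sharing the same underlying concept."""
    return [(a, round(random.uniform(0.2, 1.0), 2), random.choice([30, 60, 90, 120]))
            for a in action_types]

def perturb(action_types):
    """Flip one action type so the resulting program violates the concept."""
    types = list(action_types)
    i = random.randrange(len(types))
    types[i] = "arc" if types[i] == "line" else "line"
    return types

def make_problem(num_strokes=4, support=6):
    """Build one problem: a latent concept (the stroke-type sequence), a
    positive and a negative support set, and one held-out query of each kind."""
    concept = [random.choice(ACTIONS) for _ in range(num_strokes)]
    positives = [sample_program(concept) for _ in range(support + 1)]
    negatives = [sample_program(perturb(concept)) for _ in range(support + 1)]
    return {"positive": positives[:support],
            "negative": negatives[:support],
            "query_positive": positives[-1],
            "query_negative": negatives[-1]}

if __name__ == "__main__":
    problem = make_problem()
    print(problem["positive"][0])  # e.g. [('line', 0.55, 90), ('arc', 0.31, 60), ...]
```

A learner is shown only the small support sets and must decide which side each query belongs to, which is why the task stresses concept induction from a few samples rather than pattern recognition over large training sets.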
