Solving Bongard Problems with a Visual Language and Pragmatic Reasoning

More than 50 years ago, Bongard introduced 100 visual concept learning problems as a testbed for intelligent vision systems. These problems are now known as Bongard problems. Although they are well known in the cognitive science and AI communities, only moderate progress has been made towards building systems that can solve a substantial subset of them. In the system presented here, visual features are extracted through image processing and then translated into a symbolic visual vocabulary. We introduce a formal language that allows complex visual concepts to be represented in terms of this vocabulary. Using this language and Bayesian inference, complex visual concepts can be induced from the examples provided in each Bongard problem. Unlike in other concept learning problems, the examples from which concepts are induced in Bongard problems are not random; they are carefully chosen to communicate the concept, so solving them requires pragmatic reasoning. Taking pragmatic reasoning into account, we find good agreement between the concepts with high posterior probability and the solutions formulated by Bongard himself. While this approach is far from solving all Bongard problems, it solves the largest fraction yet.
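The role of pragmatic reasoning in the inference can be illustrated with a toy sketch. This is not the system described above: the two-attribute vocabulary, the four candidate concepts, the description-length prior, and the use of a size-principle ("strong sampling") likelihood as a stand-in for pragmatic reasoning are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical symbolic vocabulary: every scene is a (shape, size) pair.
SHAPES = ("triangle", "circle", "square")
SIZES = ("small", "large")
UNIVERSE = list(product(SHAPES, SIZES))

# Hypothetical candidate concepts: name -> (predicate, description length).
CONCEPTS = {
    "triangle": (lambda s: s[0] == "triangle", 1),
    "not circle": (lambda s: s[0] != "circle", 2),
    "small": (lambda s: s[1] == "small", 1),
    "large": (lambda s: s[1] == "large", 1),
}


def posterior(left, right, strong_sampling=True):
    """Posterior over concepts true of all left scenes and no right scene.

    Prior (assumed): 2 ** -length, so shorter rules are preferred.
    Likelihood (assumed): with strong_sampling, each left example is drawn
    uniformly from the concept's extension, which favours narrow concepts;
    otherwise every consistent concept gets likelihood 1.
    """
    scores = {}
    for name, (pred, length) in CONCEPTS.items():
        consistent = all(pred(s) for s in left) and not any(pred(s) for s in right)
        if not consistent:
            continue
        prior = 2.0 ** -length
        if strong_sampling:
            extension_size = sum(1 for s in UNIVERSE if pred(s))
            likelihood = (1.0 / extension_size) ** len(left)
        else:
            likelihood = 1.0
        scores[name] = prior * likelihood
    total = sum(scores.values())
    if total == 0:
        return {}
    return {name: score / total for name, score in scores.items()}


# A toy "problem": two scenes on the left side, one on the right side.
left = [("triangle", "small"), ("triangle", "large")]
right = [("circle", "small")]
print(posterior(left, right, strong_sampling=False))
print(posterior(left, right, strong_sampling=True))
```

In this toy problem both "triangle" and "not circle" are consistent with the examples. Treating the examples as uninformative leaves "triangle" at a posterior of about 0.67, driven by the prior alone; assuming the examples were chosen from the concept's extension raises it to about 0.89, because a sender who meant "not circle" would be unlikely to show only triangles.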
