One-shot learning by inverting a compositional causal process

People can learn a new visual class from just one example, yet machine learning algorithms typically require hundreds or thousands of examples to tackle the same problems. Here we present a Hierarchical Bayesian model based on compositionality and causality that can learn a wide range of natural (although simple) visual concepts, generalizing in human-like ways from just one image. We evaluated performance on a challenging one-shot classification task, where our model achieved a human-level error rate while substantially outperforming two deep learning models. We also tested the model on another conceptual task, generating new examples of a concept, using a "visual Turing test" to show that it produces human-like performance.

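To make the evaluation setup concrete, the sketch below shows the generic structure of a one-shot classification trial: the model sees a single training image per candidate class and assigns a test image to the class whose learned concept scores it most highly. This is a minimal illustration only; `fit_concept` and `score` are hypothetical placeholders standing in for the paper's generative model, not its actual implementation.

```python
# Minimal sketch of a one-shot classification trial (illustrative only).
# `fit_concept` learns a concept from a single example image;
# `score` measures how well a test image fits a learned concept.
# Both are hypothetical placeholders, not the paper's actual model.

def one_shot_classify(training_examples, test_image, fit_concept, score):
    """Assign test_image to the class whose single training example
    yields the best-fitting concept under `score`."""
    concepts = {label: fit_concept(image)
                for label, image in training_examples.items()}
    return max(concepts, key=lambda label: score(concepts[label], test_image))

def accuracy(trials, fit_concept, score):
    """Fraction of trials classified correctly; each trial is a tuple
    (training_examples, test_image, true_label)."""
    correct = sum(
        one_shot_classify(train, test, fit_concept, score) == true_label
        for train, test, true_label in trials
    )
    return correct / len(trials)
```

Under this framing, the error rates reported in the abstract correspond to `1 - accuracy(...)` averaged over many such trials.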