One-shot learning by inverting a compositional causal process

People can learn a new visual class from just one example, yet machine learning algorithms typically require hundreds or thousands of examples to tackle the same problems. Here we present a Hierarchical Bayesian model based on compositionality and causality that can learn a wide range of natural (although simple) visual concepts, generalizing in human-like ways from just one image. We evaluated performance on a challenging one-shot classification task, where our model achieved a human-level error rate while substantially outperforming two deep learning models. We also tested the model on another conceptual task, generating new examples of a concept, using a "visual Turing test" to show that it produces human-like performance.

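To make the evaluation setup concrete, the sketch below shows the generic structure of a one-shot classification trial: the model sees a single training image per candidate class and assigns a test image to the class whose learned concept scores it most highly. This is a minimal illustration only; `fit_concept` and `score` are hypothetical placeholders standing in for the paper's generative model, not its actual implementation.

```python
# Minimal sketch of a one-shot classification trial (illustrative only).
# `fit_concept` learns a concept from a single example image;
# `score` measures how well a test image fits a learned concept.
# Both are hypothetical placeholders, not the paper's actual model.

def one_shot_classify(training_examples, test_image, fit_concept, score):
    """Assign test_image to the class whose single training example
    yields the best-fitting concept under `score`."""
    concepts = {label: fit_concept(image)
                for label, image in training_examples.items()}
    return max(concepts, key=lambda label: score(concepts[label], test_image))

def accuracy(trials, fit_concept, score):
    """Fraction of trials classified correctly; each trial is a tuple
    (training_examples, test_image, true_label)."""
    correct = sum(
        one_shot_classify(train, test, fit_concept, score) == true_label
        for train, test, true_label in trials
    )
    return correct / len(trials)
```

Under this framing, the error rates reported in the abstract correspond to `1 - accuracy(...)` averaged over many such trials.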