One Shot Learning via Compositions of Meaningful Patches

The task of discriminating one object from another is almost trivial for a human being. However, this task is computationally taxing for most modern machine learning methods, whereas humans perform it with ease given very few examples to learn from. It has been proposed that this quick grasp of a concept may come from knowledge shared between the new example and previously learned examples. We believe that the key to one-shot learning is the sharing of common parts, as each part holds immense amounts of information on how a visual concept is constructed. We propose an unsupervised method for learning a compact dictionary of image patches representing meaningful components of an object. Using those patches as features, we build a compositional model that outperforms a number of popular algorithms on a one-shot learning task. We demonstrate the effectiveness of this approach on hand-written digits and show that this model generalizes to multiple datasets.
