Learning to Learn: Model Regression Networks for Easy Small Sample Learning

We develop a conceptually simple but powerful approach that can learn novel categories from few annotated examples. In this approach, the experience with already learned categories is used to facilitate the learning of novel classes. Our insight is two-fold: (1) there exists a generic, category agnostic transformation from models learned from few samples to models learned from large enough sample sets, and (2) such a transformation could be effectively learned by high-capacity regressors. In particular, we automatically learn the transformation with a deep model regression network on a large collection of model pairs. Experiments demonstrate that encoding this transformation as prior knowledge greatly facilitates the recognition in the small sample size regime on a broad range of tasks, including domain adaptation, fine-grained recognition, action recognition, and scene classification.

[1]  Sebastian Thrun,et al.  Learning One More Thing , 1994, IJCAI.

[2]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[3]  Rich Caruana,et al.  Multitask Learning , 1997, Machine-mediated learning.

[4]  Sebastian Thrun,et al.  Learning to Learn , 1998, Springer US.

[5]  S. Pinker How the Mind Works , 1999, Annals of the New York Academy of Sciences.

[6]  Paul A. Viola,et al.  Learning from one example through shared densities on transforms , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[7]  T. Ben-David,et al.  Exploiting Task Relatedness for Multiple , 2003 .

[8]  Pietro Perona,et al.  A Bayesian approach to unsupervised one-shot learning of object categories , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[9]  Michael Fink,et al.  Object Classification from a Single Example Utilizing Class Relevance Metrics , 2004, NIPS.

[10]  Jonathan Baxter,et al.  A Bayesian/Information Theoretic Model of Learning to Learn via Multiple Task Sampling , 1997, Machine Learning.

[11]  Yair Weiss,et al.  Learning object detection from a small number of examples: the importance of good features , 2004, CVPR 2004.

[12]  Yair Weiss,et al.  Learning From a Small Number of Training Examples by Exploiting Object Categories , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[13]  Neil D. Lawrence,et al.  Extensions of the Informative Vector Machine , 2004, Deterministic and Statistical Methods in Machine Learning.

[14]  Erik G. Learned-Miller,et al.  Building a classification cascade for visual identification from one example , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[15]  Shimon Ullman,et al.  Cross-generalization: learning novel classes from a single example by feature replacement , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[16]  Xiaojin Zhu,et al.  Semi-Supervised Learning Literature Survey , 2005 .

[17]  Shimon Ullman,et al.  Single-example Learning of Novel Classes using Representation by Similarity , 2005, BMVC.

[18]  Gilles Blanchard,et al.  Pattern Recognition from One Example by Chopping , 2005, NIPS.

[19]  Lior Wolf,et al.  Robust boosting for learning from few examples , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[20]  Rich Caruana,et al.  Model compression , 2006, KDD '06.

[21]  Bernhard Schölkopf,et al.  Semi-Supervised Learning (Adaptive Computation and Machine Learning) , 2006 .

[22]  Jason Weston,et al.  Inference with the Universum , 2006, ICML.

[23]  Pietro Perona,et al.  One-shot learning of object categories , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  Bernhard Schölkopf,et al.  Introduction to Semi-Supervised Learning , 2006, Semi-Supervised Learning.

[25]  Kumar Chellapilla,et al.  Personalized handwriting recognition via biased regularization , 2006, ICML.

[26]  Daphna Weinshall,et al.  Learning a kernel function for classification with small training samples , 2006, ICML.

[27]  Andrew Zisserman,et al.  Incremental learning of object detectors using a visual shape alphabet , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[28]  Rong Yan,et al.  Adapting SVM Classifiers to Data with Shifted Distributions , 2007 .

[29]  Shimon Ullman,et al.  Uncovering shared structures in multiclass classification , 2007, ICML '07.

[30]  William-Chandra Tjhi,et al.  Dual Fuzzy-Possibilistic Co-clustering for Document Categorization , 2007 .

[31]  Hal Daumé,et al.  Frustratingly Easy Domain Adaptation , 2007, ACL.

[32]  Daphne Koller,et al.  Learning a meta-level prior for feature relevance from multiple related tasks , 2007, ICML '07.

[33]  Antonio Torralba,et al.  Sharing Visual Features for Multiclass and Multiview Object Detection , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[34]  Andrew Zisserman,et al.  Automated Flower Classification over a Large Number of Classes , 2008, 2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing.

[35]  Chih-Jen Lin,et al.  LIBLINEAR: A Library for Large Linear Classification , 2008, J. Mach. Learn. Res..

[36]  Trevor Darrell,et al.  Transfer learning for image classification with sparse prototype representations , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[37]  Tal Hassner,et al.  The One-Shot similarity kernel , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[38]  Ivor W. Tsang,et al.  Domain adaptation from multiple sources via auxiliary classifiers , 2009, ICML '09.

[39]  Antonio Torralba,et al.  Recognizing indoor scenes , 2009, CVPR.

[40]  Trevor Darrell,et al.  Adapting Visual Category Models to New Domains , 2010, ECCV.

[41]  Joachim Denzler,et al.  One-Shot Learning of Object Categories Using Dependent Gaussian Processes , 2010, DAGM-Symposium.

[42]  Qiang Yang,et al.  A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.

[43]  Joshua B. Tenenbaum,et al.  One shot learning of simple visual concepts , 2011, CogSci.

[44]  Andrew Zisserman,et al.  Tabula rasa: Model transfer for object category detection , 2011, 2011 International Conference on Computer Vision.

[45]  Joshua B. Tenenbaum,et al.  Learning to share visual appearance for multiclass object detection , 2011, CVPR 2011.

[46]  Leonidas J. Guibas,et al.  Human action recognition by learning bases of action attributes and parts , 2011, 2011 International Conference on Computer Vision.

[47]  Antonio Torralba,et al.  Transfer Learning by Borrowing Examples for Multiclass Object Detection , 2011, NIPS.

[48]  Pietro Perona,et al.  The Caltech-UCSD Birds-200-2011 Dataset , 2011 .

[49]  Joshua B. Tenenbaum,et al.  One-Shot Learning with a Hierarchical Nonparametric Bayesian Model , 2011, ICML Unsupervised and Transfer Learning.

[50]  Andrew Zisserman,et al.  Enhancing Exemplar SVMs using Part Level Transfer Regularization , 2012, BMVC.

[51]  Yuan Shi,et al.  Geodesic flow kernel for unsupervised domain adaptation , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[52]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[53]  Erik Rodner,et al.  Visual Transfer Learning: Informal Introduction and Literature Overview , 2012, ArXiv.

[54]  Hossein Mobahi,et al.  Deep Learning via Semi-supervised Embedding , 2012, Neural Networks: Tricks of the Trade.

[55]  Wei Li,et al.  One-shot learning gesture recognition from RGB-D data using bag of features , 2013, J. Mach. Learn. Res..

[56]  Tinne Tuytelaars,et al.  Unsupervised Visual Domain Adaptation Using Subspace Alignment , 2013, 2013 IEEE International Conference on Computer Vision.

[57]  Joshua B. Tenenbaum,et al.  One-shot learning by inverting a compositional causal process , 2013, NIPS.

[58]  Trevor Darrell,et al.  Efficient Learning of Domain-invariant Image Representations , 2013, ICLR.

[59]  Ilja Kuzborskij,et al.  From N to N+1: Multiclass Transfer Incremental Learning , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[60]  Tatiana Tommasi,et al.  Learning to Learn by Exploiting Prior Knowledge , 2013 .

[61]  Marc Sebban,et al.  A Survey on Metric Learning for Feature Vectors and Structured Data , 2013, ArXiv.

[62]  Yoshua Bengio,et al.  How transferable are features in deep neural networks? , 2014, NIPS.

[63]  Andrew Zisserman,et al.  Return of the Devil in the Details: Delving Deep into Convolutional Nets , 2014, BMVC.

[64]  Matthias Scheutz,et al.  Learning to Recognize Novel Objects in One Shot through Human-Robot Interactions in Natural Language Dialogues , 2014, AAAI.

[65]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[66]  Barbara Caputo,et al.  Learning to Learn, from Transfer Learning to Domain Adaptation: A Unifying Perspective , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[67]  Dragomir Anguelov,et al.  Capturing Long-Tail Distributions of Object Subcategories , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[68]  Stefan Carlsson,et al.  CNN Features Off-the-Shelf: An Astounding Baseline for Recognition , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[69]  Trevor Darrell,et al.  One-Shot Adaptation of Supervised Deep Convolutional Models , 2013, ICLR.

[70]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[71]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[72]  Barbara Caputo,et al.  Learning Categories From Few Examples With Multi Model Knowledge Transfer , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[73]  Thomas Brox,et al.  Discriminative Unsupervised Feature Learning with Convolutional Neural Networks , 2014, NIPS.

[74]  John P. Collomosse,et al.  Incremental transfer learning for object recognition in streaming video , 2014, 2014 IEEE International Conference on Image Processing (ICIP).

[75]  Trevor Darrell,et al.  DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition , 2013, ICML.

[76]  Rich Caruana,et al.  Do Deep Nets Really Need to be Deep? , 2013, NIPS.

[77]  Geoffrey E. Hinton,et al.  Distilling the Knowledge in a Neural Network , 2015, ArXiv.

[78]  Martial Hebert,et al.  Model recommendation: Generating object detectors from few samples , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[79]  Pietro Perona,et al.  Tropel: Crowdsourcing Detectors with Minimal Training , 2015, HCOMP.

[80]  Charless C. Fowlkes,et al.  Do We Need More Training Data? , 2015, International Journal of Computer Vision.

[81]  Yair Movshovitz-Attias,et al.  Ontological supervision for fine grained classification of Street View storefronts , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[82]  Terrance E. Boult,et al.  Towards Open World Recognition , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[83]  Rob Fergus,et al.  Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks , 2015, NIPS.

[84]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[85]  Jonathan Tompson,et al.  Unsupervised Learning of Spatiotemporally Coherent Metrics , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[86]  Deva Ramanan,et al.  Articulated pose estimation with tiny synthetic videos , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[87]  Sanja Fidler,et al.  Predicting Deep Zero-Shot Convolutional Neural Networks Using Textual Descriptions , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[88]  Abhinav Gupta,et al.  Unsupervised Learning of Visual Representations Using Videos , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[89]  Joshua B. Tenenbaum,et al.  Human-level concept learning through probabilistic program induction , 2015, Science.

[90]  Yair Movshovitz-Attias,et al.  Dataset Curation through Renders and Ontology Matching , 2015 .

[91]  Thomas Brox,et al.  Learning to generate chairs with convolutional neural networks , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[92]  Atsuto Maki,et al.  From generic to specific deep representations for visual recognition , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[93]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[94]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[95]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[96]  Gregory R. Koch,et al.  Siamese Neural Networks for One-Shot Image Recognition , 2015 .

[97]  Alexei A. Efros,et al.  Unsupervised Visual Representation Learning by Context Prediction , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[98]  Silvio Savarese,et al.  Robust single-view instance recognition , 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).

[99]  Martial Hebert,et al.  Learning by Transferring from Unsupervised Universal Sources , 2016, AAAI.

[100]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[101]  Bharath Hariharan,et al.  Low-shot visual object recognition , 2016, ArXiv.

[102]  Yanwei Fu,et al.  Semi-supervised Vocabulary-Informed Learning , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[103]  Jitendra Malik,et al.  Cross Modal Distillation for Supervision Transfer , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[104]  Martial Hebert,et al.  Shuffle and Learn: Unsupervised Learning Using Temporal Order Verification , 2016, ECCV.

[105]  Allan Jabri,et al.  Learning Visual Features from Large Weakly Supervised Data , 2015, ECCV.

[106]  Michael S. Bernstein,et al.  Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations , 2016, International Journal of Computer Vision.

[107]  Daan Wierstra,et al.  One-shot Learning with Memory-Augmented Neural Networks , 2016, ArXiv.

[108]  Oriol Vinyals,et al.  Matching Networks for One Shot Learning , 2016, NIPS.

[109]  Martial Hebert,et al.  Learning object models from few examples , 2016, SPIE Defense + Security.