Fast and Flexible Multi-Task Classification Using Conditional Neural Adaptive Processes

The goal of this paper is to design image classification systems that, after an initial multi-task training phase, can automatically adapt to new tasks encountered at test time. For this purpose, we introduce an approach based on conditional neural processes for the multi-task classification setting, and establish connections to the meta-learning and few-shot learning literature. The resulting approach, called CNAPs, comprises a classifier whose parameters are modulated by an adaptation network that takes the current task's dataset as input. We demonstrate that CNAPs achieves state-of-the-art results on the challenging Meta-Dataset benchmark, indicating high-quality transfer learning. We show that the approach is robust, avoiding both over-fitting in low-shot regimes and under-fitting in high-shot regimes. Timing experiments reveal that CNAPs is computationally efficient at test time, as it does not involve gradient-based adaptation. Finally, we show that trained models are immediately deployable to continual learning and active learning, where they can outperform existing approaches that do not leverage transfer learning.
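To make the described architecture concrete, below is a minimal, hypothetical PyTorch sketch of the idea: an adaptation network pools the task's support set into a task representation and emits FiLM-style scale and shift parameters that modulate a shared feature extractor, while per-class classifier weights are generated from class-wise support means. The module and variable names (AdaptationNetwork, CNAPsLikeClassifier, etc.) and the toy backbone are illustrative assumptions, not the authors' implementation; the point is that adaptation to a new task requires only a forward pass, with no gradient-based fine-tuning.

```python
# Hypothetical sketch of a CNAPs-like model (not the authors' code).
import torch
import torch.nn as nn


class AdaptationNetwork(nn.Module):
    """Maps a pooled support-set representation to FiLM (gamma, beta) parameters."""
    def __init__(self, feat_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, 2 * feat_dim),  # outputs gamma and beta
        )

    def forward(self, support_feats: torch.Tensor):
        task_rep = support_feats.mean(dim=0)       # permutation-invariant pooling over the support set
        gamma, beta = self.net(task_rep).chunk(2)  # each of shape (feat_dim,)
        return gamma, beta


class CNAPsLikeClassifier(nn.Module):
    def __init__(self, in_dim: int = 64, feat_dim: int = 64):
        super().__init__()
        self.encoder = nn.Linear(in_dim, feat_dim)  # stand-in for a shared image backbone
        self.adapt = AdaptationNetwork(feat_dim)

    def forward(self, support_x, support_y, query_x, num_classes: int):
        # 1. Encode the support set and compute task-specific FiLM parameters.
        s_feats = self.encoder(support_x)
        gamma, beta = self.adapt(s_feats)

        # 2. Modulate features of both support and query examples.
        s_mod = (1 + gamma) * s_feats + beta
        q_mod = (1 + gamma) * self.encoder(query_x) + beta

        # 3. Generate per-class classifier weights from class-wise support means.
        weights = torch.stack([s_mod[support_y == c].mean(dim=0)
                               for c in range(num_classes)])
        return q_mod @ weights.t()  # query logits, shape (num_query, num_classes)


# Usage on a toy 5-way, 5-shot task: adaptation is a single forward pass.
model = CNAPsLikeClassifier(in_dim=64, feat_dim=64)
support_x = torch.randn(25, 64)
support_y = torch.arange(5).repeat_interleave(5)
query_x = torch.randn(10, 64)
logits = model(support_x, support_y, query_x, num_classes=5)
print(logits.shape)  # torch.Size([10, 5])
```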
