Paper information - Spot and Learn: A Maximum-Entropy Patch Sampler for Few-Shot Image Classification

Few-shot learning (FSL) requires learning object categories from only a small amount of training data (the novel classes), while the remaining categories (the base classes) have sufficient data for training. It is often desirable to transfer knowledge from the base classes and to derive dominant features efficiently for the novel samples. In this work, we propose a sampling method that de-correlates an image based on maximum-entropy reinforcement learning and, on every forward pass, extracts a varying sequence of patches that carry discriminative information. This can be viewed as a form of "learned" data augmentation, in the sense that we search for different sequences of patches within an image and perform classification on an aggregation of the extracted features, resulting in improved FSL performance. In addition, our positive and negative sampling policies, together with a newly defined reward function, further improve the effectiveness of our model. Our experiments on two benchmark datasets confirm the effectiveness of our framework and its superiority over recent FSL approaches.
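
The sampling-and-aggregation idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no architectural details, so everything here is an assumption made for illustration. The policy is a Boltzmann (softmax) distribution over hypothetical per-location scores `q_values` with temperature `alpha`, which is the standard form of a maximum-entropy policy. The feature extractor `patch_feature` is a placeholder, and aggregation is a plain average. All function names are hypothetical.

```python
import math
import random


def soft_policy(q_values, alpha=1.0):
    """Boltzmann policy: pi(a) is proportional to exp(Q(a) / alpha).

    A larger alpha gives a flatter (higher-entropy) distribution.
    """
    m = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - m) / alpha) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def sample_patch_sequence(q_values, num_patches, alpha=1.0, rng=random):
    """Sample a sequence of patch indices (with replacement) from the policy."""
    probs = soft_policy(q_values, alpha)
    return rng.choices(range(len(q_values)), weights=probs, k=num_patches)


def crop(image, top, left, size):
    """Cut a size x size patch out of an image stored as a list of rows."""
    return [row[left:left + size] for row in image[top:top + size]]


def patch_feature(patch):
    """Placeholder feature (assumption): mean and max intensity of the patch."""
    flat = [v for row in patch for v in row]
    return [sum(flat) / len(flat), max(flat)]


def aggregate(features):
    """Average the per-patch feature vectors into one image-level vector."""
    n = len(features)
    return [sum(f[i] for f in features) / n for i in range(len(features[0]))]


# Toy usage: a 4x4 "image", four candidate 2x2 patch locations, and
# made-up scores standing in for whatever a trained sampler would produce.
image = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
locations = [(0, 0), (0, 2), (2, 0), (2, 2)]
q_values = [0.1, 0.5, 0.2, 0.9]

rng = random.Random(0)  # fixed seed so the run is repeatable
sequence = sample_patch_sequence(q_values, num_patches=3, alpha=0.5, rng=rng)
features = [patch_feature(crop(image, *locations[i], 2)) for i in sequence]
pooled = aggregate(features)
print(sequence, pooled)
```

Because the patch sequence is re-sampled on each call, repeated forward passes over the same image see different patch sequences, which is the sense in which the abstract describes the sampler as "learned" data augmentation.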
