Paper information - PARN: Position-Aware Relation Networks for Few-Shot Learning

PARN: Position-Aware Relation Networks for Few-Shot Learning

Few-shot learning poses the challenge that a classifier must quickly adapt to new classes that do not appear in the training set, given only a few labeled examples of each new class. This paper proposes a position-aware relation network (PARN) to learn a more flexible and robust metric for few-shot learning. Relation networks (RNs), a kind of architecture for relational reasoning, can learn a deep metric for images when implemented simply as a convolutional neural network (CNN) [23]. However, because of the inherent local connectivity of CNNs, a CNN-based RN can be sensitive to the relative spatial positions of semantic objects in the two images being compared. To address this problem, we introduce a deformable feature extractor (DFE) to extract more efficient features, and design a dual correlation attention mechanism (DCA) to deal with the inherent local connectivity of the CNN. Our approach extends the potential of RNs to be position-aware of semantic objects while introducing only a small number of parameters. We evaluate our approach on two major benchmark datasets, Omniglot and Mini-Imagenet, and achieve state-of-the-art performance on both. Notably, our 5-way 1-shot result on Omniglot even outperforms the previous 5-way 5-shot results.
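The position sensitivity described in the abstract can be illustrated with a toy sketch. The code below is not the paper's DFE or DCA. It is a minimal pure-Python illustration, assuming each feature map is flattened into a list of per-position feature vectors. All function names (`positionwise_score`, `cross_attend`, `attended_score`) and the `temperature` parameter are hypothetical. It compares two maps that contain the same "object" feature at different positions, first position by position, then after a simple softmax cross-correlation step that lets every position of one map look at every position of the other.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def positionwise_score(fa, fb):
    """Mean similarity of features at the SAME position (position-sensitive)."""
    return sum(cosine(a, b) for a, b in zip(fa, fb)) / len(fa)

def cross_attend(fa, fb, temperature=0.1):
    """For each position of fa, return a softmax-weighted average of fb's
    features, weighted by similarity to that position (toy cross-correlation
    attention; an assumption for illustration, not the paper's DCA)."""
    attended = []
    for a in fa:
        logits = [cosine(a, b) / temperature for b in fb]
        m = max(logits)                      # subtract max for stability
        weights = [math.exp(l - m) for l in logits]
        z = sum(weights)
        attended.append(
            [sum(w * b[c] for w, b in zip(weights, fb)) / z
             for c in range(len(a))]
        )
    return attended

def attended_score(fa, fb):
    """Position-wise score after aligning fb to fa via cross attention."""
    return positionwise_score(fa, cross_attend(fa, fb))

# Two 4-position maps with 2 channels: [1, 0] is the "object" feature,
# [0, 1] is background. The object sits at position 0 in fa, position 3 in fb.
fa = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
fb = [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]

print("position-wise:", positionwise_score(fa, fb))
print("with cross attention:", attended_score(fa, fb))
```

In this toy setting the position-wise comparison is penalized by the shift of the object, while the attended comparison is nearly unaffected, which is the intuition behind making a relation network position-aware.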

[1] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.

[2] Luca Bertinetto, et al. Learning feed-forward one-shot learners, 2016, NIPS.

[3] Jian Sun, et al. Bayesian Face Revisited: A Joint Formulation, 2012, ECCV.

[4] Paul A. Viola, et al. Learning from one example through shared densities on transforms, 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[5] Sepp Hochreiter, et al. Learning to Learn Using Gradient Descent, 2001, ICANN.

[6] Abhishek Das, et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization, 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[7] Richard S. Zemel, et al. Prototypical Networks for Few-shot Learning, 2017, NIPS.

[8] Pieter Abbeel, et al. A Simple Neural Attentive Meta-Learner, 2017, ICLR.

[9] Inderjit S. Dhillon, et al. Clustering with Bregman Divergences, 2005, J. Mach. Learn. Res.

[10] Sergey Levine, et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, 2017, ICML.

[11] Gregory R. Koch, et al. Siamese Neural Networks for One-Shot Image Recognition, 2015.

[12] Yi Li, et al. Deformable Convolutional Networks, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[13] Sebastian Thrun, et al. Learning to Learn, 1998, Springer US.

[14] Oriol Vinyals, et al. Matching Networks for One Shot Learning, 2016, NIPS.

[15] Tao Xiang, et al. Learning to Compare: Relation Network for Few-Shot Learning, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[16] Hong Yu, et al. Meta Networks, 2017, ICML.

[17] Ole Winther, et al. Recurrent Relational Networks, 2017, NeurIPS.

[18] Yoshua Bengio, et al. MetaGAN: An Adversarial Approach to Few-Shot Learning, 2018, NeurIPS.

[19] Hui Tang, et al. Fine-Grained Visual Categorization using Meta-Learning Optimization with Sample Selection of Auxiliary Data, 2018, ECCV.

[20] Razvan Pascanu, et al. A simple neural network module for relational reasoning, 2017, NIPS.

[21] Bolei Zhou, et al. Temporal Relational Reasoning in Videos, 2017, ECCV.

[22] Raquel Urtasun, et al. Understanding the Effective Receptive Field in Deep Convolutional Neural Networks, 2016, NIPS.

[23] Rogério Schmidt Feris, et al. Delta-encoder: an effective sample synthesis method for few-shot object recognition, 2018, NeurIPS.

[24] Tao Mei, et al. Memory Matching Networks for One-Shot Image Recognition, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[25] Hugo Larochelle, et al. Optimization as a Model for Few-Shot Learning, 2016, ICLR.

[26] Rob Fergus, et al. Visualizing and Understanding Convolutional Networks, 2013, ECCV.

[27] Joshua B. Tenenbaum, et al. One shot learning of simple visual concepts, 2011, CogSci.

[28] Andrew Zisserman, et al. Spatial Transformer Networks, 2015, NIPS.

[29] Yu-Chiang Frank Wang, et al. A Closer Look at Few-shot Classification, 2019, ICLR.

[30] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[31] Bartunov Sergey, et al. Meta-Learning with Memory-Augmented Neural Networks, 2016.

[32] Gabriela Csurka, et al. Metric Learning for Large Scale Image Classification: Generalizing to New Classes at Near-Zero Cost, 2012, ECCV.

[33] Alexandre Lacoste, et al. TADAM: Task dependent adaptive metric for improved few-shot learning, 2018, NeurIPS.

[34] Eunho Yang, et al. Learning to Propagate Labels: Transductive Propagation Network for Few-Shot Learning, 2018, ICLR.

[35] Abhinav Gupta, et al. Non-local Neural Networks, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.