Augmented Bi-path Network for Few-shot Learning

Few-shot Learning (FSL) which aims to learn from few labeled training data is becoming a popular research topic, due to the expensive labeling cost in many real-world applications. One kind of successful FSL method learns to compare the testing (query) image and training (support) image by simply concatenating the features of two images and feeding it into the neural network. However, with few labeled data in each class, the neural network has difficulty in learning or comparing the local features of two images. Such simple image-level comparison may cause serious mis-classification. To solve this problem, we propose Augmented Bi-path Network (ABNet) for learning to compare both global and local features on multi-scales. Specifically, the salient patches are extracted and embedded as the local features for every image. Then, the model learns to augment the features for better robustness. Finally, the model learns to compare global and local features separately, i.e., in two paths, before merging the similarities. Extensive experiments show that the proposed ABNet outperforms the state-of-the-art methods. Both quantitative and visual ablation studies are provided to verify that the proposed modules lead to more precise comparison results.

[1]  Martial Hebert,et al.  Low-Shot Learning from Imaginary Data , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[2]  Gregory R. Koch,et al.  Siamese Neural Networks for One-Shot Image Recognition , 2015 .

[3]  Joshua B. Tenenbaum,et al.  Meta-Learning for Semi-Supervised Few-Shot Classification , 2018, ICLR.

[4]  Bin Wu,et al.  Deep Meta-Learning: Learning to Learn in the Concept Space , 2018, ArXiv.

[5]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[6]  Pietro Perona,et al.  A Bayesian approach to unsupervised one-shot learning of object categories , 2003, ICCV 2003.

[7]  Huchuan Lu,et al.  Pose-Invariant Embedding for Deep Person Re-Identification , 2017, IEEE Transactions on Image Processing.

[8]  Joshua Achiam,et al.  On First-Order Meta-Learning Algorithms , 2018, ArXiv.

[9]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[11]  Pieter Abbeel,et al.  A Simple Neural Attentive Meta-Learner , 2017, ICLR.

[12]  Bo Zhao,et al.  MSplit LBI: Realizing Feature Selection and Dense Estimation Simultaneously in Few-shot and Zero-shot Learning , 2018, ICML.

[13]  Ahmed M. Elgammal,et al.  SPDA-CNN: Unifying Semantic Part Detection and Abstraction for Fine-Grained Recognition , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Zi Huang,et al.  Multi-attention Network for One Shot Learning , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Quoc V. Le,et al.  EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks , 2019, ICML.

[16]  Tao Xiang,et al.  Learning to Compare: Relation Network for Few-Shot Learning , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[17]  Richard S. Zemel,et al.  Prototypical Networks for Few-shot Learning , 2017, NIPS.

[18]  Rogério Schmidt Feris,et al.  Delta-encoder: an effective sample synthesis method for few-shot object recognition , 2018, NeurIPS.

[19]  Pietro Perona,et al.  One-shot learning of object categories , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Pietro Perona,et al.  Caltech-UCSD Birds 200 , 2010 .

[21]  Yu-Chiang Frank Wang,et al.  Spot and Learn: A Maximum-Entropy Patch Sampler for Few-Shot Image Classification , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  G. Griffin,et al.  Caltech-256 Object Category Dataset , 2007 .

[23]  Yoshua Bengio,et al.  MetaGAN: An Adversarial Approach to Few-Shot Learning , 2018, NeurIPS.

[24]  Oriol Vinyals,et al.  Matching Networks for One Shot Learning , 2016, NIPS.

[25]  Sergey Levine,et al.  Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.

[26]  Xiaogang Wang,et al.  Finding Task-Relevant Features for Few-Shot Learning by Category Traversal , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Luca Bertinetto,et al.  Meta-learning with differentiable closed-form solvers , 2018, ICLR.

[28]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[29]  J. Schulman,et al.  Reptile: a Scalable Metalearning Algorithm , 2018 .

[30]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[31]  Hong Yu,et al.  Meta Networks , 2017, ICML.

[32]  Hugo Larochelle,et al.  Optimization as a Model for Few-Shot Learning , 2016, ICLR.

[33]  Zhe Gan,et al.  Semantic Compositional Networks for Visual Captioning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Koen E. A. van de Sande,et al.  Selective Search for Object Recognition , 2013, International Journal of Computer Vision.

[35]  Martial Hebert,et al.  Learning from Small Sample Sets by Combining Unsupervised Meta-Training with CNNs , 2016, NIPS.

[36]  Pietro Perona,et al.  A Bayesian approach to unsupervised one-shot learning of object categories , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[37]  Subhransu Maji,et al.  Meta-Learning With Differentiable Convex Optimization , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[38]  Pieter Abbeel,et al.  Meta-Learning with Temporal Convolutions , 2017, ArXiv.

[39]  Jordi Pont-Tuset,et al.  The Open Images Dataset V4 , 2018, International Journal of Computer Vision.

[40]  Xiangyang Xue,et al.  Multi-Level Semantic Feature Augmentation for One-Shot Learning , 2018, IEEE Transactions on Image Processing.

[41]  Xiaogang Wang,et al.  DeepFashion: Powering Robust Clothes Recognition and Retrieval with Rich Annotations , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Cheng-Yuan Liou,et al.  Modeling word perception using the Elman network , 2008, Neurocomputing.

[43]  Subhashini Venugopalan,et al.  Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. , 2016, JAMA.

[44]  Daan Wierstra,et al.  Meta-Learning with Memory-Augmented Neural Networks , 2016, ICML.

[45]  Trevor Darrell,et al.  Part-Based R-CNNs for Fine-Grained Category Detection , 2014, ECCV.

[46]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[47]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[48]  Radomír Mech,et al.  Minimum Barrier Salient Object Detection at 80 FPS , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).