Principal characteristic networks for few-shot learning

Abstract Few-shot learning aims to build a classifier that recognizes previously unseen classes given only a few samples of each. Previous studies such as prototypical networks use the mean of the embedded support vectors as the prototype, i.e., the representation of a class, and yield satisfactory results. However, the relative importance of the individual embedded support vectors has not yet been studied, although it is a valuable factor for pushing the limits of few-shot learning. We propose a principal characteristic network that exploits the principal characteristic, computed by assigning weights to the embedded vectors according to their importance, to better express the prototype. The high-level abstract embedded vectors are extracted by our eResNet embedding network. In addition, we propose a mixture loss function that enlarges the inter-class distance in the embedding space for more accurate classification. Extensive experiments demonstrate that our network achieves state-of-the-art results on the Omniglot, miniImageNet and Cifar100 datasets.
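The core idea above, replacing the plain mean prototype of prototypical networks with an importance-weighted combination of embedded support vectors, can be sketched as follows. The abstract does not specify the actual weighting scheme, so the similarity-based softmax weights used here are purely an illustrative assumption, not the paper's method:

```python
import numpy as np

def mean_prototype(support):
    # Prototypical-network prototype: unweighted mean of the
    # embedded support vectors of one class.
    return support.mean(axis=0)

def weighted_prototype(support):
    # Illustrative "principal characteristic" prototype: weight each
    # embedded support vector by its cosine similarity to the mean,
    # normalized with a softmax, so more representative vectors count more.
    # NOTE: hypothetical stand-in weighting; the paper's scheme may differ.
    mean = support.mean(axis=0)
    sims = support @ mean / (
        np.linalg.norm(support, axis=1) * np.linalg.norm(mean) + 1e-12)
    weights = np.exp(sims) / np.exp(sims).sum()
    return weights @ support  # convex combination of support vectors

# Example: 5 embedded support vectors of dimension 8 for one class.
rng = np.random.default_rng(0)
support = rng.normal(size=(5, 8))
proto = weighted_prototype(support)
assert proto.shape == (8,)
```

When every support vector is equally representative the weights collapse to uniform and the weighted prototype coincides with the mean; the two only differ when some embeddings are outliers relative to the class.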
