Transductive Maximum Margin Classifier for Few-Shot Learning

Few-shot learning aims to train a classifier that can generalize well when just a small number of labeled samples per class are given. We introduce Transductive Maximum Margin Classifier (TMMC) for few-shot learning. The basic idea of the classical maximum margin classifier is to solve an optimal prediction function that the corresponding separating hyperplane can correctly divide the training data and the resulting classifier has the largest geometric margin. In few-shot learning scenarios, the training samples are scarce, not enough to find a separating hyperplane with good generalization ability on unseen data. TMMC is constructed using a mixture of the labeled support set and the unlabeled query set in a given task. The unlabeled samples in the query set can adjust the separating hyperplane so that the prediction function is optimal on both the labeled and unlabeled samples. Furthermore, we leverage an efficient and effective quasi-Newton algorithm, the L-BFGS method to optimize TMMC. Experimental results on three standard few-shot learning benchmarks including miniImagenet, tieredImagenet and CUB suggest that our TMMC achieves state-of-the-art accuracies.

[1]  Nikos Komodakis,et al.  Dynamic Few-Shot Visual Learning Without Forgetting , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[2]  Alexander Zien,et al.  Semi-Supervised Classification by Low Density Separation , 2005, AISTATS.

[3]  Taesup Kim,et al.  Edge-Labeling Graph Neural Network for Few-Shot Learning , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Gregory R. Koch,et al.  Siamese Neural Networks for One-Shot Image Recognition , 2015 .

[5]  Bernt Schiele,et al.  Meta-Transfer Learning for Few-Shot Learning , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Joshua B. Tenenbaum,et al.  Meta-Learning for Semi-Supervised Few-Shot Classification , 2018, ICLR.

[7]  Yee Whye Teh,et al.  Conditional Neural Processes , 2018, ICML.

[8]  Oriol Vinyals,et al.  Matching Networks for One Shot Learning , 2016, NIPS.

[9]  Stefano Soatto,et al.  A Baseline for Few-Shot Image Classification , 2019, ICLR.

[10]  Jose Dolz,et al.  Laplacian Regularized Few-Shot Learning , 2020, ICML.

[11]  Seungjin Choi,et al.  Gradient-Based Meta-Learning with Learned Layerwise Metric and Subspace , 2018, ICML.

[12]  Yu-Chiang Frank Wang,et al.  A Closer Look at Few-shot Classification , 2019, ICLR.

[13]  Ayhan Demiriz,et al.  Semi-Supervised Support Vector Machines , 1998, NIPS.

[14]  Alexandre Lacoste,et al.  TADAM: Task dependent adaptive metric for improved few-shot learning , 2018, NeurIPS.

[15]  Wei Shen,et al.  Few-Shot Image Recognition by Predicting Parameters from Activations , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[16]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  D K Smith,et al.  Numerical Optimization , 2001, J. Oper. Res. Soc..

[18]  Frank Nielsen,et al.  Guaranteed Bounds on Information-Theoretic Measures of Univariate Mixtures Using Piecewise Log-Sum-Exp Inequalities , 2016, Entropy.

[19]  Pietro Perona,et al.  The Caltech-UCSD Birds-200-2011 Dataset , 2011 .

[20]  Sergey Levine,et al.  Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.

[21]  Tong Zhang,et al.  Text Categorization Based on Regularized Linear Classification Methods , 2001, Information Retrieval.

[22]  Bin Wu,et al.  Deep Meta-Learning: Learning to Learn in the Concept Space , 2018, ArXiv.

[23]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[24]  Fei Sha,et al.  Few-Shot Learning via Embedding Adaptation With Set-to-Set Functions , 2018, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Pieter Abbeel,et al.  A Simple Neural Attentive Meta-Learner , 2017, ICLR.

[27]  Jinlu Liu,et al.  Prototype Rectification for Few-Shot Learning , 2020, ECCV.

[28]  Patrick Pérez,et al.  Boosting Few-Shot Visual Learning With Self-Supervision , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[29]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[30]  Stan Matwin,et al.  Learning to Learn with Conditional Class Dependencies , 2018, ICLR.

[31]  Eunho Yang,et al.  Learning to Propagate Labels: Transductive Propagation Network for Few-Shot Learning , 2018, ICLR.

[32]  Daan Wierstra,et al.  Meta-Learning with Memory-Augmented Neural Networks , 2016, ICML.

[33]  Hang Li,et al.  Meta-SGD: Learning to Learn Quickly for Few Shot Learning , 2017, ArXiv.

[34]  知秀 柴田 5分で分かる!? 有名論文ナナメ読み:Jacob Devlin et al. : BERT : Pre-training of Deep Bidirectional Transformers for Language Understanding , 2020 .

[35]  Tao Xiang,et al.  Learning to Compare: Relation Network for Few-Shot Learning , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[36]  Razvan Pascanu,et al.  Meta-Learning with Latent Embedding Optimization , 2018, ICLR.

[37]  Richard S. Zemel,et al.  Prototypical Networks for Few-shot Learning , 2017, NIPS.

[38]  Hong Yu,et al.  Meta Networks , 2017, ICML.

[39]  Nikos Komodakis,et al.  Wide Residual Networks , 2016, BMVC.

[40]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[41]  Tsendsuren Munkhdalai,et al.  Rapid Adaptation with Conditionally Shifted Neurons , 2017, ICML.

[42]  Alex Graves,et al.  Neural Turing Machines , 2014, ArXiv.

[43]  Bernhard Schölkopf,et al.  Discriminative k-shot learning using probabilistic models , 2017, ArXiv.

[44]  Oliver Kramer,et al.  Sparse Quasi-Newton Optimization for Semi-supervised Support Vector Machines , 2012, ICPRAM.

[45]  Xilin Chen,et al.  Cross Attention Network for Few-shot Classification , 2019, NeurIPS.

[46]  Tsendsuren Munkhdalai,et al.  Learning Rapid-Temporal Adaptations , 2017, ArXiv.

[47]  Bartunov Sergey,et al.  Meta-Learning with Memory-Augmented Neural Networks , 2016 .

[48]  Bingbing Ni,et al.  Variational Few-Shot Learning , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[49]  Subhransu Maji,et al.  Meta-Learning With Differentiable Convex Optimization , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[50]  Neil D. Lawrence,et al.  Empirical Bayes Transductive Meta-Learning with Synthetic Gradients , 2020, ICLR.

[51]  Feiyue Huang,et al.  LGM-Net: Learning to Generate Matching Networks for Few-Shot Learning , 2019, ICML.

[52]  Yonghong Tian,et al.  Transductive Episodic-Wise Adaptive Metric for Few-Shot Learning , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[53]  Joshua Achiam,et al.  On First-Order Meta-Learning Algorithms , 2018, ArXiv.