Pareto Self-Supervised Training for Few-Shot Learning

While few-shot learning (FSL) aims for rapid generalization to new concepts with little supervision, self-supervised learning (SSL) constructs supervisory signals directly computed from unlabeled data. Exploiting the complementarity of these two manners, few-shot auxiliary learning has recently drawn much attention to deal with few labeled data. Previous works benefit from sharing inductive bias between the main task (FSL) and auxiliary tasks (SSL), where the shared parameters of tasks are optimized by minimizing a linear combination of task losses. However, it is challenging to select a proper weight to balance tasks and reduce task conflict. To handle the problem as a whole, we propose a novel approach named as Pareto self-supervised training (PSST) for FSL. PSST explicitly decomposes the few-shot auxiliary problem into multiple constrained multi-objective subproblems with different trade-off preferences, and here a preference region in which the main task achieves the best performance is identified. Then, an effective preferred Pareto exploration is proposed to find a set of optimal solutions in such a preference region. Extensive experiments on several public benchmark datasets validate the effectiveness of our approach by achieving state-of-the-art performance.

[1]  Ann Nowé,et al.  Multi-objective reinforcement learning using sets of pareto dominating policies , 2014, J. Mach. Learn. Res..

[2]  Fabio Maria Carlucci,et al.  Domain Generalization by Solving Jigsaw Puzzles , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Alexei A. Efros,et al.  Context Encoders: Feature Learning by Inpainting , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Nikos Komodakis,et al.  Unsupervised Representation Learning by Predicting Image Rotations , 2018, ICLR.

[5]  Stefano Soatto,et al.  Few-Shot Learning With Embedded Class Models and Shot-Free Meta Training , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[6]  Alexander Kolesnikov,et al.  S4L: Self-Supervised Semi-Supervised Learning , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[7]  Alexandre Lacoste,et al.  TADAM: Task dependent adaptive metric for improved few-shot learning , 2018, NeurIPS.

[8]  Patrick Pérez,et al.  Boosting Few-Shot Visual Learning With Self-Supervision , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[9]  Thomas Brox,et al.  Discriminative Unsupervised Feature Learning with Convolutional Neural Networks , 2014, NIPS.

[10]  Jörg Fliege,et al.  Steepest descent methods for multicriteria optimization , 2000, Math. Methods Oper. Res..

[11]  Dipti Srinivasan,et al.  A Survey of Multiobjective Evolutionary Algorithms Based on Decomposition , 2017, IEEE Transactions on Evolutionary Computation.

[12]  Jörg Fliege,et al.  A Method for Constrained Multiobjective Optimization Based on SQP Techniques , 2016, SIAM J. Optim..

[13]  Qingfu Zhang,et al.  Pareto Multi-Task Learning , 2019, NeurIPS.

[14]  Daniel Hern'andez-Lobato,et al.  Predictive Entropy Search for Multi-objective Bayesian Optimization with Constraints , 2016, Neurocomputing.

[15]  Cordelia Schmid,et al.  Diversity With Cooperation: Ensemble Methods for Few-Shot Classification , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[16]  Pieter Abbeel,et al.  A Simple Neural Attentive Meta-Learner , 2017, ICLR.

[17]  Joan Bruna,et al.  Few-Shot Learning with Graph Neural Networks , 2017, ICLR.

[18]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[19]  Nikos Komodakis,et al.  Dynamic Few-Shot Visual Learning Without Forgetting , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[20]  Alexei A. Efros,et al.  Unsupervised Visual Representation Learning by Context Prediction , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[21]  Zaiqiao Meng,et al.  Dynamic Collaborative Recurrent Learning , 2019, CIKM.

[22]  Kaiming He,et al.  Momentum Contrast for Unsupervised Visual Representation Learning , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Armand Joulin,et al.  Unsupervised Learning by Predicting Noise , 2017, ICML.

[24]  Subhransu Maji,et al.  When Does Self-supervision Improve Few-shot Learning? , 2020, ECCV.

[25]  Hong Shen,et al.  Neural Variational Hybrid Collaborative Filtering , 2018, ArXiv.

[26]  Frank Hutter,et al.  Efficient Multi-Objective Neural Architecture Search via Lamarckian Evolution , 2018, ICLR.

[27]  Julien Mairal,et al.  Unsupervised Pre-Training of Image Features on Non-Curated Data , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[28]  Tao Xiang,et al.  Learning to Compare: Relation Network for Few-Shot Learning , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[29]  Richard S. Zemel,et al.  Prototypical Networks for Few-shot Learning , 2017, NIPS.

[30]  Andreas Krause,et al.  Active Learning for Multi-Objective Optimization , 2013, ICML.

[31]  Subhransu Maji,et al.  Meta-Learning With Differentiable Convex Optimization , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Oriol Vinyals,et al.  Matching Networks for One Shot Learning , 2016, NIPS.

[33]  Luca Bertinetto,et al.  Meta-learning with differentiable closed-form solvers , 2018, ICLR.

[34]  Ngai-Man Cheung,et al.  Attentive Weights Generation for Few Shot Learning via Information Maximization , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Alexander Kolesnikov,et al.  Revisiting Self-Supervised Visual Representation Learning , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Vladlen Koltun,et al.  Multi-Task Learning as Multi-Objective Optimization , 2018, NeurIPS.

[37]  Geoffrey E. Hinton,et al.  Dynamic Routing Between Capsules , 2017, NIPS.

[38]  Donglin Wang,et al.  Multi-Initialization Meta-Learning with Domain Adaptation , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[39]  C. Hillermeier Generalized Homotopy Approach to Multiobjective Optimization , 2001 .

[40]  C. Hillermeier Nonlinear Multiobjective Optimization: A Generalized Homotopy Approach , 2001 .

[41]  Shuguang Cui,et al.  PointASNL: Robust Point Clouds Processing Using Nonlocal Neural Networks With Adaptive Sampling , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Wei Shen,et al.  Few-Shot Image Recognition by Predicting Parameters from Activations , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[43]  Ziqing Xu,et al.  Deep Transfer Tensor Decomposition with Orthogonal Constraint for Recommender Systems , 2021, AAAI.

[44]  Sergey Levine,et al.  Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.

[45]  Hong Yu,et al.  Meta Networks , 2017, ICML.

[46]  Donglin Wang,et al.  Improving Cold-Start Recommendation via Multi-prior Meta-learning , 2021, European Conference on Information Retrieval.

[47]  Sung Ju Hwang,et al.  Self-supervised Label Augmentation via Input Transformations , 2019, ICML.

[48]  Eckart Zitzler,et al.  Evolutionary algorithms for multiobjective optimization: methods and applications , 1999 .

[49]  Qingfu Zhang,et al.  MOEA/D: A Multiobjective Evolutionary Algorithm Based on Decomposition , 2007, IEEE Transactions on Evolutionary Computation.