Simultaneous Perturbation Stochastic Approximation for Few-Shot Learning

Few-shot learning is an important area of machine learning in which a classifier must be trained so that it can adapt to new classes that are not included in the training set, even though only a few examples of each class are available. This scarcity of data is one of the key problems for learning algorithms of this type and leads to significant uncertainty. We attack this problem via randomized stochastic approximation. In this paper, we introduce a new multi-task loss function and propose an SPSA-like few-shot learning approach based on the prototypical networks method. We provide a theoretical justification for this approach and an analysis of experiments. Experimental results on a benchmark dataset demonstrate that the proposed method outperforms the original prototypical networks.
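For readers unfamiliar with simultaneous perturbation stochastic approximation (SPSA), the following is a minimal sketch of the generic SPSA update on a toy quadratic loss. It is not the method proposed in the paper: the function names `spsa_gradient` and `spsa_minimize`, the Rademacher perturbation, the gain-decay exponents, and the toy loss are all assumptions chosen for illustration. The key idea it shows is that the gradient is estimated from only two loss evaluations per step, regardless of the number of parameters.

```python
import numpy as np


def spsa_gradient(loss, theta, c, rng):
    """Two-measurement SPSA gradient estimate (illustrative, hypothetical helper)."""
    # Perturb all coordinates at once with independent +/-1 (Rademacher) signs.
    delta = rng.choice([-1.0, 1.0], size=theta.shape)
    diff = loss(theta + c * delta) - loss(theta - c * delta)
    # Element-wise division gives one estimate per coordinate.
    return diff / (2.0 * c * delta)


def spsa_minimize(loss, theta0, n_iter=2000, a=0.1, c=0.1, seed=0):
    """Minimize `loss` with decaying step and perturbation sizes (assumed schedule)."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float).copy()
    for k in range(1, n_iter + 1):
        a_k = a / k ** 0.602  # step-size gain, commonly used decay exponent
        c_k = c / k ** 0.101  # perturbation size, commonly used decay exponent
        theta = theta - a_k * spsa_gradient(loss, theta, c_k, rng)
    return theta


# Toy usage: recover the minimizer of a simple quadratic.
target = np.array([1.0, -2.0])


def quadratic_loss(theta):
    return float(np.sum((theta - target) ** 2))


theta_hat = spsa_minimize(quadratic_loss, theta0=[0.0, 0.0])
```

In a few-shot setting the loss would instead be evaluated on sampled episodes, so each call to `loss` would be noisy; the sketch above uses a noise-free loss only to keep the example self-contained.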
