Prototype Propagation Networks (PPN) for Weakly-supervised Few-shot Learning on Category Graph

Many machine learning applications require rapid learning from a limited number of labeled examples, yet the success of most current models rests on heavy training over big data. Meta-learning addresses this problem by extracting common knowledge across different tasks that can be quickly adapted to new tasks. However, most meta-learning methods do not fully exploit weakly-supervised information, which is usually free or cheap to collect. In this paper, we show that weakly-labeled data can significantly improve the performance of meta-learning on few-shot classification. We propose the prototype propagation network (PPN), which is trained on few-shot tasks together with data annotated only by coarse labels. Given a category graph over the targeted fine classes and some weakly-labeled coarse classes, PPN learns an attention mechanism that propagates the prototype of one class to another along the graph, so that a K-nearest-neighbor (KNN) classifier defined on the propagated prototypes achieves high accuracy across different few-shot tasks. Training tasks are generated by subgraph sampling, and the training objective is obtained by accumulating the level-wise classification losses on the subgraph. The resulting graph of prototypes can be continually reused and updated for new tasks and classes. We also introduce two practical test/inference settings, which differ in whether the test task can leverage weakly-supervised information as in training. On two benchmarks, PPN significantly outperforms recent few-shot learning methods in both settings, even when those methods are also allowed to train on the weakly-labeled data.
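To make the mechanism concrete, below is a minimal PyTorch sketch of attention-weighted prototype propagation on a category graph, followed by nearest-prototype classification. This is an illustrative assumption of the scheme described above, not the authors' implementation: the single-step parent-to-child propagation, the blending weight `lam`, the `parents` dictionary, and all names here are hypothetical.

```python
import torch
import torch.nn.functional as F

class PrototypePropagation(torch.nn.Module):
    """One step of attention-weighted prototype propagation on a category graph.

    Hypothetical sketch: each class prototype is blended with an attention-weighted
    combination of its parents' prototypes; `lam` controls the blend.
    """
    def __init__(self, dim, lam=0.5):
        super().__init__()
        self.query = torch.nn.Linear(dim, dim, bias=False)  # learned attention transforms
        self.key = torch.nn.Linear(dim, dim, bias=False)
        self.lam = lam  # assumed blending weight, not from the paper

    def forward(self, protos, parents):
        # protos:  (C, d) per-class prototypes (e.g., class means of embedded support images)
        # parents: dict mapping each class index to a list of parent class indices
        out = protos.clone()
        for c, ps in parents.items():
            if not ps:
                continue
            q = self.query(protos[c])                         # (d,)
            k = self.key(protos[ps])                          # (|ps|, d)
            att = F.softmax(k @ q / q.shape[0] ** 0.5, dim=0)  # attention over parents
            out[c] = self.lam * protos[c] + (1 - self.lam) * att @ protos[ps]
        return out

def knn_logits(query_emb, protos):
    # Nearest-prototype classification: negative squared Euclidean distance as logits.
    return -torch.cdist(query_emb, protos) ** 2

# Usage on a toy two-level graph: classes 0-2 are coarse, classes 3-4 are fine
# classes whose parents are coarse classes (all indices are illustrative).
protos = torch.randn(5, 8)
prop = PrototypePropagation(dim=8)
refined = prop(protos, parents={0: [], 1: [], 2: [], 3: [0, 1], 4: [1]})
logits = knn_logits(torch.randn(10, 8), refined)  # classify 10 query embeddings
```

In this sketch, propagation lets a fine class with only a few (or zero) labeled examples borrow statistical strength from its weakly-labeled coarse ancestors, which is the intuition behind training on the category graph.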
