Semi-supervised Drifted Stream Learning with Short Lookback

In many scenarios, 1) data streams are generated in real time; 2) labeled data are expensive, and only a limited number of labels are available at the beginning; 3) real-world data are not always i.i.d., and the data distribution drifts gradually over time; 4) storage for historical streams is limited. This learning setting restricts the applicability of many Machine Learning (ML) algorithms. We formalize the learning task under this setting as the semi-supervised drifted stream learning with short lookback (SDSL) problem. SDSL poses two under-addressed challenges for existing methods in semi-supervised learning and continuous learning: 1) robust pseudo-labeling under gradual shifts and 2) anti-forgetting adaptation with short lookback. To tackle these challenges, we propose a principled and generic generation-replay framework for SDSL. To achieve robust pseudo-labeling, we develop a novel pseudo-label classification model that leverages supervised knowledge of previously labeled data, unsupervised knowledge of new data, and structural knowledge of invariant label semantics. To achieve adaptive anti-forgetting model replay, we view the anti-forgetting adaptation task as a flat region search problem. We propose a novel minimax game-based replay objective function to solve the flat region search problem and develop an effective optimization solver. Experimental results demonstrate the effectiveness of the proposed method.
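
The abstract does not spell out the replay objective, so the following is only a minimal sketch of what a generic minimax "flat region search" can look like, not the authors' formulation. It assumes a sharpness-aware style objective, min over w of max over ||e|| <= rho of L(w + e), applied to a toy logistic regression. The names `loss_and_grad` and `minimax_step`, the radius `rho`, the learning rate `lr`, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def loss_and_grad(w, X, y):
    """Mean logistic loss and its gradient for labels y in {0, 1}."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    tiny = 1e-12  # guards against log(0)
    loss = -np.mean(y * np.log(p + tiny) + (1 - y) * np.log(1 - p + tiny))
    grad = X.T @ (p - y) / len(y)
    return loss, grad

def minimax_step(w, X, y, rho=0.05, lr=0.1):
    """One first-order step on min_w max_{||e|| <= rho} L(w + e).

    Assumption: the inner maximization is approximated by a single
    normalized gradient-ascent step of length rho.
    """
    _, g = loss_and_grad(w, X, y)
    e = rho * g / (np.linalg.norm(g) + 1e-12)  # inner max: worst-case perturbation
    _, g_adv = loss_and_grad(w + e, X, y)      # gradient at the perturbed weights
    return w - lr * g_adv                      # outer min: descend on that gradient

# Hypothetical toy data: two features, linearly separable labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

w = np.zeros(2)
initial_loss, _ = loss_and_grad(w, X, y)
for _ in range(100):
    w = minimax_step(w, X, y)
final_loss, _ = loss_and_grad(w, X, y)
```

The design choice illustrated here is that each update follows the gradient at the worst-case nearby point rather than at the current weights, which favors parameters whose neighborhood also has low loss. How the proposed method defines the inner player, the perturbation set, and the replayed data is not stated in the abstract and is left open in this sketch.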
