Task-agnostic Continual Learning with Hybrid Probabilistic Models

Learning new tasks continuously, without forgetting, under a constantly changing data distribution is essential for real-world problems but extremely challenging for modern deep learning. In this work we propose HCL, a Hybrid generative-discriminative approach to Continual Learning for classification. We model the distribution of each task and each class with a normalizing flow. The flow is used to learn the data distribution, perform classification, identify task changes, and avoid forgetting, all leveraging the invertibility and exact likelihood that are uniquely enabled by normalizing flows. We use the generative capabilities of the flow to avoid catastrophic forgetting through generative replay and a novel functional regularization technique. For task identification, we use state-of-the-art anomaly detection techniques based on measuring the typicality of the model's statistics. We demonstrate the strong performance of HCL on a range of continual learning benchmarks such as split-MNIST, split-CIFAR, and SVHN-MNIST.
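
To make the approach concrete, below is a minimal sketch of two of the core mechanisms described above: classification by exact per-class flow likelihoods, and generative replay from the flows of previously seen classes. The `Flow` interface with `log_prob(x)` and `sample(n)` is a hypothetical stand-in (e.g. a RealNVP- or Glow-style model), and the code is an illustration of the idea rather than the authors' implementation.

```python
import torch

class HybridFlowClassifier:
    def __init__(self, flows):
        # flows: dict mapping class label -> normalizing flow modeling p_k(x)
        self.flows = flows

    def classify(self, x):
        # Predict the class whose flow assigns x the highest exact
        # log-likelihood (Bayes rule under a uniform class prior).
        labels = list(self.flows.keys())
        log_probs = torch.stack(
            [self.flows[k].log_prob(x) for k in labels], dim=1
        )  # shape: (batch, num_classes)
        return [labels[int(i)] for i in log_probs.argmax(dim=1)]

    def replay_batch(self, n_per_class):
        # Generative replay: sample labeled data from the flows of
        # previously seen classes to rehearse alongside the current task.
        xs, ys = [], []
        for label, flow in self.flows.items():
            xs.append(flow.sample(n_per_class))
            ys.extend([label] * n_per_class)
        return torch.cat(xs), ys
```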

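Task-change detection can likewise be sketched. Under a typicality test, the per-batch average log-likelihood of in-distribution data concentrates around a typical value, so a batch whose statistic falls outside a calibrated band signals a new task. Everything below is a hedged sketch: the `n_sigma` band, the batch statistic, the flow's `forward` method, and the particular form of the functional regularizer (penalizing drift of the flow's latent mapping from a frozen snapshot on replayed points) are assumptions, not the paper's exact procedure.

```python
import torch

def fit_typicality_stats(flow, loader):
    # Calibrate the typicality test: estimate the mean and standard
    # deviation of the per-batch average log-likelihood on batches
    # drawn from the current task.
    scores = torch.tensor([flow.log_prob(x).mean().item() for x, _ in loader])
    return scores.mean().item(), scores.std().item()

def is_task_change(flow, batch, mean, std, n_sigma=3.0):
    # Flag a task change when the batch's average log-likelihood is
    # atypical, i.e. falls outside an n_sigma band around the mean.
    score = flow.log_prob(batch).mean().item()
    return abs(score - mean) > n_sigma * std

def functional_reg_loss(flow, frozen_flow, x_replay):
    # One plausible functional regularizer: keep the current flow's
    # latent mapping close to a frozen snapshot of itself on replayed
    # points, so the learned function changes little on old data.
    z_new = flow.forward(x_replay)
    with torch.no_grad():
        z_old = frozen_flow.forward(x_replay)
    return ((z_new - z_old) ** 2).mean()
```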