Memory-Efficient Incremental Learning Through Feature Adaptation

We introduce an approach for incremental learning that preserves feature descriptors of training images from previously learned classes, instead of the images themselves, unlike most existing work. Keeping the much lower-dimensional feature embeddings of images reduces the memory footprint significantly. We assume that the model is updated incrementally for new classes as new data becomes available sequentially.This requires adapting the previously stored feature vectors to the updated feature space without having access to the corresponding original training images. Feature adaptation is learned with a multi-layer perceptron, which is trained on feature pairs corresponding to the outputs of the original and updated network on a training image. We validate experimentally that such a transformation generalizes well to the features of the previous set of classes, and maps features to a discriminative subspace in the feature space. As a result, the classifier is optimized jointly over new and old classes without requiring old class images. Experimental results show that our method achieves state-of-the-art classification accuracy in incremental learning benchmarks, while having at least an order of magnitude lower memory footprint compared to image-preserving strategies.

[1]  Yandong Guo,et al.  Large Scale Incremental Learning , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Yoshua Bengio,et al.  An Empirical Investigation of Catastrophic Forgeting in Gradient-Based Neural Networks , 2013, ICLR.

[3]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[4]  Marc'Aurelio Ranzato,et al.  Gradient Episodic Memory for Continual Learning , 2017, NIPS.

[5]  Marcus Rohrbach,et al.  Memory Aware Synapses: Learning what (not) to forget , 2017, ECCV.

[6]  Gert Cauwenberghs,et al.  Incremental and Decremental Support Vector Machine Learning , 2000, NIPS.

[7]  Joost van de Weijer,et al.  Rotate your Networks: Better Weight Consolidation and Less Catastrophic Forgetting , 2018, 2018 24th International Conference on Pattern Recognition (ICPR).

[8]  Matthijs Douze,et al.  Deep Clustering for Unsupervised Learning of Visual Features , 2018, ECCV.

[9]  Max Welling,et al.  Herding dynamical weights to learn , 2009, ICML '09.

[10]  Albert Gordo,et al.  End-to-End Learning of Deep Visual Representations for Image Retrieval , 2016, International Journal of Computer Vision.

[11]  Anthony V. Robins,et al.  Catastrophic Forgetting, Rehearsal and Pseudorehearsal , 1995, Connect. Sci..

[12]  Ying Fu,et al.  Incremental Learning Using Conditional Adversarial Networks , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[13]  Gabriela Csurka,et al.  Distance-Based Image Classification: Generalizing to New Classes at Near-Zero Cost , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[15]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Joost van de Weijer,et al.  Semantic Drift Compensation for Class-Incremental Learning , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Philip H. S. Torr,et al.  Riemannian Walk for Incremental Learning: Understanding Forgetting and Intransigence , 2018, ECCV.

[18]  Timo Aila,et al.  Temporal Ensembling for Semi-Supervised Learning , 2016, ICLR.

[19]  Razvan Pascanu,et al.  Overcoming catastrophic forgetting in neural networks , 2016, Proceedings of the National Academy of Sciences.

[20]  Giorgos Tolias,et al.  Fine-Tuning CNN Image Retrieval with No Human Annotation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[21]  Yoshua Bengio,et al.  Deep Sparse Rectifier Neural Networks , 2011, AISTATS.

[22]  Christoph H. Lampert,et al.  iCaRL: Incremental Classifier and Representation Learning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Bernt Schiele,et al.  Feature Generating Networks for Zero-Shot Learning , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[24]  Ross B. Girshick,et al.  Mask R-CNN , 2017, 1703.06870.

[25]  Michael McCloskey,et al.  Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem , 1989 .

[26]  Faisal Shafait,et al.  Revisiting Distillation and Incremental Classifier Learning , 2018, ACCV.

[27]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Larry P. Heck,et al.  Class-incremental Learning via Deep Model Consolidation , 2019, 2020 IEEE Winter Conference on Applications of Computer Vision (WACV).

[29]  Geoffrey E. Hinton,et al.  Distilling the Knowledge in a Neural Network , 2015, ArXiv.

[30]  Ronald Kemker,et al.  FearNet: Brain-Inspired Model for Incremental Learning , 2017, ICLR.

[31]  Bharath Hariharan,et al.  Low-Shot Visual Recognition by Shrinking and Hallucinating Features , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[32]  Gaël Varoquaux,et al.  Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..

[33]  Martial Hebert,et al.  Growing a Brain: Fine-Tuning by Increasing Model Capacity , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Shiguang Shan,et al.  Exemplar-Supported Generative Reproduction for Class Incremental Learning , 2018, BMVC.

[35]  Frédéric Jurie,et al.  Generating Visual Representations for Zero-Shot Classification , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).

[36]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[37]  Dahua Lin,et al.  Learning a Unified Classifier Incrementally via Rebalancing , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[38]  Harri Valpola,et al.  Weight-averaged consistency targets improve semi-supervised deep learning results , 2017, ArXiv.

[39]  Nikos Komodakis,et al.  Dynamic Few-Shot Visual Learning Without Forgetting , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[40]  Corinna Cortes,et al.  Support-Vector Networks , 1995, Machine Learning.

[41]  Colin Raffel,et al.  Realistic Evaluation of Deep Semi-Supervised Learning Algorithms , 2018, NeurIPS.

[42]  Yannis Avrithis,et al.  Label Propagation for Deep Semi-Supervised Learning , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[44]  Jiwon Kim,et al.  Continual Learning with Deep Generative Replay , 2017, NIPS.

[45]  Yu-Chiang Frank Wang,et al.  A Closer Look at Few-shot Classification , 2019, ICLR.

[46]  David Duvenaud,et al.  Invertible Residual Networks , 2018, ICML.

[47]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[48]  Razvan Pascanu,et al.  Progressive Neural Networks , 2016, ArXiv.

[49]  Adrian Popescu,et al.  IL2M: Class Incremental Learning With Dual Memory , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[50]  Matthieu Guillaumin,et al.  Incremental Learning of Random Forests for Large-Scale Image Classification , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[51]  Richard S. Zemel,et al.  Prototypical Networks for Few-shot Learning , 2017, NIPS.

[52]  Jianfeng Zhan,et al.  Cosine Normalization: Using Cosine Similarity Instead of Dot Product in Neural Networks , 2017, ICANN.

[53]  Matthijs Douze,et al.  Low-Shot Learning with Large-Scale Diffusion , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[54]  Cordelia Schmid,et al.  End-to-End Incremental Learning , 2018, ECCV.

[55]  Rama Chellappa,et al.  Learning Without Memorizing , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[56]  Bogdan Raducanu,et al.  Memory Replay GANs: Learning to Generate New Categories without Forgetting , 2018, NeurIPS.

[57]  Derek Hoiem,et al.  Learning without Forgetting , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.