Paper Information - Just a Glimpse: Rethinking Temporal Information for Video Continual Learning

Class-incremental learning is one of the most important settings in the study of continual learning, as it closely resembles real-world application scenarios. With a constrained memory size, catastrophic forgetting arises as the number of classes or tasks increases. Continual learning in the video domain poses even more challenges, because video data contains a large number of frames, which places a heavier burden on the replay memory. The current common practice is to sub-sample frames from the video stream and store them in the replay memory. In this paper, we propose SMILE, a novel replay mechanism for effective video continual learning based on individual frames. Through extensive experimentation, we show that under extreme memory constraints, video diversity plays a more significant role than temporal information. Therefore, our method focuses on learning from a small number of frames that represent a large number of unique videos. On three representative video datasets, Kinetics, UCF101, and ActivityNet, the proposed method achieves state-of-the-art performance, outperforming the previous state of the art by up to 21.49%.
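The trade-off between video diversity and temporal information under a fixed frame budget can be illustrated with a small sketch. This is not the authors' SMILE implementation; it only shows, under assumed names and toy data, how storing one frame per video lets the same budget cover many more unique videos than storing several frames per video. The function `build_replay_memory`, the helper `unique_videos`, the even per-class split, and the random frame choice are all hypothetical choices made for illustration.

```python
import random
from typing import Dict, List, Tuple

# A stored memory item: (class label, video id, frame index).
MemoryItem = Tuple[str, str, int]


def build_replay_memory(
    videos: Dict[str, Dict[str, int]],
    budget_frames: int,
    frames_per_video: int = 1,
    seed: int = 0,
) -> List[MemoryItem]:
    """Fill a fixed frame budget, split evenly across classes (an assumption).

    `videos` maps class label -> {video id -> number of frames in that video}.
    """
    rng = random.Random(seed)
    per_class = budget_frames // len(videos)
    memory: List[MemoryItem] = []
    for label, clips in videos.items():
        # How many distinct videos this class can afford at this frame rate.
        n_videos = min(per_class // frames_per_video, len(clips))
        for vid in rng.sample(sorted(clips), n_videos):
            k = min(frames_per_video, clips[vid])
            # Pick k distinct frame indices at random from the video.
            for idx in sorted(rng.sample(range(clips[vid]), k)):
                memory.append((label, vid, idx))
    return memory


def unique_videos(memory: List[MemoryItem]) -> int:
    """Count distinct (class, video) pairs represented in the memory."""
    return len({(label, vid) for label, vid, _ in memory})


# Toy data: 2 classes, 20 videos each, 30 frames per video.
toy = {c: {f"{c}_{i}": 30 for i in range(20)} for c in ("swim", "run")}

single = build_replay_memory(toy, budget_frames=16, frames_per_video=1)
multi = build_replay_memory(toy, budget_frames=16, frames_per_video=8)

print(len(single), unique_videos(single))  # 16 frames from 16 videos
print(len(multi), unique_videos(multi))    # 16 frames from 2 videos
```

Both memories hold 16 frames, but the single-frame variant represents eight times as many videos. The abstract reports that this kind of diversity matters more than temporal information when memory is extremely constrained; how SMILE actually selects frames and trains on them is not specified in the text above.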

Juan Leon Alcazar | Bernard Ghanem | Merey Ramazanova | Chen Zhao | Lama Alssum
