PLOP: Learning without Forgetting for Continual Semantic Segmentation

Deep learning approaches are nowadays ubiquitously used to tackle computer vision tasks such as semantic segmentation, requiring large datasets and substantial computational power. Continual learning for semantic segmentation (CSS) is an emerging trend that consists in updating an old model by sequentially adding new classes. However, continual learning methods are usually prone to catastrophic forgetting. This issue is further aggravated in CSS where, at each step, old classes from previous iterations are collapsed into the background. In this paper, we propose Local POD, a multi-scale pooling distillation scheme that preserves long- and short-range spatial relationships at feature level. Furthermore, we design an entropy-based pseudo-labelling of the background w.r.t. classes predicted by the old model to deal with background shift and avoid catastrophic forgetting of the old classes. Our approach, called PLOP, significantly outperforms state-of-the-art methods in existing CSS scenarios, as well as in newly proposed challenging benchmarks 1.

[1]  Michael Carbin,et al.  The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks , 2018, ICLR.

[2]  Geoffrey E. Hinton,et al.  Distilling the Knowledge in a Neural Network , 2015, ArXiv.

[3]  A. Yuille,et al.  Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation , 2020, ECCV.

[4]  George Papandreou,et al.  Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation , 2018, ECCV.

[5]  Razvan Pascanu,et al.  Overcoming catastrophic forgetting in neural networks , 2016, Proceedings of the National Academy of Sciences.

[6]  Xilin Chen,et al.  Object-Contextual Representations for Semantic Segmentation , 2019, ECCV.

[7]  Piyush Rai,et al.  Generalized Zero-Shot Learning via Synthesized Examples , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[8]  Anthony V. Robins,et al.  Catastrophic Forgetting, Rehearsal and Pseudorehearsal , 1995, Connect. Sci..

[9]  Larry S. Davis,et al.  M2KD: Multi-model and Multi-level Knowledge Distillation for Incremental Learning , 2019, ArXiv.

[10]  Davide Maltoni,et al.  CORe50: a New Dataset and Benchmark for Continuous Object Recognition , 2017, CoRL.

[11]  Sebastian Ramos,et al.  The Cityscapes Dataset for Semantic Urban Scene Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Pietro Zanuttigh,et al.  Incremental Learning Techniques for Semantic Segmentation , 2019, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).

[13]  Yoshua Bengio,et al.  FitNets: Hints for Thin Deep Nets , 2014, ICLR.

[14]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[15]  R. French Catastrophic forgetting in connectionist networks , 1999, Trends in Cognitive Sciences.

[16]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[17]  Sung Ju Hwang,et al.  Lifelong Learning with Dynamically Expandable Networks , 2017, ICLR.

[18]  George Papandreou,et al.  Rethinking Atrous Convolution for Semantic Image Segmentation , 2017, ArXiv.

[19]  Xiang Zhang,et al.  OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks , 2013, ICLR.

[20]  Derek Hoiem,et al.  Learning without Forgetting , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[21]  Surya Ganguli,et al.  Continual Learning Through Synaptic Intelligence , 2017, ICML.

[22]  Adrian Popescu,et al.  ScaIL: Classifier Weights Scaling for Class Incremental Learning , 2020, 2020 IEEE Winter Conference on Applications of Computer Vision (WACV).

[23]  Yunchao Wei,et al.  CCNet: Criss-Cross Attention for Semantic Segmentation , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[24]  Karan Sapra,et al.  Hierarchical Multi-Scale Attention for Semantic Segmentation , 2020, ArXiv.

[25]  Xiaogang Wang,et al.  Pyramid Scene Parsing Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Yang Zou,et al.  Domain Adaptation for Semantic Segmentation via Class-Balanced Self-Training , 2018, ArXiv.

[27]  Ronald Kemker,et al.  FearNet: Brain-Inspired Model for Incremental Learning , 2017, ICLR.

[28]  Nikos Komodakis,et al.  Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer , 2016, ICLR.

[29]  Matthieu Cord,et al.  ESL: Entropy-guided Self-supervised Learning for Domain Adaptation in Semantic Segmentation , 2020, ArXiv.

[30]  Kyunghyun Cho,et al.  Continual Learning via Neural Pruning , 2019, ArXiv.

[31]  Yandong Guo,et al.  Large Scale Incremental Learning , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Jun Fu,et al.  Dual Attention Network for Scene Segmentation , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Philipp Fürnstahl,et al.  Learn the new, keep the old: Extending pretrained models with new anatomy and images , 2018, MICCAI.

[34]  Chrisantha Fernando,et al.  PathNet: Evolution Channels Gradient Descent in Super Neural Networks , 2017, ArXiv.

[35]  Patrick Pérez,et al.  ADVENT: Adversarial Entropy Minimization for Domain Adaptation in Semantic Segmentation , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Orçun Göksel,et al.  Extending pretrained segmentation networks with additional anatomical structures , 2018, International Journal of Computer Assisted Radiology and Surgery.

[37]  Jiashi Feng,et al.  Strip Pooling: Rethinking Spatial Pooling for Scene Parsing , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[38]  Christoph H. Lampert,et al.  iCaRL: Incremental Classifier and Representation Learning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[39]  Sebastian Thrun,et al.  Lifelong Learning Algorithms , 1998, Learning to Learn.

[40]  Jingdong Wang,et al.  OCNet: Object Context Network for Scene Parsing , 2018, ArXiv.

[41]  Christopher Kanan,et al.  REMIND Your Neural Network to Prevent Catastrophic Forgetting , 2020, ECCV.

[42]  Toshihiko Yamasaki,et al.  Zero-Shot Semantic Segmentation via Variational Mapping , 2019, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).

[43]  Dahua Lin,et al.  Learning a Unified Classifier Incrementally via Rebalancing , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[44]  Matthieu Cord,et al.  PODNet: Pooled Outputs Distillation for Small-Tasks Incremental Learning , 2020, ECCV.

[45]  Yong Seok Heo,et al.  Knowledge Distillation for Semantic Segmentation Using Channel and Spatial Correlations and Adaptive Cross Entropy , 2020, Sensors.

[46]  Marc'Aurelio Ranzato,et al.  Gradient Episodic Memory for Continual Learning , 2017, NIPS.

[47]  Yi-Ming Chan,et al.  Compacting, Picking and Growing for Unforgetting Continual Learning , 2019, NeurIPS.

[48]  Joost van de Weijer,et al.  Bookworm continual learning: beyond zero-shot learning and continual learning , 2020, ArXiv.

[49]  Yi Zhang,et al.  PSANet: Point-wise Spatial Attention Network for Scene Parsing , 2018, ECCV.

[50]  Vittorio Ferrari,et al.  COCO-Stuff: Thing and Stuff Classes in Context , 2016, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[51]  Nuno Vasconcelos,et al.  Bidirectional Learning for Domain Adaptation of Semantic Segmentation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[52]  Rama Chellappa,et al.  Learning Without Memorizing , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[53]  Jian Sun,et al.  Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[54]  Seunghoon Hong,et al.  Learning Deconvolution Network for Semantic Segmentation , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[55]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[56]  Luc Van Gool,et al.  The Pascal Visual Object Classes Challenge: A Retrospective , 2014, International Journal of Computer Vision.

[57]  Samuel Rota Bulo,et al.  Modeling the Background for Incremental Learning in Semantic Segmentation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[58]  Philip H. S. Torr,et al.  Riemannian Walk for Incremental Learning: Understanding Forgetting and Intransigence , 2018, ECCV.

[59]  Dong-Hyun Lee,et al.  Pseudo-Label : The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks , 2013 .

[60]  Marc'Aurelio Ranzato,et al.  Efficient Lifelong Learning with A-GEM , 2018, ICLR.

[61]  Jonghyun Choi,et al.  Incremental Learning with Maximum Entropy Regularization: Rethinking Forgetting and Intransigence , 2019, ArXiv.

[62]  Linda G. Shapiro,et al.  ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation , 2018, ECCV.

[63]  Roberto Cipolla,et al.  SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[64]  Christoph H. Lampert,et al.  Learning to detect unseen object classes by between-class attribute transfer , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[65]  Adrian Popescu,et al.  IL2M: Class Incremental Learning With Dual Memory , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[66]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[67]  Marcus Rohrbach,et al.  Memory Aware Synapses: Learning what (not) to forget , 2017, ECCV.

[68]  Matthieu Cord,et al.  Zero-Shot Semantic Segmentation , 2019, NeurIPS.

[69]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[70]  Matthieu Cord,et al.  Insights from the Future for Continual Learning , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[71]  Trevor Darrell,et al.  Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[72]  Shutao Xia,et al.  Maintaining Discrimination and Fairness in Class Incremental Learning , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[73]  Bolei Zhou,et al.  Scene Parsing through ADE20K Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[74]  Bernt Schiele,et al.  Mnemonics Training: Multi-Class Incremental Learning Without Forgetting , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[75]  Cordelia Schmid,et al.  Memory-Efficient Incremental Learning Through Feature Adaptation , 2020, ECCV.

[76]  Cordelia Schmid,et al.  End-to-End Incremental Learning , 2018, ECCV.

[77]  Luca Antiga,et al.  Automatic differentiation in PyTorch , 2017 .

[78]  Ronald Kemker,et al.  Measuring Catastrophic Forgetting in Neural Networks , 2017, AAAI.

[79]  Xiaogang Wang,et al.  Context Encoding for Semantic Segmentation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[80]  Jiwon Kim,et al.  Continual Learning with Deep Generative Replay , 2017, NIPS.

[81]  Richard Socher,et al.  Learn to Grow: A Continual Structure Learning Framework for Overcoming Catastrophic Forgetting , 2019, ICML.

[82]  Iasonas Kokkinos,et al.  DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[83]  Marc'Aurelio Ranzato,et al.  On Tiny Episodic Memories in Continual Learning , 2019 .