IF2Net: Innately Forgetting-Free Networks for Continual Learning

Continual learning aims to absorb new concepts incrementally without interfering with previously learned knowledge. Motivated by the characteristic of neural networks that information is stored in the weights on connections, we investigate how to design an Innately Forgetting-Free Network (IF2Net) for the continual learning setting. This study proposes a straightforward yet effective learning paradigm that keeps the weights associated with each seen task untouched before and after a new task is learned. We first present representation-level learning on task sequences with random weights: the representations drifted by randomization are tweaked back to their separate task-optimal working states, while the weights involved remain frozen and are reused (in contrast to the well-known layer-wise weight updates). Then, sequential decision-making without forgetting is achieved by projecting the output-weight updates into a parsimonious orthogonal space, so that the adaptations do not disturb old knowledge while model plasticity is maintained. By integrating the respective strengths of randomization and orthogonalization, IF2Net allows a single network to inherently learn unlimited mapping rules without being told task identities at test time. We validate the effectiveness of our approach through extensive theoretical analysis and empirical study.
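For illustration, the following is a minimal NumPy sketch of the two ingredients named above: a feature mapping with frozen random weights, and output-weight updates projected onto the orthogonal complement of the feature subspace spanned by earlier tasks (here via a recursive-least-squares style projector). All names (`features`, `train_task`, `W_rand`, `P`) and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
D_IN, D_HID, D_OUT = 64, 256, 10

# Frozen random feature mapping: drawn once, never updated (randomization).
W_rand = rng.standard_normal((D_IN, D_HID)) / np.sqrt(D_IN)

def features(x):
    """Map inputs to hidden representations with the fixed random weights."""
    return np.tanh(x @ W_rand)

# Output weights are the only trainable parameters.
W_out = np.zeros((D_HID, D_OUT))

# Projector onto the orthogonal complement of feature directions used by earlier
# tasks; starts as the identity (no constraint yet) and shrinks as tasks accumulate.
P = np.eye(D_HID)

def train_task(X, Y, lr=0.1, epochs=50, alpha=1e-3):
    """Fit one task with output-weight updates projected away from old tasks."""
    global W_out, P
    for _ in range(epochs):
        H = features(X)
        grad = H.T @ (H @ W_out - Y) / len(X)   # least-squares gradient
        W_out -= lr * (P @ grad)                # projected (orthogonal) update
    # Fold this task's feature subspace into the projector
    # (recursive-least-squares style update).
    for h in features(X):
        h = h[:, None]
        k = P @ h / (alpha + h.T @ P @ h)
        P = P - k @ (h.T @ P)

# Illustrative usage on two synthetic tasks (random data, for shape checking only).
for _ in range(2):
    X = rng.standard_normal((200, D_IN))
    Y = np.eye(D_OUT)[rng.integers(0, D_OUT, size=200)]
    train_task(X, Y)
```

In this sketch, forgetting is avoided because each projected update lies (approximately) in the null space of old-task representations, so the old tasks' input-output mappings through `W_out` are left essentially unchanged.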
