Lifelong Generative Modelling Using Dynamic Expansion Graph Model

Variational Autoencoders (VAEs) suffer from degraded performance when learning several tasks in succession, a consequence of catastrophic forgetting. To address this knowledge loss, VAEs typically employ either Generative Replay (GR) mechanisms or Expanding Network Architectures (ENA). In this paper we study the forgetting behaviour of VAEs under a joint GR and ENA methodology, by deriving an upper bound on the negative marginal log-likelihood. This theoretical analysis provides new insights into how VAEs forget previously learnt knowledge during lifelong learning. The analysis indicates that the best performance is achieved by model mixtures under the ENA framework, when no restriction is placed on the number of components. However, an ENA-based approach may require an excessive number of parameters. This motivates us to propose a novel Dynamic Expansion Graph Model (DEGM). DEGM expands its architecture according to the novelty of each new database relative to the information the network has already learnt from previous tasks. DEGM training optimizes knowledge structuring, characterizing the joint probabilistic representations of past and more recently learned tasks. We demonstrate that DEGM guarantees optimal performance for each task while also minimizing the required number of parameters. Supplementary materials (SM) and source code are available.
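The novelty-driven expansion described above can be illustrated with a minimal sketch. Everything here is an illustrative assumption rather than the paper's actual criterion: `DynamicExpansionMixture`, the diagonal-Gaussian novelty score, and the fixed `threshold` are stand-ins for DEGM's components and its model-selection rule, used only to show the decision pattern (reuse existing knowledge when a new task is well explained, expand otherwise).

```python
import numpy as np

def gaussian_nll(x, mu, var):
    # Average negative log-likelihood of the rows of x under a diagonal Gaussian.
    return np.mean(0.5 * (np.log(2 * np.pi * var) + (x - mu) ** 2 / var))

class DynamicExpansionMixture:
    """Toy mixture that adds a new component only when incoming data is novel.

    Hypothetical stand-in for DEGM's expansion mechanism: real components
    would be VAEs, and novelty would come from a likelihood bound.
    """

    def __init__(self, threshold):
        self.threshold = threshold  # novelty level that triggers expansion
        self.components = []        # list of (mean, variance) per component

    def novelty(self, x):
        # Novelty of a batch = NLL under the best-fitting existing component.
        if not self.components:
            return np.inf
        return min(gaussian_nll(x, mu, var) for mu, var in self.components)

    def observe_task(self, x):
        # Expand only when no existing component explains the new task well;
        # otherwise reuse what was already learnt (no parameter growth).
        if self.novelty(x) > self.threshold:
            self.components.append((x.mean(axis=0), x.var(axis=0) + 1e-6))
            return True   # expanded
        return False      # reused existing knowledge

rng = np.random.default_rng(0)
model = DynamicExpansionMixture(threshold=2.0)
task_a = rng.normal(0.0, 1.0, size=(500, 4))
task_b = rng.normal(5.0, 1.0, size=(500, 4))
print(model.observe_task(task_a))  # first task: nothing learnt yet, expand
print(model.observe_task(task_a))  # similar data: reuse, no expansion
print(model.observe_task(task_b))  # shifted data: novel, expand again
```

The design point this sketch captures is the trade-off the abstract argues for: unrestricted mixtures give the best likelihood bound, but gating expansion on novelty keeps the parameter count proportional to the number of genuinely distinct tasks.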
