Indian Buffet Neural Networks for Continual Learning

We place an Indian Buffet Process (IBP) prior over the neural structure of a Bayesian Neural Network (BNN), allowing the complexity of the BNN to grow and shrink automatically. We apply this methodology to the problem of resource allocation in continual learning, where new tasks arrive and the network requires extra resources. Our BNN exploits online variational inference with reparameterisable relaxations of the Bernoulli and Beta distributions that constitute the IBP prior, so that the variational posteriors can be learned with gradient-based methods via the reparameterisation trick. Because the number of weights in the BNN is learned automatically, overfitting and underfitting problems are largely overcome. We show empirically that the method offers competitive results compared to Variational Continual Learning (VCL) in some settings.
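
Under the stick-breaking construction of the IBP, the probability that a unit is active is a cumulative product of Beta-distributed stick fractions, and the binary mask is drawn from Bernoullis with those probabilities; relaxing the Beta with a Kumaraswamy distribution and the Bernoulli with a Concrete (relaxed Bernoulli) distribution makes both reparameterisable. The following PyTorch sketch shows such a relaxed IBP mask gating one layer; the class name, initialisation, and hyperparameter values are illustrative assumptions, not the authors' implementation.

import math

import torch
import torch.nn as nn
from torch.distributions import Kumaraswamy, RelaxedBernoulli


class IBPMaskedLinear(nn.Module):
    # Hypothetical sketch: a linear layer whose output units are gated by a
    # relaxed stick-breaking IBP mask, so gradients flow through the structure.
    def __init__(self, in_features, out_features, alpha=5.0, temperature=0.5):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        # Variational parameters of the Kumaraswamy relaxation of the
        # Beta(alpha, 1) stick-breaking fractions v_k.
        self.log_a = nn.Parameter(torch.full((out_features,), math.log(alpha)))
        self.log_b = nn.Parameter(torch.zeros(out_features))
        self.temperature = temperature

    def forward(self, x):
        # Reparameterised sample of the stick fractions v_k.
        v = Kumaraswamy(self.log_a.exp(), self.log_b.exp()).rsample()
        # IBP activation probabilities pi_k = prod_{j <= k} v_j.
        pi = torch.cumprod(v, dim=-1)
        # Relaxed Bernoulli (Concrete) gates in place of hard binary masks.
        z = RelaxedBernoulli(torch.tensor(self.temperature), probs=pi).rsample()
        return self.linear(x) * z


layer = IBPMaskedLinear(784, 100)
out = layer(torch.randn(32, 784))  # masked activations, shape (32, 100)

Because the gates are sampled with rsample(), the mask parameters can be trained jointly with the layer weights by ordinary stochastic gradient descent, which is what allows the effective width of the network to adapt as new tasks arrive.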
