Ensemble Learning in Fixed Expansion Layer Networks

Catastrophic forgetting is a well-studied property of most parameterized supervised learning systems. A variant of this phenomenon arises in feedforward neural networks when nonstationary inputs cause previously learned mappings to be lost. Most schemes proposed in the literature for mitigating catastrophic forgetting are not data driven and do not scale well. We introduce the fixed expansion layer (FEL) feedforward neural network, which embeds a sparse-coding hidden layer that helps mitigate the forgetting of previously learned representations. In addition, we investigate a novel framework for training ensembles of FEL networks that exploits an information-theoretic measure of diversity between FEL learners to further control undesired plasticity. The proposed methodology is demonstrated on a basic classification task and clearly shows its advantages over existing techniques. The proposed architecture can be extended to a range of computational intelligence tasks, such as regression and system control.
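The abstract names its two ingredients, a sparse-coding expansion layer and an information-theoretic diversity measure, without defining them, so a brief sketch may help fix ideas. Everything below is an illustrative assumption rather than the authors' algorithm: the sparse encoding is approximated by a k-winners-take-all activation over a frozen random expansion layer, and the diversity measure is taken to be the Jensen-Shannon divergence between learners' class-probability outputs. The names SparseLayerNet, k_winners_take_all, and ensemble_diversity are hypothetical.

```python
# Minimal sketch of the abstract's two ideas, under assumed stand-ins:
# k-winners-take-all as the sparse code, Jensen-Shannon divergence as the
# diversity measure. Not the paper's FEL algorithm.
import numpy as np

rng = np.random.default_rng(0)

def k_winners_take_all(h, k):
    """Keep the k largest activations and zero the rest (sparse code)."""
    out = np.zeros_like(h)
    idx = np.argsort(h)[-k:]
    out[idx] = h[idx]
    return out

class SparseLayerNet:
    """Feedforward net with a fixed, sparsely activated expansion layer.

    Hypothetical stand-in for a FEL network: the expansion weights W1 are
    frozen after random initialization; only the output weights W2 train,
    so each input touches only a small subset of hidden units.
    """
    def __init__(self, n_in, n_hidden, n_out, k):
        self.W1 = rng.standard_normal((n_hidden, n_in)) / np.sqrt(n_in)
        self.W2 = rng.standard_normal((n_out, n_hidden)) * 0.01
        self.k = k

    def hidden(self, x):
        return k_winners_take_all(np.tanh(self.W1 @ x), self.k)

    def forward(self, x):
        z = self.W2 @ self.hidden(x)
        e = np.exp(z - z.max())
        return e / e.sum()  # softmax class probabilities

    def train_step(self, x, y_onehot, lr=0.1):
        h = self.hidden(x)
        p = self.forward(x)
        self.W2 += lr * np.outer(y_onehot - p, h)  # update output layer only

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence: symmetric, bounded measure of disagreement."""
    p, q = p + eps, q + eps
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log(a / b))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def ensemble_diversity(nets, xs):
    """Mean pairwise JS divergence of the learners' outputs over a batch."""
    scores = [js_divergence(a.forward(x), b.forward(x))
              for x in xs
              for i, a in enumerate(nets)
              for b in nets[i + 1:]]
    return float(np.mean(scores))

# Usage: score how differently three learners classify a batch of inputs.
nets = [SparseLayerNet(n_in=16, n_hidden=256, n_out=3, k=12) for _ in range(3)]
xs = [rng.standard_normal(16) for _ in range(8)]
print(ensemble_diversity(nets, xs))
```

The mean pairwise Jensen-Shannon score collapses ensemble disagreement into a single number that a training procedure could reward to keep learners diverse; the actual FEL architecture and diversity criterion are specified in the paper itself.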
