Resource-efficient incremental learning in very high dimensions

We propose a three-layer neural architecture for incremental multi-class learning that remains resource-efficient even when the number of input dimensions is very high (≥ 1000). This projection-prediction (PROPRE) architecture is strongly inspired by biological information processing in that it uses a prototype-based, topologically organized hidden layer trained with the self-organizing map (SOM) learning rule, controlled by a global, task-related error signal. Furthermore, SOM learning adapts only the weights of localized neural sub-populations that are similar to the input, which explicitly avoids the catastrophic forgetting effect observed in MLPs when new input statistics are presented to the architecture. As the readout layer uses simple linear regression, the approach essentially applies locally linear models to "receptive fields" (RFs) defined by the SOM prototypes, while RF shapes are implicitly defined by adjacent prototypes (which avoids storing covariance matrices, a cost that becomes prohibitive for high input dimensionality). Both RF centers and shapes are jointly adapted w.r.t. the input statistics and the classification task. Tests on the MNIST dataset show that the algorithm compares favorably to the state-of-the-art LWPR algorithm at vastly reduced resource requirements.
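To make the mechanism described above concrete, the following is a minimal numpy sketch of a PROPRE-style update loop: a SOM hidden layer whose prototypes move only in a localized neighborhood of the best-matching unit, gated by the global task error, plus a linear-regression readout. The class name `PropreSketch`, the distance-based activation, the gating rule, and all hyperparameter values (`eps_som`, `sigma`, `eps_lin`, `gate_threshold`) are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

class PropreSketch:
    """Hypothetical sketch of a PROPRE-like learner: SOM hidden layer + linear readout."""

    def __init__(self, input_dim, n_classes, grid=10, seed=0,
                 eps_som=0.1, sigma=1.5, eps_lin=0.05, gate_threshold=0.5):
        rng = np.random.default_rng(seed)
        n = grid * grid
        self.W = rng.normal(0.0, 0.1, (n, input_dim))   # SOM prototypes (RF centers)
        self.R = np.zeros((n_classes, n))               # linear-regression readout
        # 2-D grid coordinates of each map unit, used for topological neighborhoods
        self.coords = np.array([(i // grid, i % grid) for i in range(n)], dtype=float)
        self.eps_som, self.sigma = eps_som, sigma
        self.eps_lin, self.gate_threshold = eps_lin, gate_threshold

    def _activation(self, x):
        # Implicit RFs: each unit fires according to distance to its prototype;
        # normalization keeps the activity profile localized around similar prototypes.
        d2 = ((self.W - x) ** 2).sum(axis=1)
        a = np.exp(-d2 / (2.0 * d2.mean() + 1e-12))
        return a / a.sum(), int(d2.argmin())

    def predict(self, x):
        a, _ = self._activation(x)
        return int((self.R @ a).argmax())

    def train_step(self, x, label):
        a, bmu = self._activation(x)
        err = -self.R @ a
        err[label] += 1.0                    # one-hot target minus prediction
        # Readout: online linear regression (delta rule).
        self.R += self.eps_lin * np.outer(err, a)
        # Global, task-related gate: adapt prototypes only on large task error.
        if np.abs(err).max() > self.gate_threshold:
            # SOM rule restricted to a localized sub-population around the
            # best-matching unit, limiting interference with old knowledge.
            g2 = ((self.coords - self.coords[bmu]) ** 2).sum(axis=1)
            h = np.exp(-g2 / (2.0 * self.sigma ** 2))
            self.W += self.eps_som * h[:, None] * (x - self.W)

# Toy usage on random data shaped like flattened MNIST digits.
model = PropreSketch(input_dim=784, n_classes=10)
rng = np.random.default_rng(1)
for _ in range(500):
    x = rng.random(784)
    model.train_step(x, int(rng.integers(10)))
print(model.predict(rng.random(784)))
```

Because only units near the best-matching prototype move, inputs with novel statistics mostly recruit and reshape a local patch of the map, which is the mechanism the abstract credits for avoiding catastrophic forgetting.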

[1] Ian J. Goodfellow, et al. An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks, 2013, ICLR.

[2] M. Giese, et al. Norm-based face encoding by single neurons in the monkey inferotemporal cortex, 2006, Nature.

[3] Keiji Tanaka, et al. Inferotemporal cortex and object vision, 1996, Annual Review of Neuroscience.

[4] Norman M. Weinberger, et al. The nucleus basalis and memory codes: Auditory cortical plasticity and the induction of specific, associative behavioral memory, 2003, Neurobiology of Learning and Memory.

[5] Sethu Vijayakumar, et al. Incremental Online Learning in High Dimensions, 2005, Neural Computation.

[6] Michael M. Merzenich, et al. Perceptual Learning Directs Auditory Cortical Map Reorganization through Top-Down Influences, 2006, The Journal of Neuroscience.

[7] T. Palmeri, et al. Not just the norm: Exemplar-based models also predict face aftereffects, 2014, Psychonomic Bulletin & Review.

[8] R. Desimone, et al. Clustering of perirhinal neurons with similar properties following visual experience in adult monkeys, 2000, Nature Neuroscience.

[9] M. Hasselmo, et al. The effect of learning on the face selective responses of neurons in the cortex in the superior temporal sulcus of the monkey, 2004, Experimental Brain Research.

[10] Stefan Schaal, et al. A Library for Locally Weighted Projection Regression, 2008, J. Mach. Learn. Res.

[11] M. Hasselmo. The role of acetylcholine in learning and memory, 2006, Current Opinion in Neurobiology.

[12] Teuvo Kohonen. Self-organized formation of topologically correct feature maps, 1982, Biological Cybernetics.

[13] Yann LeCun, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.

[14] Olivier Sigaud, et al. On-line regression algorithms for learning mechanical models of robots: A survey, 2011, Robotics Auton. Syst.

[15] A. Gepperth. Efficient online bootstrapping of representations, 2012.