Exploiting Multiple Timescales in Hierarchical Echo State Networks

Echo state networks (ESNs) are a powerful form of reservoir computing that only require training of linear output weights, whilst the internal reservoir is formed of fixed, randomly connected neurons. With a correctly scaled connectivity matrix, the neurons' activity exhibits the echo-state property and responds to the input dynamics on characteristic timescales. Tuning the network's timescales can be necessary for certain tasks, and some environments require multiple timescales for an efficient representation. Here we explore the timescales in hierarchical ESNs, where the reservoir is partitioned into two smaller linked reservoirs with distinct properties. Over three different tasks (NARMA10, a reconstruction task in a volatile environment, and psMNIST), we show that by selecting the hyperparameters of each partition such that they focus on different timescales, we achieve a significant performance improvement over a single ESN. Through a linear analysis, and under the assumption that the timescales of the first partition are much shorter than those of the second (typically corresponding to optimal operating conditions), we interpret the feedforward coupling of the partitions in terms of an effective representation of the input signal, provided by the first partition to the second, whereby the instantaneous input signal is expanded into a weighted combination of its time derivatives. Furthermore, we propose a data-driven approach to optimise the hyperparameters through a gradient descent optimisation method that is an online approximation of backpropagation through time. We demonstrate the application of the online learning rule across all the tasks considered.
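To make the architecture concrete, the following is a minimal Python/NumPy sketch of a two-partition hierarchical ESN with leaky-integrator neurons and a ridge-regression readout. The leak rates a1 (fast) and a2 (slow) set the distinct timescales of the two partitions, and the first partition's state is fed forward as input to the second; all sizes, values, and the toy task are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_reservoir(n, n_in, rho, input_scale):
    """Random recurrent weights rescaled to spectral radius rho, plus input weights."""
    W = rng.standard_normal((n, n))
    W *= rho / max(abs(np.linalg.eigvals(W)))
    W_in = input_scale * rng.standard_normal((n, n_in))
    return W, W_in

def run_hierarchical_esn(u, n1=100, n2=100, a1=0.9, a2=0.1,
                         rho1=0.9, rho2=0.9, scale=0.5):
    """Drive two leaky-integrator reservoirs in series:
    the fast partition (large leak a1) feeds the slow partition (small leak a2)."""
    T = len(u)
    W1, Win1 = init_reservoir(n1, 1, rho1, scale)
    W2, Win2 = init_reservoir(n2, n1, rho2, scale)
    x1 = np.zeros(n1)
    x2 = np.zeros(n2)
    states = np.zeros((T, n1 + n2))
    for t in range(T):
        # Leaky-integrator updates; only the readout below is trained.
        x1 = (1 - a1) * x1 + a1 * np.tanh(W1 @ x1 + Win1 @ np.atleast_1d(u[t]))
        x2 = (1 - a2) * x2 + a2 * np.tanh(W2 @ x2 + Win2 @ x1)
        states[t] = np.concatenate([x1, x2])
    return states

def train_readout(states, targets, reg=1e-6):
    """Ridge regression on the concatenated states (the only trained weights)."""
    S = states
    return np.linalg.solve(S.T @ S + reg * np.eye(S.shape[1]), S.T @ targets)

# Toy usage: reconstruct a delayed copy of a random input sequence.
u = rng.standard_normal(1000)
y = np.roll(u, 5)                         # arbitrary delayed target, for illustration
X = run_hierarchical_esn(u)
W_out = train_readout(X[100:], y[100:])   # discard a short washout period
```

The only training step is the linear readout fit, consistent with standard reservoir-computing practice; the hierarchical structure enters solely through the feedforward coupling of the two differently tuned partitions.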
