Causality traces for retrospective learning in neural networks — Introduction of parallel and subjective time scales

We live in the flow of time, and the sensor signals we receive are not only vast in their spatial extent but also arrive continuously, without a break in time. This paper introduces the "causality trace" as a general method for effective retrospective learning in neural networks (NNs) in such a world, based on the concept of "subjective time". A trace is assigned to each connection of each neuron. The trace takes in the corresponding input signal according to the temporal change in the neuron's output, and is held unchanged while the output does not change. This enables the network to memorize only important past events, to hold them in local memory, and to learn past processes effectively from present reinforcement or training signals without tracing back through time. Since the past events represented by the traces differ from neuron to neuron, learning promotes an autonomous division of roles among neurons along the time axis. From the viewpoint of time passage, learning in the NN proceeds on parallel, non-uniform, and subjective time scales. Causality traces can be applied to value learning with a NN and also, with some modification, to supervised learning of recurrent neural networks. A new simulation result on a value-learning task shows the outstanding learning ability of causality traces and the autonomous division of roles along the time axis that emerges among neurons through learning. Finally, several useful properties and remaining concerns are discussed.
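The core mechanism described above can be sketched in a few lines. The sketch below is an illustrative assumption, not the paper's exact update rule: it assumes each trace moves toward its current input by a fraction equal to the magnitude of the neuron's output change, so that a flat output leaves the trace untouched.

```python
import numpy as np

def update_traces(traces, inputs, d_output):
    # Move each trace toward its corresponding input in proportion
    # to the magnitude of the neuron's output change; when the
    # output does not change, the trace is held as-is.
    g = min(abs(d_output), 1.0)   # gating factor in [0, 1]
    return traces + g * (inputs - traces)

# Demonstration: traces hold a past event across idle steps.
traces = np.zeros(3)
prev_out = 0.0

# Step 1: the output changes, so the traces absorb the current input.
x1 = np.array([1.0, 0.5, -0.2])
out = 0.8
traces = update_traces(traces, x1, out - prev_out)
prev_out = out

# Steps 2-4: the output stays flat, so the traces are held unchanged
# even though new inputs keep arriving.
for _ in range(3):
    x = np.random.randn(3)
    out = 0.8
    traces = update_traces(traces, x, out - prev_out)
    prev_out = out

print(traces)  # still reflects x1 (scaled by the first gate, 0.8)
```

Because the held traces localize the relevant past inputs at each connection, a present reinforcement or training signal can update the weights through them directly, without replaying the past sequence.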
