Two forms of immediate reward reinforcement learning for exploratory data analysis

We review two forms of immediate reward reinforcement learning: in the first, the learner is a stochastic node, while in the second the individual unit is deterministic but has stochastic synapses. We illustrate the first method on the problem of Independent Component Analysis. Four learning rules have been developed from the second perspective, and we investigate their use in performing linear projection techniques such as principal component analysis, exploratory projection pursuit and canonical correlation analysis. The method is very general, requiring only a reward function specific to the task we wish the unit to perform. We also discuss how the method can be used to learn kernel mappings, and conclude by illustrating its use on a topology preserving mapping.
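To make the second approach concrete, the following is a minimal sketch (not the authors' exact rules) of immediate reward reinforcement learning with stochastic synapses applied to principal component analysis: each synapse is drawn from a Gaussian centred on a learned mean, the reward is the variance of the data projected onto the sampled weight vector, and the means are updated with Williams' REINFORCE estimator. All names, step sizes and the unit-length renormalisation of the mean vector are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 2-D data whose leading principal component lies along [1, 1]/sqrt(2)
scales = np.array([[3.0, 0.0], [0.0, 0.5]])
rot = np.array([[np.cos(np.pi / 4), -np.sin(np.pi / 4)],
                [np.sin(np.pi / 4),  np.cos(np.pi / 4)]])
X = rng.normal(size=(4000, 2)) @ scales @ rot.T

mu = np.array([1.0, 0.0])   # means of the stochastic synapses (kept unit length)
sigma = 0.1                 # fixed exploration noise on each synapse
lr = 0.001                  # learning rate (assumed value)
baseline = 0.0              # running reward baseline to reduce variance

for t in range(2000):
    batch = X[rng.integers(0, len(X), size=50)]
    w = mu + sigma * rng.normal(size=2)           # sample the synapses
    w_hat = w / np.linalg.norm(w)
    r = np.mean((batch @ w_hat) ** 2)             # reward: projected variance
    baseline = 0.99 * baseline + 0.01 * r
    mu += lr * (r - baseline) * (w - mu) / sigma**2   # REINFORCE update
    mu /= np.linalg.norm(mu)                      # renormalise the mean vector

pc1 = np.array([1.0, 1.0]) / np.sqrt(2)
print(abs(mu @ pc1))   # approaches 1 as mu aligns with the first PC
```

Changing only the reward function (e.g. a measure of kurtosis for exploratory projection pursuit, or the correlation between two projected data streams for canonical correlation analysis) repurposes the same update rule, which is the generality the abstract refers to.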
