Unsupervised Modeling of Partially Observable Environments

We present an architecture based on self-organizing maps for learning the sensory layer of a learning system. The architecture, temporal network for transitions (TNT), retains the freedoms of unsupervised learning, works on-line in non-episodic environments, is computationally light, and scales well. TNT builds a predictive model over its internal representation of the world, which makes planning methods available for both exploiting and exploring the environment. Experiments demonstrate that TNT learns good representations of classical reinforcement learning mazes of varying size (up to 20 × 20) under high noise and stochastic actions.
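
To make the two ingredients named in the abstract concrete, the following is a minimal sketch of a generic on-line self-organizing map paired with a count-based transition model over its nodes. It is not the TNT algorithm: the class names `TinySOM` and `TransitionModel`, the 1-D map topology, and the parameters `lr` and `sigma` are illustrative assumptions that do not come from the paper.

```python
import math
import random


class TinySOM:
    """Generic 1-D self-organizing map with on-line updates (hypothetical, not TNT)."""

    def __init__(self, n_nodes, dim, lr=0.2, sigma=1.0, seed=0):
        rng = random.Random(seed)
        # One weight vector per node, initialised uniformly in [0, 1).
        self.weights = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
        self.lr = lr        # assumed fixed learning rate
        self.sigma = sigma  # assumed fixed neighbourhood width

    def winner(self, x):
        # Best-matching unit: the node whose weight vector is closest to x.
        dists = [sum((w - xi) ** 2 for w, xi in zip(ws, x)) for ws in self.weights]
        return dists.index(min(dists))

    def update(self, x):
        # Move the winner and its map neighbours toward x; return the winner index.
        bmu = self.winner(x)
        for i, ws in enumerate(self.weights):
            h = math.exp(-((i - bmu) ** 2) / (2 * self.sigma ** 2))
            for d in range(len(ws)):
                ws[d] += self.lr * h * (x[d] - ws[d])
        return bmu


class TransitionModel:
    """Counts (node, action) -> next-node transitions to estimate probabilities."""

    def __init__(self):
        self.counts = {}

    def observe(self, node, action, next_node):
        row = self.counts.setdefault((node, action), {})
        row[next_node] = row.get(next_node, 0) + 1

    def predict(self, node, action):
        # Empirical distribution over next nodes; empty if the pair was never seen.
        row = self.counts.get((node, action), {})
        total = sum(row.values())
        return {n: c / total for n, c in row.items()} if total else {}
```

In such a setup, each noisy observation would be passed to `update`, and the winning node before and after an action would be recorded with `observe`. The resulting table is the kind of predictive model a planner could query, under the assumptions stated above.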
