Optimal Curiosity-Driven Modular Incremental Slow Feature Analysis

Consider a self-motivated artificial agent that is exploring a complex environment. Part of the complexity stems from the raw high-dimensional sensory input streams, which the agent needs to make sense of. Such inputs can be compactly encoded in a variety of ways; one of these is slow feature analysis (SFA). Slow features encode spatiotemporal regularities, which are information-rich explanatory factors (latent variables) underlying the high-dimensional input streams. In our previous work, we showed how slow features can be learned incrementally, while the agent explores its world, and modularly, such that different sets of features are learned for different parts of the environment (since a single set of regularities does not explain everything). In what order should the agent explore the different parts of the environment? Following Schmidhuber’s theory of artificial curiosity, the agent should always concentrate on the area where it can learn the easiest-to-learn set of features that it has not already learned. We formalize this learning problem and show theoretically that, under certain mild conditions, an agent using our model, called curiosity-driven modular incremental slow feature analysis, will on average learn slow feature representations in order of increasing learning difficulty. We provide experimental results to support the theoretical analysis.
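
To make concrete the idea that slow features are latent factors recovered from a mixed input stream, the following is a minimal sketch of standard batch linear SFA in NumPy. It is not the incremental or modular method described above; the function names (`linear_sfa`, `demo`), the toy signals, and the mixing matrix are assumptions chosen only for illustration.

```python
import numpy as np


def linear_sfa(x, n_features):
    """Batch linear SFA (illustrative sketch, not the incremental variant).

    x: array of shape (T, d), one observation per time step.
    Returns the mean, a projection matrix w of shape (d, n_features),
    and the slowness values (variance of the temporal difference).
    """
    mean = x.mean(axis=0)
    xc = x - mean

    # Step 1: whiten the input so that every direction has unit variance.
    cov = xc.T @ xc / len(xc)
    eigval, eigvec = np.linalg.eigh(cov)
    keep = eigval > 1e-10  # drop numerically degenerate directions
    whitener = eigvec[:, keep] / np.sqrt(eigval[keep])
    z = xc @ whitener

    # Step 2: find the directions in which the whitened signal changes least.
    dz = np.diff(z, axis=0)
    dcov = dz.T @ dz / len(dz)
    dval, dvec = np.linalg.eigh(dcov)  # eigenvalues in ascending order

    w = whitener @ dvec[:, :n_features]
    return mean, w, dval[:n_features]


def demo():
    # Hypothetical toy stream: a slow and a fast sinusoid, linearly mixed.
    t = np.linspace(0.0, 2.0 * np.pi, 2000)
    slow = np.sin(t)
    fast = np.sin(25.0 * t)
    x = np.column_stack([slow + 0.5 * fast, 0.3 * slow - fast])

    mean, w, _ = linear_sfa(x, n_features=1)
    y = (x - mean) @ w[:, 0]  # slowest output signal

    # The sign of an SFA output is arbitrary, so compare absolute correlation.
    corr = abs(np.corrcoef(y, slow)[0, 1])
    return y, corr


if __name__ == "__main__":
    _, corr = demo()
    print("abs. correlation with the slow source:", corr)
```

In this toy setting the slowest output should match the slow source up to sign and scale. An incremental variant would replace the two batch eigendecompositions with online updates.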
