Continually adding self-invented problems to the repertoire: First experiments with POWERPLAY

Pure scientists do not only invent new methods to solve given problems; they also invent new problems. The recent POWERPLAY framework formalizes this type of curiosity and creativity in a new, general, yet practical way. To acquire problem-solving prowess through play, POWERPLAY-based artificial explorers by design continually come up with the fastest-to-find, initially novel, but eventually solvable problems. They also continually simplify or speed up solutions to previous problems. We report results of first experiments with POWERPLAY. A self-delimiting recurrent neural network (SLIM RNN) serves as a general computational architecture implementing the system's solver. Its weights can encode arbitrary, self-delimiting, halting or non-halting programs affecting both the environment (through effectors) and internal states encoding abstractions of event sequences. In open-ended fashion, our POWERPLAY-driven RNNs learn to become increasingly general problem solvers, continually adding new problem-solving procedures to their growing repertoire and exhibiting interesting developmental stages.
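
The loop described above (find a problem the current solver cannot solve, together with a solver modification that solves it without breaking anything already in the repertoire) can be illustrated with a toy sketch. This is not the SLIM RNN system from the experiments: the two-parameter modular "solver", the task encoding as (input, required output) pairs, and all function names are assumptions made up for illustration. Enumeration order stands in for "fastest to find", and the step that simplifies or speeds up earlier solutions is omitted.

```python
from itertools import product

MOD = 5  # toy output alphabet {0, ..., 4}; an assumption for this sketch


def run(solver, x):
    """Toy solver: parameters (a, b) compute (a*x + b) mod MOD."""
    a, b = solver
    return (a * x + b) % MOD


def solves(solver, task):
    """A task (x, y) counts as solved if the solver outputs y on input x."""
    x, y = task
    return run(solver, x) == y


def candidate_modifications(solver):
    """Hypothetical search space: change one parameter by +1 or -1 (mod MOD)."""
    a, b = solver
    for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        yield ((a + da) % MOD, (b + db) % MOD)


def powerplay_step(solver, repertoire):
    """Return the first (task, new_solver) pair in enumeration order such that
    the task is unsolved by the current solver, solved by the modified solver,
    and the modified solver still solves every task in the repertoire."""
    for task in product(range(MOD), range(MOD)):
        if solves(solver, task):
            continue  # not novel: the current solver already handles it
        for new_solver in candidate_modifications(solver):
            if solves(new_solver, task) and all(
                solves(new_solver, old) for old in repertoire
            ):
                return task, new_solver
    return None  # no acceptable pair exists in this tiny search space


def powerplay(solver=(0, 0), max_steps=10):
    """Repeatedly extend the repertoire until no acceptable pair is found."""
    repertoire = []
    for _ in range(max_steps):
        found = powerplay_step(solver, repertoire)
        if found is None:
            break
        task, solver = found
        repertoire.append(task)
    return solver, repertoire


solver, repertoire = powerplay()
print(solver, repertoire)
```

In this toy setting the loop stops after two self-invented tasks, because two (input, output) pairs already pin down both parameters; the open-ended behavior reported in the abstract depends on the far richer search space of a general solver.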
