Paper Information - Information Driven Self-Organization of Complex Robotic Behaviors

Information Driven Self-Organization of Complex Robotic Behaviors

Information theory is a powerful tool to express principles to drive autonomous systems because it is domain invariant and allows for an intuitive interpretation. This paper studies the use of the predictive information (PI), also called excess entropy or effective measure complexity, of the sensorimotor process as a driving force to generate behavior. We study nonlinear and nonstationary systems and introduce the time-local predicting information (TiPI), which allows us to derive exact results together with explicit update rules for the parameters of the controller in the dynamical systems framework. In this way the information principle, formulated at the level of behavior, is translated to the dynamics of the synapses. We underpin our results with a number of case studies with high-dimensional robotic systems. We show spontaneous cooperativity in a complex physical system with decentralized control. Moreover, a jointly controlled humanoid robot develops a high behavioral variety depending on its physics and the environment it is dynamically embedded into. The behavior can be decomposed into a succession of low-dimensional modes that increasingly explore the behavior space. This is a promising way to avoid the curse of dimensionality, which prevents learning systems from scaling well.
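As a rough illustration of the quantity named in the abstract, predictive information is the mutual information between the past and the future of a process. The sketch below is not the paper's TiPI or its update rules; it only estimates the one-step predictive information of a toy Gaussian AR(1) process, for which the Markov property makes the past-future mutual information equal to I(x_t; x_{t+1}) = -0.5 ln(1 - a^2). The function names `simulate_ar1` and `gaussian_one_step_pi`, the coefficient `a = 0.8`, and the sample size are assumptions chosen for the example.

```python
import numpy as np


def simulate_ar1(a, n, seed=0):
    """Simulate x[t] = a * x[t-1] + noise[t] with unit-variance Gaussian noise."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n)
    x = np.empty(n)
    # Draw the first sample from the stationary distribution (variance 1 / (1 - a^2)).
    x[0] = noise[0] / np.sqrt(1.0 - a**2)
    for t in range(1, n):
        x[t] = a * x[t - 1] + noise[t]
    return x


def gaussian_one_step_pi(x):
    """Estimate I(x_t; x_{t+1}) in nats, assuming the pair is jointly Gaussian."""
    x = np.asarray(x, dtype=float)
    # Lag-1 correlation coefficient between consecutive samples.
    rho = np.corrcoef(x[:-1], x[1:])[0, 1]
    # Mutual information of two jointly Gaussian scalars with correlation rho.
    return -0.5 * np.log(1.0 - rho**2)


a = 0.8  # assumed AR(1) coefficient for the toy process
x = simulate_ar1(a, 200_000)
pi_estimate = gaussian_one_step_pi(x)
pi_exact = -0.5 * np.log(1.0 - a**2)
print(f"estimated PI: {pi_estimate:.4f} nats, analytic PI: {pi_exact:.4f} nats")
```

The Gaussian assumption keeps the estimator to a single correlation coefficient; a nonlinear or nonstationary sensorimotor process, as studied in the paper, would need a different estimator.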

[1] S. Glickman et al. Curiosity in zoo animals, 1966, Behaviour.

[2] D. Berlyne. Curiosity and exploration, 1966, Science.

[3] H. Risken. Fokker-Planck Equation, 1984.

[4] S. Starkie. Free will, 1985, Nature.

[5] Editors, 1986, Brain Research Bulletin.

[6] P. Grassberger. Toward a quantitative theory of self-generated complexity, 1986.

[7] J. Magnus et al. Matrix Differential Calculus with Applications in Statistics and Econometrics (Revised Edition), 1999.

[8] Young et al. Inferring statistical complexity, 1989, Physical Review Letters.

[9] Jürgen Schmidhuber et al. A possibility for implementing curiosity and boredom in model-building neural controllers, 1991.

[10] Thomas M. Cover et al. Elements of Information Theory, 2005.

[11] Jürgen Schmidhuber et al. Curious model-building control systems, 1991, [Proceedings] 1991 IEEE International Joint Conference on Neural Networks.

[12] Karl J. Friston. Functional and effective connectivity in neuroimaging: A synthesis, 1994.

[13] R. Mishra et al. Self-Organization, 2021, Encyclopedic Dictionary of Archaeology.

[14] Terrence J. Sejnowski et al. An Information-Maximization Approach to Blind Separation and Blind Deconvolution, 1995, Neural Computation.

[15] S. Hochreiter et al. Reinforcement Driven Information Acquisition in Nondeterministic Environments, 1995.

[16] H. Markram et al. Regulation of Synaptic Efficacy by Coincidence of Postsynaptic APs and EPSPs, 1997, Science.

[17] A. Steele. Predictability, 1997, The British Journal of Ophthalmology.

[18] Niraj S. Desai et al. Activity-dependent scaling of quantal amplitude in neocortical neurons, 1998, Nature.

[19] M. Bekoff et al. Animal play: evolutionary, comparative, and ecological perspectives, 1998.

[20] Ralf Der et al. Self-organized acquisition of situated behaviors, 2001, Theory in Biosciences.

[21] Olaf Sporns et al. Classes of network connectivity and dynamics, 2001, Complex.

[22] Nikolaus Hansen et al. Completely Derandomized Self-Adaptation in Evolution Strategies, 2001, Evolutionary Computation.

[23] A. U.S. et al. Predictability, Complexity, and Learning, 2002.

[24] R. Der et al. True autonomy from self-organized adaptivity, 2002.

[25] Andrew G. Barto et al. Optimal learning: computational procedures for bayes-adaptive markov decision processes, 2002.

[26] Luc Steels et al. The Autotelic Principle, 2003, Embodied Artificial Intelligence.

[27] Pierre-Yves Oudeyer et al. Maximizing Learning Progress: An Internal Reward System for Development, 2003, Embodied Artificial Intelligence.

[28] Nuttapong Chentanez et al. Intrinsically Motivated Learning of Hierarchical Collections of Skills, 2004.

[29] Jochen Triesch et al. A Gradient Rule for the Plasticity of a Neuron's Intrinsic Excitability, 2005, ICANN.

[30] Ralf Der et al. Learning to feel the physics of a body, 2005, International Conference on Computational Intelligence for Modelling, Control and Automation and International Conference on Intelligent Agents, Web Technologies and Internet Commerce (CIMCA-IAWTIC'06).

[31] Richard S. Sutton et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.

[32] Erkki Oja et al. Artificial Neural Networks: Biological Inspirations - ICANN 2005, 15th International Conference, Warsaw, Poland, September 11-15, 2005, Proceedings, Part I, 2005, ICANN.

[33] Rolf Pfeifer et al. How the body shapes the way we think - a new view on intelligence, 2006.

[34] Ralf Der et al. Let it roll - Emerging Sensorimotor Coordination in a Spherical Robot, 2006.

[35] Ralf Der et al. From Motor Babbling to Purposive Actions: Emerging Self-exploration in a Dynamical Systems Approach to Early Robot Development, 2006, SAB.

[36] Ralf Der et al. Rocking Stamper and Jumping Snakes from a Dynamical Systems Approach to Artificial Life, 2006, Adapt. Behav.

[37] Gordon Pipa et al. The combination of STDP and intrinsic plasticity yields complex dynamics in recurrent spiking networks, 2006, ESANN.

[38] Jochen Triesch et al. Exploring the role of intrinsic plasticity for the learning of sensory representations, 2006, ESANN.

[39] T. Bugnyar et al. Novel object exploration in ravens (Corvus corax): Effects of social relationships, 2006, Behavioural Processes.

[40] Olaf Sporns et al. Mapping Information Flow in Sensorimotor Networks, 2006, PLoS Comput. Biol.

[41] M. Prokopenko et al. Evolving Spatiotemporal Coordination in a Modular Robotic System, 2006, SAB.

[42] B. Brembs et al. Order in Spontaneous Behavior, 2007, PLoS ONE.