Biomimetic Approach to Tacit Learning Based on Compound Control

The remarkable ability of living organisms to adapt to unknown environments stems from learning mechanisms that differ fundamentally from the current artificial machine-learning paradigm. Computational media composed of identical elements with simple activity rules play a major role in biological control; examples include the activity of neurons in the brain and molecular interactions in intracellular control. As the individual activities of the computational media are integrated, new behavioral patterns emerge that adapt to changing environments. We previously implemented this feature of biological control as a form of machine learning and succeeded in realizing bipedal walking without a robot model or trajectory planning. Despite that success, it remained a puzzle why the individual activities of the computational media could achieve the global behavior. In this paper, we answer this question by taking a statistical approach that connects the individual activities of computational media to global network behaviors. We show that the individual activities can generate behaviors that are optimized from a particular global viewpoint, namely autonomous rhythm generation and learning of balanced postures, without using global performance indices.
