The eMOSAIC model for humanoid robot control

In this study, we propose an extension of the MOSAIC architecture for controlling real humanoid robots. MOSAIC was originally proposed by neuroscientists to explain the human capacity for adaptive motor control, and its modular architecture is well suited to nonlinear and non-stationary control problems. Both humans and humanoid robots have nonlinear body dynamics and many degrees of freedom, and because they interact with their environments (e.g., by carrying objects), their controllers must cope with non-stationary dynamics. MOSAIC therefore has strong potential both as a model of human motor control and as a control framework for humanoid robots. However, applications of the MOSAIC model have so far been limited to simple simulated dynamics, since it is susceptible to observation noise and cannot be applied to partially observable systems. Our approach introduces state estimators into the MOSAIC architecture so that it can cope with real environments. Using this extended MOSAIC model, we successfully generate squatting and object-carrying behaviors on a real humanoid robot.
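The core mechanism described above — a set of paired forward and inverse models whose predictions are scored against observations, with the best-matching module's controller given the most weight — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: each module is assumed to be linear-Gaussian, its state estimator is a standard Kalman filter, and the responsibility signal is a softmax over the filters' innovation log-likelihoods. All matrices, gains, and the `Module`/`emosaic_step` names are hypothetical.

```python
import numpy as np

class Module:
    """One eMOSAIC module (illustrative): a linear forward model with a
    Kalman-filter state estimator and a linear feedback controller."""

    def __init__(self, A, B, C, K, Q, R):
        self.A, self.B, self.C, self.K = A, B, C, K  # dynamics and control gain
        self.Q, self.R = Q, R                        # process / observation noise
        self.x = np.zeros(A.shape[0])                # state estimate
        self.P = np.eye(A.shape[0])                  # estimate covariance

    def step(self, u, y):
        """Kalman predict/update; returns the observation log-likelihood,
        which serves as this module's responsibility evidence."""
        x_pred = self.A @ self.x + self.B @ u
        P_pred = self.A @ self.P @ self.A.T + self.Q
        S = self.C @ P_pred @ self.C.T + self.R      # innovation covariance
        v = y - self.C @ x_pred                      # innovation
        loglik = -0.5 * (v @ np.linalg.solve(S, v)
                         + np.log(np.linalg.det(2.0 * np.pi * S)))
        G = P_pred @ self.C.T @ np.linalg.inv(S)     # Kalman gain
        self.x = x_pred + G @ v
        self.P = (np.eye(len(self.x)) - G @ self.C) @ P_pred
        return loglik

    def control(self):
        # Controller acts on the *estimated* state, not the raw observation.
        return -self.K @ self.x

def emosaic_step(modules, u_prev, y):
    """One control cycle: score each module, softmax the scores into
    responsibilities, and blend the module controllers accordingly."""
    logliks = np.array([m.step(u_prev, y) for m in modules])
    w = np.exp(logliks - logliks.max())
    w /= w.sum()
    u = sum(wi * m.control() for wi, m in zip(w, modules))
    return u, w
```

In a loop, the blended command `u` is applied to the plant and the resulting observation is fed back in; the module whose forward model best predicts the observations acquires the highest responsibility, so switching between dynamics regimes (e.g., with and without a carried object) falls out of the likelihood comparison rather than an explicit supervisor.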
