Paper information - Reinforcement Learning in Topology-based Representation for Human Body Movement with Whole Arm Manipulation

Moving a human body or a large, bulky object may require the strength of whole arm manipulation (WAM). This type of manipulation places the load on the robot's arms and relies on global properties of the interaction to succeed, rather than on local contacts such as grasping or non-prehensile pushing. In this paper, we learn to generate motions that enable WAM for holding and transporting humans in certain rescue or patient care scenarios. We model the task as a reinforcement learning problem so that the robot's behavior can respond directly to external perturbations and human motion. To this end, we represent global properties of the robot-human interaction with topology-based coordinates that are computed from arm and torso positions. These coordinates also allow the learned policy to be transferred to other body shapes and sizes. For training and evaluation, we simulate a dynamic sea rescue scenario and show in quantitative experiments that the policy can solve unseen scenarios with differently shaped humans, floating humans, or perception noise. Our qualitative experiments show that the subsequent transport after holding is achieved, and we demonstrate that the policy can be directly transferred to a real-world setting.
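The abstract does not state how the topology-based coordinates are computed. As a rough illustration only, one quantity of this kind is the Gauss linking integral between two curves, which measures how much one curve winds around the other regardless of their exact shape or size. The sketch below approximates it for two polylines, which could stand in for an arm and a torso axis. The function names `gauss_linking_integral` and `make_circle`, the midpoint discretization, and the choice of this particular quantity are all assumptions made for illustration, not the paper's method.

```python
import numpy as np


def gauss_linking_integral(curve_a, curve_b):
    """Approximate the Gauss linking integral between two polylines.

    curve_a, curve_b: arrays of shape (n+1, 3) and (m+1, 3) holding
    consecutive 3D points. Hypothetical helper, not taken from the paper.
    """
    a = np.asarray(curve_a, dtype=float)
    b = np.asarray(curve_b, dtype=float)

    # Segment direction vectors and segment midpoints.
    da, db = a[1:] - a[:-1], b[1:] - b[:-1]
    ma, mb = 0.5 * (a[1:] + a[:-1]), 0.5 * (b[1:] + b[:-1])

    # Pairwise offsets between midpoints, shape (n, m, 3).
    r = ma[:, None, :] - mb[None, :, :]
    # Pairwise cross products of the segment vectors, shape (n, m, 3).
    cross = np.cross(da[:, None, :], db[None, :, :])

    # Midpoint-rule integrand: r . (da x db) / |r|^3 for each segment pair.
    dist_cubed = np.linalg.norm(r, axis=-1) ** 3
    integrand = np.einsum("ijk,ijk->ij", r, cross) / dist_cubed
    return integrand.sum() / (4.0 * np.pi)


def make_circle(center, axis_u, axis_v, radius=1.0, segments=200):
    """Closed polyline sampling a circle spanned by unit vectors axis_u, axis_v."""
    t = np.linspace(0.0, 2.0 * np.pi, segments + 1)
    u = np.asarray(axis_u, dtype=float)
    v = np.asarray(axis_v, dtype=float)
    offsets = np.outer(np.cos(t), u) + np.outer(np.sin(t), v)
    return np.asarray(center, dtype=float) + radius * offsets


# Two interlocked circles (a Hopf link) and two well-separated circles.
ring = make_circle([0, 0, 0], [1, 0, 0], [0, 1, 0])
linked = make_circle([1, 0, 0], [1, 0, 0], [0, 0, 1])
apart = make_circle([5, 0, 0], [1, 0, 0], [0, 0, 1])

lk_linked = gauss_linking_integral(ring, linked)
lk_apart = gauss_linking_integral(ring, apart)
```

For the interlocked pair the magnitude of the result should be close to 1, and for the separated pair it should be close to 0. Scaling both curves by the same factor leaves the value unchanged, which is consistent with the abstract's point that a topology-based representation can carry over to other body shapes and sizes.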
