Learning by imitation with the STIFF-FLOP surgical robot: a biomimetic approach inspired by octopus movements

Transferring skills from a biological organism to a hyper-redundant system is challenging, especially when the two agents differ greatly in structure and embodiment and operate in different environments. In this article, we address this problem by designing motion primitives in the form of probabilistic dynamical systems. We take inspiration from invertebrates to seek versatile representations of motion and behavior primitives for continuum robots. We take the perspective that the varied skills achieved by the octopus can guide roboticists toward the design of robust motor skill encoding schemes. We present our ongoing work, which combines statistical machine learning, dynamical systems, and stochastic optimization to study the transfer of movement patterns from an octopus arm to a flexible surgical robot (STIFF-FLOP) composed of two modules with constant curvatures. The approach is tested in simulation through imitation and self-refinement of an octopus reaching motion.
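To make the "two modules with constant curvatures" description concrete, the following is a minimal sketch of forward kinematics for a chain of constant-curvature arcs. It assumes a planar simplification in which each module is described only by a curvature and an arc length; the function names (`segment_tip`, `two_module_tip`) and this parameterization are illustrative assumptions, not the model used in the article.

```python
import math

def segment_tip(kappa, length):
    """Tip offset (dx, dy) and heading change of one planar constant-curvature arc.

    The arc starts at the origin, tangent to the x-axis (illustrative convention).
    """
    theta = kappa * length  # total bending angle of the arc
    if abs(kappa) < 1e-9:
        # Near-zero curvature: the arc degenerates to a straight segment.
        return length, 0.0, theta
    return math.sin(theta) / kappa, (1.0 - math.cos(theta)) / kappa, theta

def two_module_tip(kappas, lengths):
    """Chain the arcs base to tip; returns (x, y, heading) of the final tip."""
    x = y = heading = 0.0
    for kappa, length in zip(kappas, lengths):
        dx, dy, dtheta = segment_tip(kappa, length)
        # Rotate the local offset into the base frame before accumulating.
        x += math.cos(heading) * dx - math.sin(heading) * dy
        y += math.sin(heading) * dx + math.cos(heading) * dy
        heading += dtheta
    return x, y, heading

if __name__ == "__main__":
    # First module bent by 90 degrees over unit length, second module straight.
    print(two_module_tip((math.pi / 2, 0.0), (1.0, 1.0)))
```

Under this simplification the robot's shape is fully described by one curvature and one length per module, which is the kind of low-dimensional description onto which an octopus reaching motion would have to be mapped.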
