Efficient Planning for Near-Optimal Compliant Manipulation Leveraging Environmental Contact

Path planning classically focuses on avoiding environmental contact. However, some assembly tasks permit contact through compliance, and such contact may allow for more efficient and reliable solutions under action uncertainty. Yet optimal manipulation plans that leverage environmental contact are difficult to compute: contact produces complex kinematics that complicate planning. This complexity is usually addressed by discretizing the state and action spaces, but discretization quickly becomes computationally intractable. To overcome this challenge, we use the insight that only actions taken from configurations near the contact manifold are likely to involve complex kinematics, while segments of the plan through free space do not. Leveraging this structure greatly reduces the number of states considered and scales much better with problem complexity. We develop an algorithm based on this idea and show that it performs comparably to full MDP solutions at a fraction of the computational cost.
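
The idea can be made concrete with a toy example. The sketch below (a hypothetical illustration, not the paper's implementation) plans for a 1-D peg approaching a wall at x = 0: travel through free space is costed as a single deterministic segment, while the band of configurations near the contact manifold is finely discretized and solved as a small MDP by value iteration under action (slip) uncertainty. Names such as CONTACT_BAND, SLIP_PROB, and plan_cost are assumptions introduced for illustration.

```python
import numpy as np

CONTACT_BAND = 1.0   # configurations within this distance of the wall are "near contact"
DX = 0.05            # fine discretization, used only inside the contact band
GAMMA = 0.99         # discount factor for the near-contact MDP
STEP_COST = DX       # cost of one fine-grid motion step
SLIP_PROB = 0.2      # action uncertainty is only modeled near contact

# Fine grid over the near-contact band [0, CONTACT_BAND]; x = 0 is the wall.
xs = np.linspace(0.0, CONTACT_BAND, int(CONTACT_BAND / DX) + 1)
GOAL_IDX = 0         # goal: peg seated against the wall


def value_iteration(tol=1e-6):
    """Solve the small near-contact MDP: 'push toward wall' may slip and stay put."""
    V = np.zeros(len(xs))
    while True:
        V_new = V.copy()
        for i in range(1, len(xs)):
            # Intended outcome: move one cell toward the wall; slip: stay in place.
            V_new[i] = STEP_COST + GAMMA * ((1 - SLIP_PROB) * V[i - 1] + SLIP_PROB * V[i])
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new


V_contact = value_iteration()


def plan_cost(start_x):
    """Expected cost: deterministic free-space segment plus near-contact MDP value."""
    if start_x <= CONTACT_BAND:
        return V_contact[int(round(start_x / DX))]
    free_space_cost = start_x - CONTACT_BAND   # straight-line motion, no uncertainty modeled
    return free_space_cost + V_contact[-1]     # enter the band at its outer boundary


print(f"expected cost from x = 3.0: {plan_cost(3.0):.3f}")
```

Only the handful of states inside the contact band participate in value iteration; everything outside it collapses to a single distance term, which is what keeps the state count from growing with the size of the free space.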
