Sequence-based multimodal apprenticeship learning for robot perception and decision making

Apprenticeship learning has recently attracted wide attention due to its capability of allowing robots to learn physical tasks directly from demonstrations provided by human experts. Most previous techniques assumed that the state space is known a priori or employed simple state representations that often suffer from perceptual aliasing. Different from previous research, we propose a novel approach named Sequence-based Multimodal Apprenticeship Learning (SMAL), which is capable of simultaneously fusing temporal information and multimodal data, and of integrating robot perception with decision making. To evaluate the SMAL approach, experiments were performed using both simulations and real-world robots in challenging search and rescue scenarios. The empirical study validated that our SMAL approach can effectively learn plans for robots to make decisions using sequences of multimodal observations. Experimental results also showed that SMAL outperforms baseline methods that use individual images.
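The abstract describes fusing temporal information with multimodal data so that decisions are made from sequences of observations rather than individual images. The sketch below is not the authors' implementation; it is a minimal illustration, under the assumption that each time step provides several per-modality feature vectors, of how such vectors might be concatenated over a short temporal window into a single sequence-based observation. Names such as `window_size` and the toy modality dimensions are illustrative only.

```python
# Minimal sketch (assumed design, not the SMAL implementation): stack per-frame
# multimodal feature vectors over a temporal window into one observation vector.
import numpy as np


def fuse_frame(modalities):
    """Concatenate the feature vectors of all modalities for one time step."""
    return np.concatenate([np.asarray(m, dtype=float).ravel() for m in modalities])


def build_sequence_observation(frames, window_size=5):
    """Fuse the most recent `window_size` frames into a single observation.

    `frames` is a list (oldest first) of per-time-step modality lists; the
    result preserves both multimodal and temporal structure in one vector.
    """
    window = [fuse_frame(f) for f in frames[-window_size:]]
    # If the history is still shorter than the window, repeat the earliest frame.
    while len(window) < window_size:
        window.insert(0, window[0])
    return np.concatenate(window)


if __name__ == "__main__":
    # Toy example: two modalities per time step (e.g., a visual descriptor and
    # a depth/LiDAR descriptor) observed over six time steps.
    rng = np.random.default_rng(0)
    frames = [[rng.normal(size=8), rng.normal(size=4)] for _ in range(6)]
    obs = build_sequence_observation(frames, window_size=5)
    print(obs.shape)  # (60,) = 5 frames x (8 + 4) features
```

In a learning-from-demonstration pipeline, a vector like `obs` would stand in for the single-image state used by the baselines, giving the learner temporal context that helps disambiguate perceptually aliased scenes.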
