Learning Effective Navigational Strategies for Active Monocular Simultaneous Localization and Mapping

Simultaneous Localization and Mapping (SLAM) refers to the problem of mapping an unknown environment in which a robot is operating while simultaneously localizing the robot within that environment. Among the various methods of performing SLAM, using a single monocular camera as the sole sensory input is highly preferred due to its simplicity and low power consumption. Range sensors such as laser range finders and depth cameras require much more power to operate, and performing SLAM with them is more computationally intensive than SLAM with a single camera. However, compared to trajectory planning methods using depth-based SLAM, monocular SLAM in the planning loop does need additional considerations. One main reason is that, for a robust optimization of the map and robot trajectory using Bundle Adjustment (BA), as in most monocular SLAM methods, the SLAM system needs to scan the area for a reasonable duration to gather enough information to improve the map and pose estimates. Additionally, due to the way monocular SLAM methods work, they do not tolerate large camera rotations between successive views and tend to break down. Other causes of monocular SLAM failure include ambiguities in the decomposition of the Essential Matrix, feature-sparse scenes, and additional layers of non-linear optimization beyond BA.

Learning a complex task such as low-level robot manoeuvres while preventing failure of monocular SLAM is a challenging problem for both robots and humans. The data-driven identification of basic motion strategies for preventing monocular SLAM failure is a largely unexplored problem. In this thesis, a computational model is devised for representing and inferring strategies for this problem, formulated as a Markov Decision Process (MDP), where the reward function models both the goal of the task and information about the strategy.
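The MDP formulation mentioned above can be summarised in standard textbook notation; this is generic background, not notation quoted from the thesis. An MDP is a tuple \((\mathcal{S}, \mathcal{A}, T, R, \gamma)\) of states, actions, transition dynamics, reward, and discount factor, and the agent seeks a policy maximizing expected discounted return:

```latex
% Standard MDP objective (textbook notation, not verbatim from the thesis):
% the optimal policy maximizes the expected discounted sum of rewards.
\[
  \pi^{*} \;=\; \arg\max_{\pi}\;
  \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, R(s_t, a_t)\right],
  \qquad 0 \le \gamma < 1 .
\]
```

In the RL setting, \(R\) is handcrafted; in the IRL setting discussed below, \(R\) is the unknown quantity recovered from expert demonstrations.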
Reinforcement Learning (RL) is used with an intuitive, handcrafted reward function to generate fail-safe trajectories in which the SLAM-generated outputs (scene structure and camera motion) do not deviate significantly from their true values. This model is then expanded upon by treating it as an expert and learning an underlying true reward function for the task at hand using Inverse Reinforcement Learning (IRL). Essentially, the framework successfully learns the otherwise complex relation between motor actions and perceptual inputs, a relation that is almost intractable to capture in an explicit mathematical formulation, and uses this knowledge to generate trajectories that do not cause failure of SLAM. The framework also allows one to identify how a few chosen parameters affect the quality of monocular SLAM estimates. The estimated reward function was able to capture expert demonstration information and the inherent expert strategy.
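As an illustration only, the following toy sketch shows the flavour of RL with a handcrafted reward that penalises SLAM-breaking motions, in the spirit of the fail-safe trajectories described above. The state space, action set, and reward values here are all invented for this example and are not taken from the thesis; tabular Q-learning stands in for whatever RL algorithm the thesis actually uses.

```python
import numpy as np

# Hypothetical toy problem: states are positions of a camera on a cyclic
# 1-D track; actions are {move forward, small turn, large turn}.  The
# handcrafted reward (an assumption for this sketch) penalises large
# rotations between successive views, which tend to break monocular SLAM.
N_STATES, N_ACTIONS = 8, 3
rng = np.random.default_rng(0)

def step(s, a):
    """Return (next_state, reward) for the toy dynamics."""
    if a == 0:                              # move forward
        return (s + 1) % N_STATES, 1.0      # reward for making progress
    if a == 1:                              # small rotation
        return s, -0.1                      # mild cost
    return s, -5.0                          # large, SLAM-breaking rotation

# Tabular Q-learning with an epsilon-greedy exploration policy.
Q = np.zeros((N_STATES, N_ACTIONS))
alpha, gamma, eps = 0.5, 0.9, 0.1
s = 0
for _ in range(5000):
    a = int(rng.integers(N_ACTIONS)) if rng.random() < eps else int(Q[s].argmax())
    s2, r = step(s, a)
    Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
    s = s2

# The learned greedy policy avoids the penalised large rotations.
policy = Q.argmax(axis=1)
```

After training, the greedy policy selects the forward action in every state, i.e. it has learned to avoid the motions that the reward marks as SLAM-breaking. IRL inverts this picture: given trajectories from such a policy, it recovers a reward function under which the demonstrated behaviour is near-optimal.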
