Data Association and Localization of Classified Objects in Visual SLAM

Maps generated by many visual Simultaneous Localization and Mapping (SLAM) algorithms consist of geometric primitives such as points, lines, or planes. These maps offer a topographic representation of the environment but contain no semantic information about it. Object classifiers leveraging advances in machine learning are highly accurate and reliable, capable of detecting and classifying thousands of objects. Such classifiers can be incorporated into a SLAM pipeline to add semantic information to a scene. Frequently, however, this semantic labeling is performed independently on each image frame, so the labels do not persist over time. In this work, we present a nonparametric statistical approach to associate objects detected over consecutive image frames. The associated classified objects are then localized in the accrued map using an unsupervised clustering method. We test our approach on multiple data sets, where it shows strong performance in terms of objects correctly associated from frame to frame. We also tested our algorithm on three data sets collected in our lab environment, using tag markers to demonstrate the accuracy of the classified object localization process.
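As an illustration of how a nonparametric test can drive frame-to-frame association, the sketch below compares two samples drawn from detections in consecutive frames and treats them as the same object when the test fails to reject the hypothesis that both samples come from the same distribution. The abstract does not specify the test or the features used; the Mann-Whitney U test, the scalar samples (for example, depths of feature points inside each bounding box), the function names, and the significance level `alpha` are all assumptions made for this example, not the authors' implementation.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, two-sided p-value) using the normal approximation.

    No tie correction is applied to the variance, so the p-value is
    only approximate for small or heavily tied samples.
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    r1 = sum(ranks[:n1])
    u1 = r1 - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)
    mean = n1 * n2 / 2.0
    std = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / std
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p


def same_object(sample_a, sample_b, alpha=0.05):
    """Hypothetical association rule: match when the test does not reject."""
    _, p = mann_whitney_u(sample_a, sample_b)
    return p > alpha


# Hypothetical per-detection samples from two consecutive frames.
frame_t = [1.0 + 0.1 * i for i in range(10)]
frame_t1_far = [5.0 + 0.1 * i for i in range(10)]
print(same_object(frame_t, frame_t))       # identical samples
print(same_object(frame_t, frame_t1_far))  # clearly separated samples
```

A rank-based test is a natural fit here because it makes no assumption about the shape of the underlying distribution, which is rarely Gaussian for measurements taken from image detections.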