Asynchronous Collaborative Localization by Integrating Spatiotemporal Graph Learning with Model-Based Estimation

Collaborative localization is an essential capability for a team of robots, such as connected vehicles, to estimate object locations from multiple perspectives through reliable cooperation. To enable collaborative localization, four key challenges must be addressed: modeling complex relationships between observed objects, fusing observations from an arbitrary number of collaborating robots, quantifying localization uncertainty, and handling the latency of robot communications. In this paper, we introduce a novel approach that integrates uncertainty-aware spatiotemporal graph learning with model-based state estimation so that a team of robots can collaboratively localize objects. Specifically, we introduce a new uncertainty-aware graph learning model that learns spatiotemporal graphs to represent the historical motions of the objects observed by each robot over time and provides uncertainties in object localization. Moreover, we propose a novel method for integrated learning and model-based state estimation, which fuses asynchronous observations obtained from an arbitrary number of robots for collaborative localization. We evaluate our approach in two collaborative object localization scenarios, both in simulation and on real robots. Experimental results show that our approach outperforms previous methods and achieves state-of-the-art performance on asynchronous collaborative localization.
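To make the idea of fusing asynchronous, uncertainty-tagged observations from an arbitrary number of robots concrete, the following is a minimal sketch and not the method proposed in this paper. It assumes each robot reports a 2D position estimate with independent per-axis variances, and it substitutes plain inverse-variance weighting with a linear, age-based variance inflation for the learned model and state estimator. The names `Observation`, `fuse`, `age`, and `process_noise` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Observation:
    """One robot's estimate of an object's position (hypothetical structure)."""
    position: Tuple[float, float]  # estimated (x, y) in a shared frame
    variance: Tuple[float, float]  # per-axis variance reported with the estimate
    age: float                     # seconds elapsed since the observation was made


def fuse(observations: List[Observation],
         process_noise: float = 0.5) -> Tuple[Tuple[float, float], Tuple[float, float]]:
    """Fuse any number of observations by inverse-variance weighting.

    Older observations are trusted less: their variance is inflated by
    process_noise * age before weighting (an assumed, simplistic latency model).
    Returns the fused position and the fused per-axis variance.
    """
    if not observations:
        raise ValueError("need at least one observation")
    fused_pos, fused_var = [], []
    for axis in range(2):
        information = 0.0  # sum of inverse variances
        weighted = 0.0     # sum of positions weighted by inverse variance
        for ob in observations:
            var = ob.variance[axis] + process_noise * ob.age
            information += 1.0 / var
            weighted += ob.position[axis] / var
        fused_pos.append(weighted / information)
        fused_var.append(1.0 / information)
    return (fused_pos[0], fused_pos[1]), (fused_var[0], fused_var[1])


if __name__ == "__main__":
    fresh = Observation(position=(0.0, 0.0), variance=(1.0, 1.0), age=0.0)
    stale = Observation(position=(4.0, 4.0), variance=(1.0, 1.0), age=4.0)
    print(fuse([fresh, stale]))
```

In this sketch the stale observation is down-weighted rather than discarded, so the fused estimate stays closer to the fresh one while the fused variance is still smaller than that of either observation alone.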
