Smooth Mesh Estimation from Depth Data using Non-Smooth Convex Optimization

Meshes are commonly used as 3D maps since they encode the topology of the scene while being lightweight. Unfortunately, 3D meshes are mathematically difficult to handle directly because of their combinatorial and discrete nature. Therefore, most approaches generate a 3D mesh of a scene only after fusing depth data into a volumetric or other intermediate representation. Volumetric fusion, however, remains computationally expensive in terms of both runtime and memory. In this paper, we bypass these intermediate representations and build a 3D mesh directly from a depth map and the sparse landmarks triangulated with visual odometry. To this end, we formulate a non-smooth convex optimization problem that we solve using a primal-dual method. Our approach generates a smooth and accurate 3D mesh that substantially improves on the state of the art in direct mesh reconstruction while running in real time.
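
To illustrate what solving a non-smooth convex problem with a primal-dual method can look like, the sketch below applies a first-order primal-dual iteration of the Chambolle-Pock type to a deliberately simple stand-in problem: total-variation smoothing of noisy values along a 1D chain, min_x 0.5*||x - b||^2 + lam*||Kx||_1, where K takes differences between neighbours. This is not the formulation used in the paper; the objective, the function names, the step sizes, and the parameter lam are all assumptions chosen for illustration only.

```python
import numpy as np


def forward_diff(x):
    """K x: differences between neighbouring samples on a chain."""
    return x[1:] - x[:-1]


def forward_diff_adjoint(y, n):
    """K^T y for the forward-difference operator acting on n samples."""
    out = np.zeros(n)
    out[1:] += y
    out[:-1] -= y
    return out


def energy(x, b, lam):
    """Objective: quadratic data term plus non-smooth L1 smoothness term."""
    return 0.5 * np.sum((x - b) ** 2) + lam * np.sum(np.abs(forward_diff(x)))


def tv_denoise_1d(b, lam=0.5, n_iter=300):
    """Primal-dual iteration for the illustrative problem (assumed setup)."""
    n = b.size
    # Step sizes satisfy tau * sigma * ||K||^2 <= 1, since ||K||^2 <= 4 here.
    tau = sigma = 0.5
    x = b.copy()            # primal variable
    x_bar = b.copy()        # extrapolated primal variable
    y = np.zeros(n - 1)     # dual variable, one entry per chain edge
    for _ in range(n_iter):
        # Dual step: the prox of the conjugate of lam*||.||_1 is a projection
        # onto the box [-lam, lam].
        y = np.clip(y + sigma * forward_diff(x_bar), -lam, lam)
        # Primal step: closed-form prox of the quadratic data term.
        x_new = (x - tau * forward_diff_adjoint(y, n) + tau * b) / (1.0 + tau)
        # Over-relaxation with theta = 1.
        x_bar = 2.0 * x_new - x
        x = x_new
    return x
```

Calling tv_denoise_1d on a noisy step signal returns a piecewise-smooth estimate that keeps the step edge. On a mesh, the same pattern would apply with K defined over mesh edges instead of a chain, which is again an assumption about structure rather than a description of the paper's implementation.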
