Range sensor and silhouette fusion for high-quality 3D scanning

We consider the problem of building high-quality 3D object models from commodity RGB and depth sensors. Applications of a database of such models include instance and object recognition, robot grasping, virtual reality, graphics, and online shopping. Unfortunately, modern reconstruction approaches struggle with objects that have major transparencies (a failure case for depth-based methods such as KinectFusion [26]) or concavities (a failure case for visual hull methods). This paper presents a method that fuses visual hull information from off-the-shelf RGB cameras with KinectFusion cues from commodity depth sensors to produce models that are substantially better than those from either approach on its own. Extensive experiments on the recently published BigBIRD dataset [32] demonstrate that our reconstructions recover more accurate shape and detail than competing approaches, particularly on challenging objects with transparencies and/or concavities. Quantitative evaluations indicate that our approach consistently outperforms competing methods and achieves under 2 mm RMS error. We plan to release our code after the review process.
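To give a rough intuition for why silhouette and depth cues are complementary, the toy 2D sketch below carves a grid with two orthographic silhouettes and then removes cells that a top-down depth ray has seen through. It is not the method of this paper: the grid, the function names (`silhouettes`, `visual_hull`, `depth_from_top`, `fuse`), the orthographic views, and the "no depth return on transparent columns" model are all assumptions made purely for illustration.

```python
from typing import List, Optional, Set, Tuple

Cell = Tuple[int, int]  # (x, y) cell in a small 2D grid; y grows upward


def silhouettes(obj: Set[Cell]) -> Tuple[Set[int], Set[int]]:
    """Orthographic silhouettes (assumed views): shadows on the x and y axes."""
    return {x for x, _ in obj}, {y for _, y in obj}


def visual_hull(sil_x: Set[int], sil_y: Set[int]) -> Set[Cell]:
    """Keep every cell whose projections fall inside both silhouettes."""
    return {(x, y) for x in sil_x for y in sil_y}


def depth_from_top(obj: Set[Cell], size: int,
                   transparent_cols: Set[int]) -> List[Optional[int]]:
    """Toy depth sensor looking down: per column, the y of the first surface hit.

    Returns None where nothing is hit or where the column is marked
    transparent (a hypothetical stand-in for missing depth returns).
    """
    depth: List[Optional[int]] = []
    for x in range(size):
        ys = [y for (cx, y) in obj if cx == x]
        depth.append(None if not ys or x in transparent_cols else max(ys))
    return depth


def fuse(hull: Set[Cell], depth: List[Optional[int]]) -> Set[Cell]:
    """Drop hull cells that a depth ray passed through; keep the rest."""
    fused = set()
    for (x, y) in hull:
        d = depth[x]
        if d is not None and y > d:
            continue  # observed as free space by the depth sensor
        fused.add((x, y))
    return fused


# A U-shaped object: two walls and a floor, with a concavity at x == 2.
obj = {(1, 1), (1, 2), (1, 3), (2, 1), (3, 1), (3, 2), (3, 3)}
hull = visual_hull(*silhouettes(obj))
fused = fuse(hull, depth_from_top(obj, size=5, transparent_cols=set()))
print(len(hull), len(fused))  # prints "9 7"
```

In this toy setting the silhouettes alone fill in the concavity, while the depth rays carve it back out. When a column gives no depth return, `fuse` simply keeps the hull there, so the silhouette still bounds the shape where depth is missing.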

[1] Bruce G. Baumgart et al. Geometric modeling for computer vision, 1974.

[2] William E. Lorensen et al. Marching cubes: A high resolution 3D surface construction algorithm, 1987, SIGGRAPH.

[3] Jules Bloomenthal et al. Polygonization of implicit surfaces, 1988, Comput. Aided Geom. Des.

[4] Robert T. Collins et al. A space-sweep approach to true multi-image matching, 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[5] Andrew H. Gee et al. Regularised marching tetrahedra: improved iso-surface extraction, 1999, Comput. Graph.

[6] Ramesh Raskar et al. Image-based visual hulls, 2000, SIGGRAPH.

[7] Leif Kobbelt et al. √3-subdivision, 2000, SIGGRAPH.

[8] Francis Schmitt et al. Silhouette and stereo fusion for 3D object modeling, 2003, Fourth International Conference on 3-D Digital Imaging and Modeling (3DIM 2003).

[9] Edmond Boyer et al. Exact polyhedral visual hulls, 2003, BMVC.

[10] C. Strecha et al. Wide-baseline stereo from multiple views: A probabilistic account, 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004).

[11] Anthon Voigt et al. Visual hulls from single uncalibrated snapshots using two planar mirrors, 2004.

[12] Michael M. Kazhdan et al. Poisson surface reconstruction, 2006, SGP '06.

[13] Richard Szeliski et al. A Comparison and Evaluation of Multi-View Stereo Reconstruction Algorithms, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[14] Jean Ponce et al. Projective Visual Hulls, 2007, International Journal of Computer Vision.

[15] Sergei Vassilvitskii et al. k-means++: the advantages of careful seeding, 2007, SODA '07.

[16] Jean Ponce et al. Carved Visual Hulls for Image-Based Modeling, 2006, International Journal of Computer Vision.

[17] Nassir Navab et al. Efficient visual hull computation for real-time 3D reconstruction using CUDA, 2008, 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[18] Jean-Philippe Pons et al. Robust and Efficient Surface Reconstruction From Range Data, 2009, Comput. Graph. Forum.

[19] Jean Ponce et al. Accurate, Dense, and Robust Multiview Stereopsis, 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20] S. Süsstrunk et al. SLIC Superpixels, 2010.

[21] Michael Goesele et al. Surface Reconstruction from Multi-resolution Sample Points, 2011, VMV.

[22] Daniel Cremers et al. Real-time visual odometry from dense RGB-D images, 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[23] Tomás Pajdla et al. 3D with Kinect, 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[24] Tomás Pajdla et al. Multi-view reconstruction preserving weakly-supported surfaces, 2011, CVPR 2011.

[25] Mario Fritz et al. Improving the Kinect by Cross-Modal Stereo, 2011, BMVC.

[26] Andrew W. Fitzgibbon et al. KinectFusion: Real-time dense surface mapping and tracking, 2011, 2011 10th IEEE International Symposium on Mixed and Augmented Reality.

[27] John J. Leonard et al. Kintinuous: Spatially Extended KinectFusion, 2012, AAAI 2012.

[28] Jianfei Cai et al. Kinect-Based Easy 3D Object Reconstruction, 2012, PCM.

[29] Vladlen Koltun et al. Elastic Fragments for Dense Scene Reconstruction, 2013, 2013 IEEE International Conference on Computer Vision.

[30] Andrea Fossati et al. Consumer Depth Cameras for Computer Vision, 2013, Advances in Computer Vision and Pattern Recognition.

[31] Vladlen Koltun et al. Simultaneous Localization and Calibration: Self-Calibration of Consumer Depth Cameras, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[32] Pieter Abbeel et al. BigBIRD: A large-scale 3D database of object instances, 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).

[33] David C. Schneider. Visual Hull, 2014, Computer Vision, A Reference Guide.