On Calibration and Alignment of Point Clouds in a Network of RGB-D Sensors for Tracking

This paper investigates the integration of multiple time-of-flight (ToF) depth sensors for general 3D tracking and, more specifically, for tracking of the hands. The advantages of a multi-sensor network are increased viewing coverage and the ability to capture a more complete 3D point cloud representation of the object. Given an ideal point cloud representation, tracking can be accomplished without first reconstructing a mesh representation of the object. In a network of depth sensors, calibration between the sensors and the subsequent alignment of the point cloud data pose key challenges. While there has been research on merging and aligning scenes containing larger objects such as the human body, little work has focused on smaller and more complicated objects such as the human hand. This paper presents a study of methods for merging and aligning the point clouds from a network of sensors so that objects and features can be tracked in the combined point cloud.
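
To make the alignment step concrete, the following is a minimal sketch (not the authors' implementation) of refining the rigid transform between two sensors' point clouds with ICP using the Point Cloud Library; the file names and convergence parameters are assumptions chosen for illustration.

```cpp
// Minimal sketch: pairwise refinement of two depth sensors' clouds with ICP.
// Assumptions: PCL is available; input files and parameters are illustrative.
#include <pcl/io/pcd_io.h>
#include <pcl/point_types.h>
#include <pcl/registration/icp.h>
#include <iostream>

int main() {
  using Cloud = pcl::PointCloud<pcl::PointXYZ>;
  Cloud::Ptr source(new Cloud), target(new Cloud), aligned(new Cloud);

  // Hypothetical inputs: one frame from each of two sensors, already
  // brought into rough agreement by an extrinsic calibration.
  pcl::io::loadPCDFile("sensor_a.pcd", *source);
  pcl::io::loadPCDFile("sensor_b.pcd", *target);

  // ICP estimates the residual rigid transform that maps source onto target.
  pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ> icp;
  icp.setInputSource(source);
  icp.setInputTarget(target);
  icp.setMaximumIterations(50);            // illustrative iteration cap
  icp.setMaxCorrespondenceDistance(0.05);  // 5 cm gate; scene-dependent
  icp.align(*aligned);

  if (icp.hasConverged()) {
    std::cout << "Fitness: " << icp.getFitnessScore() << "\n"
              << "Transform:\n" << icp.getFinalTransformation() << "\n";
  }
  return 0;
}
```

In a network of more than two sensors, a pairwise refinement like this would be run between each sensor and a common reference sensor, and the resulting transforms composed to express every cloud in one shared frame before merging.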
