In-hand Object Scanning via RGB-D Video Segmentation

This paper proposes a technique for 3D object scanning via in-hand manipulation, in which an object reoriented in front of a video camera with multiple grasps and regrasps. In-hand object tracking is a significant challenge under fast movement, rapid appearance changes, and occlusions. This paper proposes a novel video-segmentation-based object tracking algorithm that tracks arbitrary in-hand objects more effectively than existing techniques. It also describes a novel RGB-D in-hand object manipulation dataset consisting of several common household objects. Experiments show that the new method achieves 6% increase in accuracy compared to top performing video tracking algorithms and results in noticeably higher quality reconstructed models. Moreover, testing with a novice user on a set of 200 objects demonstrates relatively rapid construction of complete 3D object models.

[1]  Dieter Fox,et al.  A large-scale hierarchical multi-view RGB-D object dataset , 2011, 2011 IEEE International Conference on Robotics and Automation.

[2]  Vladimir Kolmogorov,et al.  An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision , 2004, IEEE Trans. Pattern Anal. Mach. Intell..

[3]  Ming-Hsuan Yang,et al.  SegFlow: Joint Learning for Video Object Segmentation and Optical Flow , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[4]  Shin'ichi Satoh,et al.  VabCut: A video extension of GrabCut for unsupervised video foreground object segmentation , 2014, 2014 International Conference on Computer Vision Theory and Applications (VISAPP).

[5]  Markus Vincze,et al.  RGB-D object modelling for object recognition and tracking , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[6]  Shin'ichi Satoh,et al.  Unsupervised learning of supervoxel embeddings for video Segmentation , 2016, 2016 23rd International Conference on Pattern Recognition (ICPR).

[7]  Vladimir Vezhnevets,et al.  A Survey on Pixel-Based Skin Color Detection Techniques , 2003 .

[8]  Hamid Tairi,et al.  Automatic Human Segmentation in Video Using Convex Active Contours , 2016, 2016 13th International Conference on Computer Graphics, Imaging and Visualization (CGiV).

[9]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[10]  Michael M. Kazhdan,et al.  Screened poisson surface reconstruction , 2013, TOGS.

[11]  Gunnar Farnebäck,et al.  Two-Frame Motion Estimation Based on Polynomial Expansion , 2003, SCIA.

[12]  Luc Van Gool,et al.  One-Shot Video Object Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Vladlen Koltun,et al.  Open3D: A Modern Library for 3D Data Processing , 2018, ArXiv.

[14]  Holly E. Rushmeier,et al.  The 3D Model Acquisition Pipeline , 2002, Comput. Graph. Forum.

[15]  Mei Han,et al.  Efficient hierarchical graph-based video segmentation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[16]  Pascal Fua,et al.  SLIC Superpixels Compared to State-of-the-Art Superpixel Methods , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[17]  Luc Van Gool,et al.  Online loop closure for real-time interactive 3D scanning , 2011, Comput. Vis. Image Underst..

[18]  P. Abbeel,et al.  Benchmarking in Manipulation Research , 2015 .

[19]  Luc Van Gool,et al.  In-hand scanning with online loop closure , 2009, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops.

[20]  Antti Oulasvirta,et al.  Real-Time Joint Tracking of a Hand Manipulating an Object from RGB-D Input , 2016, ECCV.

[21]  Luc Van Gool,et al.  Accurate and robust registration for in-hand modeling , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[22]  Fatih Murat Porikli,et al.  Saliency-aware geodesic video object segmentation , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Dimitrios Tzionas,et al.  3D Object Reconstruction from Hand-Object Interactions , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[24]  Antonis A. Argyros,et al.  3D Tracking of Human Hands in Interaction with Unknown Objects , 2015, BMVC.

[25]  Jenq-Neng Hwang,et al.  Inter-camera tracking based on fully unsupervised online learning , 2017, 2017 IEEE International Conference on Image Processing (ICIP).

[26]  Wolfram Burgard,et al.  G2o: A general framework for graph optimization , 2011, 2011 IEEE International Conference on Robotics and Automation.

[27]  Marc Levoy,et al.  Real-time 3D model acquisition , 2002, ACM Trans. Graph..

[28]  Michael J. Black,et al.  Video Segmentation via Object Flow , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Vladlen Koltun,et al.  Colored Point Cloud Registration Revisited , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[30]  Bastian Leibe,et al.  Online Adaptation of Convolutional Neural Networks for Video Object Segmentation , 2017, BMVC.

[31]  Guoheng Huang,et al.  Unsupervised video co-segmentation based on superpixel co-saliency and region merging , 2016, Multimedia Tools and Applications.

[32]  Yong Jae Lee,et al.  Track and Segment: An Iterative Unsupervised Approach for Video Object Proposals , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Bernt Schiele,et al.  Learning Video Object Segmentation from Static Images , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Dieter Fox,et al.  Manipulator and object tracking for in-hand 3D object modeling , 2011, Int. J. Robotics Res..

[35]  Paul J. Besl,et al.  A Method for Registration of 3-D Shapes , 1992, IEEE Trans. Pattern Anal. Mach. Intell..

[36]  Andrew Blake,et al.  "GrabCut" , 2004, ACM Trans. Graph..