Inferring 3D Volumetric Shape of Both Moving Objects and Static Background Observed by a Moving Camera

We present a novel approach to inferring the 3D volumetric shape of both moving objects and static background from video sequences shot by a moving camera, under the assumption that the objects move rigidly on a ground plane. The 3D scene is divided into a set of volume elements, termed voxels, organized in an adaptive octree structure. At each time instant, each voxel is assigned one of three labels: empty, background structure, or moving object. Shape inference is then formulated as assigning each voxel a dynamic label that minimizes the photo and motion variance between the voxels and the original sequence. We propose a three-step voxel labeling method based on a robust photo-motion variance measure. First, a sparse set of surface points is used to initialize a subset of voxels. Then, a deterministic voxel coloring scheme carves away the voxels with large variance. Finally, the labeling results are refined by a graph-cut-based optimization method to enforce global smoothness. Experimental results on both indoor and outdoor sequences demonstrate the effectiveness and robustness of our method.
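As a rough illustration of the second (carving) step only, the sketch below relabels a voxel as empty when the colors it projects to across views disagree too much. It is a minimal sketch under stated assumptions, not the method of the paper: the names `Label`, `photo_variance`, and `carve`, the dictionary layout, and the threshold value are all hypothetical, the measure is a plain mean per-channel variance rather than the robust photo-motion variance, and visibility, the motion term, and the octree are omitted.

```python
from enum import Enum
from statistics import pvariance


class Label(Enum):
    """The three voxel labels named in the abstract."""
    EMPTY = 0
    BACKGROUND = 1
    MOVING = 2


def photo_variance(samples):
    """Mean per-channel population variance of the RGB samples of one voxel.

    `samples` is a list of (r, g, b) tuples, one per view in which the voxel
    is assumed visible. This is a simplified stand-in for a photo-consistency
    measure, not the robust measure used in the paper.
    """
    if len(samples) < 2:
        return 0.0  # a single observation cannot be inconsistent
    channels = list(zip(*samples))
    return sum(pvariance(c) for c in channels) / len(channels)


def carve(voxels, threshold):
    """Relabel voxels whose photo variance exceeds `threshold` as EMPTY.

    `voxels` maps a voxel id to (initial_label, samples). Returns a new
    dict mapping voxel id to its label after carving.
    """
    result = {}
    for vid, (label, samples) in voxels.items():
        if label is not Label.EMPTY and photo_variance(samples) > threshold:
            result[vid] = Label.EMPTY
        else:
            result[vid] = label
    return result


# Hypothetical input: voxel "a" is seen with nearly identical colors,
# voxel "b" with wildly different ones.
voxels = {
    "a": (Label.BACKGROUND, [(100, 100, 100), (102, 99, 101)]),
    "b": (Label.MOVING, [(0, 0, 0), (255, 255, 255)]),
}
result = carve(voxels, threshold=50.0)
```

In this toy input, voxel "a" keeps its label and voxel "b" is carved away. In a full pipeline, the surviving labels would then be passed to a global smoothness optimization such as the graph-cut refinement described above.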
