Reconstructing non-stationary articulated objects in monocular video using silhouette information

This paper presents an approach to reconstructing non-stationary, articulated objects from silhouettes obtained from a monocular video sequence. We introduce the concept of motion-blurred scene occupancies, a direct analogy of motion-blurred images but in a 3D scene occupancy space, where the blur results from the motion or deformation of the object. Our approach starts with an image-based fusion step that combines color and silhouette information from multiple views. To this end we propose a novel construct: the temporal occupancy point (TOP), which is the estimated 3D scene location of a silhouette pixel and carries information about the duration of time for which that location is occupied. Instead of explicitly computing the TOP in 3D space, we directly obtain its imaged (projected) locations in each view. This enables us to handle monocular video and arbitrary camera motion in scenarios where complete camera calibration information may not be available. The result is a set of blurred scene occupancy images in the corresponding views, where the value at each pixel corresponds to the fraction of the total time duration for which the pixel observed an occupied scene location. We then use a motion de-blurring approach to de-blur the occupancy images. The de-blurred occupancy images correspond to silhouettes of the mean/motion-compensated object shape and are used to obtain a visual hull reconstruction of the object. We show promising results on challenging monocular datasets of deforming objects where traditional visual hull intersection approaches fail to reconstruct the object correctly.
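To make the notion of a blurred occupancy image concrete, the following is a minimal sketch and not the paper's method. It assumes the silhouettes have already been warped into one common reference view (the paper obtains projected TOP locations directly; that step is not reproduced here). The function names `blurred_occupancy` and `threshold_occupancy` are hypothetical, and the simple thresholding is only a stand-in for the motion de-blurring step described above.

```python
import numpy as np


def blurred_occupancy(silhouettes):
    """Per-pixel fraction of frames in which the pixel is occupied.

    Assumption: `silhouettes` is a (T, H, W) stack of binary masks that are
    already aligned to a single reference view.
    """
    stack = np.asarray(silhouettes, dtype=float)
    if stack.ndim != 3:
        raise ValueError("expected a stack of shape (T, H, W)")
    # Averaging over time gives the occupied fraction of the total duration.
    return stack.mean(axis=0)


def threshold_occupancy(occupancy, tau=0.5):
    """Crude stand-in for de-blurring: keep pixels occupied at least `tau` of the time."""
    return np.asarray(occupancy) >= tau


# Toy example: a 2-pixel-wide object sliding right across a 1x5 strip.
frames = np.array([
    [[1, 1, 0, 0, 0]],
    [[0, 1, 1, 0, 0]],
    [[0, 0, 1, 1, 0]],
])
occ = blurred_occupancy(frames)      # values lie in [0, 1]
mask = threshold_occupancy(occ, 0.5)  # approximate "mean shape" silhouette
```

In this toy case the two central pixels are occupied in two of the three frames, so they survive the threshold, while the pixels touched only once are discarded. A real de-blurring step would estimate and invert the blur rather than apply a fixed cut-off.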
