Visual Hull Construction, Alignment and Refinement Across Time

Visual Hull (VH) construction is a popular method of shape estimation. The method, also known as Shape from Silhouette (SFS), approximates the shape of an object from multiple silhouette images by constructing an upper bound of the shape called the Visual Hull. SFS is used in many applications, such as non-invasive 3D object digitization, 3D object recognition and, more recently, human motion tracking and analysis. Though SFS is straightforward to implement and easy to use, it has several limitations. Most existing SFS methods are too slow for real-time applications, and the estimated shape is sensitive to silhouette noise and camera calibration errors. Moreover, the VH is only a conservative approximation of the actual shape of the object, and the approximation can be very coarse when there are only a few cameras. In my thesis, I propose to investigate some of these shortcomings and suggest solutions to overcome them. First, a voxel-based real-time SFS algorithm called SPOT is proposed and its behavior under noisy silhouette images is analyzed. Secondly, the conservative nature of SFS is improved by incorporating silhouette images across time. The improvement is achieved by first estimating the rigid motions between visual hulls formed at different time instants (visual hull alignment) and then combining them (visual hull refinement) to get a tighter bound on the object's shape. The ambiguity issue of visual hull alignment is identified and addressed. This study is presented here in 2D; in the thesis, I propose to extend it to 3D objects. Thirdly, an algorithm that uses color consistency to resolve the alignment ambiguity problem is proposed. This algorithm constructs entities called bounding edges of the VH and uses the Fundamental Theorem of Visual Hull to find points on the surface of the object. Using an idea similar to image alignment, these surface points are used to find the motion between two visual hulls.
Once the rigid motion across time is known, all the silhouette images are treated as being captured at the same time instant and the shape of the object is refined. The algorithm is validated on both synthetic and real data. Finally, the advantages and disadvantages of three representations of the VH (bounding cone intersection, voxels and bounding edges) are discussed.
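
To make the conservative nature of SFS concrete, the following is a minimal 2D sketch of a pixel-based visual hull. It is not the SPOT algorithm, and every name in it (project, to_bins, silhouette, visual_hull, hull_from) is hypothetical. It assumes idealized orthographic cameras whose images are 1D lines, noise-free silhouettes, and a disk-shaped test object; a pixel is kept only if it projects inside the silhouette of every view.

```python
import numpy as np

def project(points, theta):
    """Orthographic projection of 2D points onto a camera's 1D image line.

    theta is the assumed viewing-direction angle; the image axis is taken
    perpendicular to that direction.
    """
    image_axis = np.array([-np.sin(theta), np.cos(theta)])
    return points @ image_axis

def to_bins(points, theta, bin_width=1.0):
    """Quantize projected coordinates into integer image bins (1D pixels)."""
    return np.floor(project(points, theta) / bin_width).astype(int)

def silhouette(object_points, theta):
    """Silhouette of the object in one view: the set of occupied image bins."""
    return np.unique(to_bins(object_points, theta))

def visual_hull(grid_points, views):
    """Keep the grid points that fall inside the silhouette of every view."""
    keep = np.ones(len(grid_points), dtype=bool)
    for theta, sil in views:
        keep &= np.isin(to_bins(grid_points, theta), sil)
    return grid_points[keep]

# Pixel centers of a 20 x 20 grid, and a disk of radius 5 as the true object.
xs, ys = np.meshgrid(np.arange(20), np.arange(20))
grid = np.column_stack([xs.ravel(), ys.ravel()]) + 0.5
obj = grid[np.sum((grid - 10.0) ** 2, axis=1) <= 25.0]

def hull_from(angles):
    """Build the hull from silhouettes of obj seen along the given angles."""
    return visual_hull(grid, [(t, silhouette(obj, t)) for t in angles])

hull2 = hull_from([0.0, np.pi / 2])
hull4 = hull_from([0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4])
print(len(obj), len(hull2), len(hull4))
```

With two views the hull is the bounding square of the disk, and each additional view can only remove pixels, never add them. In this simplified setting, a silhouette captured at another time instant acts like one more entry in the list of views once the rigid motion between the two instants is known, which is the intuition behind refinement across time.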
