Image-based multi-view scene analysis using 'conexels'

Multi-camera environments make it possible to construct volumetric models of a scene, which improves the performance of computer vision algorithms (e.g. by disambiguating occlusions). A direct way to represent the volumetric results of image-based multi-camera analysis is to scan the 3D space with regular voxels. Regular voxelization works well at high spatial resolutions for applications such as volume visualization and rendering of synthetic scenes generated from geometric models, or for representing data obtained by direct 3D capture (e.g. MRI). However, it has a number of drawbacks for visual scene analysis, where direct measurements on 3D voxels are not usually available. In this case, voxel values are instead computed from the analysis of 'projected' image data. In this paper, we first provide some statistics showing that voxels project to 'unbalanced' sets of image data in common multi-view analysis settings. We then propose a 3D geometry for multi-view scene analysis that provides a better balance in the number of pixels used to analyse each elementary volumetric unit. The proposed geometry is non-regular in 3D space but becomes regular once projected onto the camera images, so that the sampling adapts to the images. The aim is to better exploit multi-view image data by balancing its usage across cameras, instead of focusing on a regular sampling of 3D space, for which no direct measurements are available. An efficient recursive algorithm using the proposed geometry is outlined. Experimental results show better balance and higher accuracy for multi-view analysis than regular voxelization under equivalent restrictions.
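
The 'unbalanced' projection of regular voxels can be illustrated with a minimal sketch. It assumes an ideal pinhole camera and a voxel face parallel to the image plane, so that a face of side s at depth z spans roughly f·s/z pixels per side. The focal length, voxel size, depths and the function name `projected_pixel_count` are illustrative assumptions, not values or code from the paper.

```python
def projected_pixel_count(focal_px, voxel_side_m, depth_m):
    """Approximate pixel area covered by a fronto-parallel voxel face.

    Assumes an ideal pinhole camera (hypothetical setup, not the paper's):
    a segment of length s at depth z projects to f * s / z pixels.
    """
    side_px = focal_px * voxel_side_m / depth_m
    return side_px ** 2


# Illustrative values only: 800 px focal length, 2 cm voxels.
focal_px = 800.0
voxel_side_m = 0.02
depths_m = [1.0, 2.0, 4.0]

counts = [projected_pixel_count(focal_px, voxel_side_m, z) for z in depths_m]

for z, n in zip(depths_m, counts):
    print(f"depth {z:.1f} m -> about {n:.0f} pixels")
```

Under these assumptions, identical voxels at 1 m and 4 m differ by a factor of 16 in the number of pixels that support them, which is the kind of imbalance an image-adapted geometry is meant to reduce.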
