Pose estimation, model refinement, and enhanced visualization using video

In this paper we present methods for exploiting and visualizing video, given a prior coarse, untextured polyhedral model of a scene. Because this requires the 3D pose of the moving camera in each frame, we develop an algorithm in which tracked features are used to predict the pose between frames, and each predicted pose is refined by a coarse-to-fine process that aligns projected 3D model line segments to oriented image gradient energy pyramids. The estimated poses can be used to update the model with information derived from the video, and to re-project and visualize the video from different points of view within a larger scene context. Via image registration, we update the placement of objects in the model and the 3D shape of new or erroneously modeled objects, then map video texture onto the model. Experimental results are presented for long aerial and ground-level videos of a large-scale urban scene.
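The alignment step described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a pinhole camera, a single image level instead of a pyramid, central-difference gradients, and a plain search over a handful of candidate translations. All function names and parameters (`project`, `line_energy`, `pose_score`, `best_translation_x`, `f`, `cx`, `cy`) are hypothetical and chosen only for illustration.

```python
import math

def project(point, R, t, f, cx, cy):
    """Pinhole projection of a 3D point under pose (R, t); f, cx, cy are assumed intrinsics."""
    X = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    return (f * X[0] / X[2] + cx, f * X[1] / X[2] + cy)

def gradient(img, x, y):
    """Central-difference image gradient at pixel (x, y), clamped at the borders."""
    h, w = len(img), len(img[0])
    x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
    y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
    gx = (img[y][x1] - img[y][x0]) / 2.0
    gy = (img[y1][x] - img[y0][x]) / 2.0
    return gx, gy

def line_energy(img, p0, p1, samples=20):
    """Mean squared gradient component normal to the 2D segment p0 -> p1.

    Only gradients oriented across the segment contribute, which is the
    'oriented energy' idea in its simplest form.
    """
    h, w = len(img), len(img[0])
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    length = math.hypot(dx, dy)
    if length == 0:
        return 0.0
    nx, ny = -dy / length, dx / length  # unit normal of the segment
    total, count = 0.0, 0
    for k in range(samples + 1):
        s = k / samples
        x = int(round(p0[0] + s * dx))
        y = int(round(p0[1] + s * dy))
        if 0 <= x < w and 0 <= y < h:
            gx, gy = gradient(img, x, y)
            total += (gx * nx + gy * ny) ** 2
            count += 1
    return total / count if count else 0.0

def pose_score(img, segments, R, t, f, cx, cy):
    """Sum of oriented energies over all projected 3D model segments."""
    return sum(
        line_energy(img, project(a, R, t, f, cx, cy), project(b, R, t, f, cx, cy))
        for a, b in segments
    )

def best_translation_x(img, segments, R, t, f, cx, cy, candidates):
    """Pick the x-translation among the candidates that maximizes the score."""
    return max(
        candidates,
        key=lambda tx: pose_score(img, segments, R, (tx, t[1], t[2]), f, cx, cy),
    )

# Synthetic 40x40 image with a vertical step edge at x = 20.
image = [[1.0 if x >= 20 else 0.0 for x in range(40)] for y in range(40)]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
model = [((0.0, -1.0, 0.0), (0.0, 1.0, 0.0))]  # one vertical model edge

predicted = (0.4, 0.0, 10.0)  # stands in for a pose predicted from tracked features
tx = best_translation_x(image, model, identity, predicted, 100.0, 20.0, 20.0,
                        [-0.4, -0.2, 0.0, 0.2, 0.4])
print(tx)
```

In this toy setup the score is zero everywhere except within a pixel of the edge, so a local search started from a poor prediction would get no signal. That narrow capture range is one motivation for running the alignment coarse to fine over pyramids, where the coarse levels respond to larger misalignments.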
