A trifocal sensor system for 3D reconstruction of architectural models

This article describes a fully automated video sensor system for the 3D reconstruction of architectural models from image sequences. The hybrid system combines triangulation across three high-resolution cameras (spatial stereo) with feature tracking over time (temporal stereo). Because standard tracking techniques suffer from outliers between consecutive images, we improve outlier detection by using temporal as well as spatial trifocal information. These spatio-temporal constraints also allow the system to perform guided matching, which increases the number of tracked features and is robust to partial occlusions. The fully automatic and reliable calculation of the camera path from those tie points is still a challenging task. Using multiple calibrated cameras fixed on a rig introduces additional constraints that significantly stabilize the pose estimation process. To achieve a dense surface reconstruction, we propose an efficient spatial image-matching algorithm based on trifocal image rectification and semi-global optimization using mutual information. Our improvements include a symmetric and hierarchical matching strategy with sub-pixel accuracy.
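The abstract names mutual information as the matching cost but does not give its form. As a rough illustration only, the sketch below computes the mutual information between two equally sized lists of discrete intensities. The function name `mutual_information` and the plain-list representation are assumptions made for this example, not the authors' implementation, which would typically work on image windows and smoothed joint histograms.

```python
import math
from collections import Counter


def mutual_information(a, b):
    """Mutual information (in bits) of two equally long intensity sequences.

    Hypothetical helper for illustration; not taken from the article.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    joint = Counter(zip(a, b))   # joint histogram of intensity pairs
    hist_a = Counter(a)          # marginal histogram of the first sequence
    hist_b = Counter(b)          # marginal histogram of the second sequence
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y) simplifies to count * n / (hist_a[x] * hist_b[y])
        mi += p_xy * math.log2(count * n / (hist_a[x] * hist_b[y]))
    return mi


# Inverted intensities are still fully predictable from each other: 1 bit.
inverted = mutual_information([0, 0, 1, 1], [1, 1, 0, 0])
# Statistically independent intensities carry no shared information: 0 bits.
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

The inverted case shows why such a measure is attractive for stereo matching: it rewards a consistent mapping between intensities rather than equal intensities, so it tolerates radiometric differences between cameras.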
