Reconstruction and Accurate Alignment of Feature Maps for Augmented Reality

This paper focuses on the preparatory process of obtaining accurate feature maps for a camera-based tracking system. With this system, ready-to-use Augmented Reality applications can be created with a very simple setup workflow that in practice involves only three steps: filming the object or environment from various viewpoints; defining a transformation between the reconstructed map and the target coordinate frame based on a small number of 3D-3D correspondences; and, finally, initiating a feature learning and Bundle Adjustment step. Technically, the solution comprises several sub-algorithms. Given the image sequence provided by the user, a feature map is initially reconstructed and incrementally extended using a Simultaneous Localization and Mapping (SLAM) approach. For the automatic initialization of the SLAM module, a method for detecting the amount of translation is proposed. Since the initially reconstructed map is defined in an arbitrary coordinate system, we present a method for optimally aligning the feature map to the target coordinate frame of the augmentation models based on 3D-3D correspondences defined by the user. As an initial estimate, we solve for a rigid transformation with scaling, a problem known as Absolute Orientation. To refine the alignment, we present a modification of the well-known Bundle Adjustment in which these 3D-3D correspondences are included as constraints. We show that, compared to ordinary Bundle Adjustment, this leads to significantly more accurate reconstructions, since map deformations due to systematic errors such as small camera calibration errors or outliers are well compensated. This in turn results in a better alignment of the augmentations at run time of the application, even in large-scale environments.
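
The initial alignment step described above (a rigid transformation with scaling estimated from 3D-3D correspondences) can be illustrated with a minimal sketch. It assumes the closed-form SVD solution commonly attributed to Umeyama; the abstract does not state which Absolute Orientation solver the authors actually use. The function name `absolute_orientation` and the argument names `map_pts` and `target_pts` are hypothetical and chosen only for illustration.

```python
import numpy as np


def absolute_orientation(map_pts, target_pts):
    """Estimate scale s, rotation R and translation t that minimise
    sum_i || target_i - (s * R @ map_i + t) ||^2.

    Illustrative sketch only (SVD-based closed form), not the paper's code.
    """
    src = np.asarray(map_pts, dtype=float)
    dst = np.asarray(target_pts, dtype=float)
    if src.shape != dst.shape or src.ndim != 2 or src.shape[1] != 3:
        raise ValueError("expected two (N, 3) arrays of corresponding points")
    n = src.shape[0]
    if n < 3:
        raise ValueError("at least three correspondences are required")

    # Centre both point sets on their centroids.
    mu_src = src.mean(axis=0)
    mu_dst = dst.mean(axis=0)
    src_c = src - mu_src
    dst_c = dst - mu_dst

    # Variance of the source points; zero means all points coincide.
    var_src = (src_c ** 2).sum() / n
    if var_src == 0.0:
        raise ValueError("source points are degenerate (all identical)")

    # 3x3 cross-covariance between target and source points.
    cov = dst_c.T @ src_c / n
    U, D, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0

    R = U @ S @ Vt
    s = float(np.trace(np.diag(D) @ S) / var_src)
    t = mu_dst - s * (R @ mu_src)
    return s, R, t
```

Given a handful of user-defined correspondences, a map point `p` would then be mapped into the target frame as `s * R @ p + t`. In a pipeline like the one outlined in the abstract, such a closed-form estimate would only serve as the starting point for the constrained Bundle Adjustment refinement.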
