Handheld Augmented Reality involving gravity measurements

This article is a revised version of an earlier work on gravity-aware handheld Augmented Reality (AR) (Kurz and Benhimane, 2011 [24]), which investigates how different stages of handheld AR applications can benefit from knowing the direction of gravity as measured with inertial sensors. It presents approaches that incorporate the gravity vector to improve the description and matching of feature points, the detection and tracking of planar templates, and the visual quality of rendered virtual 3D objects. In handheld AR, both the camera and the display are held in the user's hand and can therefore be moved freely. The camera pose is generally determined with respect to piecewise planar objects that have a static and known orientation relative to gravity. For (close to) vertical surfaces, we show how Gravity-Aligned Feature Descriptors (GAFDs) improve the initialization of tracking algorithms that rely on feature point descriptors, in terms of both quality and performance. For (close to) horizontal surfaces, we propose using the gravity vector to rectify the camera image and to detect and describe features in the rectified image. The resulting Gravity-Rectified Feature Descriptors (GREFDs) provide improved precision-recall characteristics and enable faster initialization, particularly under steep viewing angles. Gravity-rectified camera images also allow real-time 6-DoF pose estimation with an edge-based object detection algorithm that handles only 4-DoF similarity transforms. Finally, the rendering of virtual 3D objects can be made more realistic and plausible by taking the direction of the gravitational force into account in addition to the relative pose between the handheld device and a real object. Compared to the original paper, this work provides a more elaborate evaluation of the presented algorithms.
We propose a method that enables the evaluation of inertial sensor-aided visual tracking methods without real inertial sensor data. By synthesizing gravity measurements from ground-truth camera poses, we benchmark our algorithms on a large existing dataset. Based on this approach, we also develop and evaluate a gravity-adaptive method that performs image rectification only when it is beneficial.
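
The core idea behind a gravity-aligned descriptor can be sketched as follows: instead of assigning a keypoint orientation from the local intensity gradient, the orientation is taken from the gravity vector projected into the image at the keypoint. The snippet below is a minimal NumPy illustration of that projection step (the function name and the finite-difference projection scheme are ours, not taken from the paper):

```python
import numpy as np

def projected_gravity_angle(p, K, g_cam, eps=1e-3):
    """Angle (radians) of the gravity vector projected into the image at
    pixel p. A GAFD-style descriptor can use this as the keypoint
    orientation instead of an intensity-based dominant orientation.

    p:     pixel coordinates (u, v)
    K:     3x3 camera intrinsic matrix
    g_cam: gravity direction expressed in camera coordinates
    """
    # Back-project the pixel to a 3D point at unit depth.
    P = np.linalg.inv(K) @ np.array([p[0], p[1], 1.0])
    # Take a small step along gravity and reproject to the image.
    Q = P + eps * np.asarray(g_cam, dtype=float)
    h = K @ Q
    q = h[:2] / h[2]
    # Direction of the projected gravity at the keypoint.
    d = q - np.array(p, dtype=float)
    return np.arctan2(d[1], d[0])
```

With the camera held level, gravity in camera coordinates points along the image's downward y-axis, so the projected orientation at the principal point is straight down (pi/2 in image coordinates, y pointing down).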
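
Synthesizing a gravity measurement from a ground-truth camera pose amounts to rotating the world gravity direction into camera coordinates, optionally perturbed to emulate accelerometer error. A minimal sketch, assuming a world-to-camera rotation convention and a Z-up world frame (function name and noise model are ours):

```python
import numpy as np

def synthesize_gravity(R_wc, noise_deg=0.0, rng=None):
    """Gravity direction in camera coordinates derived from a ground-truth
    world-to-camera rotation R_wc. Optional angular noise (in degrees)
    emulates inertial sensor error."""
    g_world = np.array([0.0, 0.0, -1.0])  # world Z assumed to point up
    g_cam = R_wc @ g_world
    if noise_deg > 0.0:
        rng = np.random.default_rng() if rng is None else rng
        # Perturb by a rotation of the given magnitude about a random axis
        # (Rodrigues' rotation formula).
        axis = rng.normal(size=3)
        axis /= np.linalg.norm(axis)
        ang = np.deg2rad(noise_deg)
        g_cam = (g_cam * np.cos(ang)
                 + np.cross(axis, g_cam) * np.sin(ang)
                 + axis * np.dot(axis, g_cam) * (1.0 - np.cos(ang)))
    return g_cam / np.linalg.norm(g_cam)
```

Feeding such synthesized vectors to a gravity-aware tracker lets any dataset with ground-truth poses serve as a benchmark, without recorded inertial sensor data.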

[1] M. E. Muller, et al. A Note on the Generation of Random Normal Deviates, 1958.

[2] Dieter Schmalstieg, et al. First steps towards handheld augmented reality. Proceedings of the Seventh IEEE International Symposium on Wearable Computers, 2003.

[3] Takeo Kanade, et al. An Iterative Image Registration Technique with an Application to Stereo Vision. IJCAI, 1981.

[4] Jiri Matas, et al. Matching with PROSAC - progressive sample consensus. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2005.

[5] Kiyohide Satoh, et al. A Fast Initialization Method for Edge-based Registration Using an Inclination Constraint, 2008.

[6] Nassir Navab, et al. A dataset and evaluation methodology for template-based tracking algorithms. 8th IEEE International Symposium on Mixed and Augmented Reality (ISMAR), 2009.

[7] Richard Szeliski, et al. Construction of Panoramic Image Mosaics with Global and Local Alignment, 2001.

[8] Zhengyou Zhang. A Flexible New Technique for Camera Calibration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000.

[9] Vincent Lepetit, et al. Fast Keypoint Recognition in Ten Lines of Code. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2007.

[10] Steven K. Feiner, et al. Rolling and shooting: two augmented reality games. CHI Extended Abstracts, 2010.

[11] Marc Pollefeys, et al. Leveraging 3D City Models for Rotation Invariant Place-of-Interest Recognition. International Journal of Computer Vision, 2011.

[12] Jorge Dias, et al. Vision and Inertial Sensor Cooperation Using Gravity as a Vertical Reference. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003.

[13] David G. Lowe. Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision, 2004.

[14] Cordelia Schmid, et al. A Comparison of Affine Region Detectors. International Journal of Computer Vision, 2005.

[15] Dieter Schmalstieg, et al. Pose tracking from natural features on mobile phones. 7th IEEE/ACM International Symposium on Mixed and Augmented Reality (ISMAR), 2008.

[16] Cordelia Schmid, et al. A Performance Evaluation of Local Descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005.

[17] Hideyuki Tamura, et al. A hybrid registration method for outdoor augmented reality. Proceedings of the IEEE and ACM International Symposium on Augmented Reality, 2001.

[18] Simon Baker, et al. Lucas-Kanade 20 Years On: A Unifying Framework. International Journal of Computer Vision, 2004.

[19] Tom Drummond, et al. Going out: robust model-based tracking for outdoor augmented reality. IEEE/ACM International Symposium on Mixed and Augmented Reality (ISMAR), 2006.

[20] Richard Szeliski, et al. Systems and Experiment Paper: Construction of Panoramic Image Mosaics with Global and Local Alignment. International Journal of Computer Vision, 2000.

[21] Tom Drummond, et al. Sensor fusion and occlusion refinement for tablet-based AR. Third IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR), 2004.

[22] Vincent Lepetit, et al. Video-Based In Situ Tagging on Mobile Phones. IEEE Transactions on Circuits and Systems for Video Technology, 2011.

[23] Selim Benhimane, et al. Benchmarking Inertial Sensor-Aided Localization and Tracking Methods, 2011.

[24] Selim Benhimane, et al. Gravity-aware handheld Augmented Reality. 10th IEEE International Symposium on Mixed and Augmented Reality (ISMAR), 2011.

[25] C. Steger. Occlusion, Clutter, and Illumination Invariant Object Recognition, 2002.

[26] Simon Baker, et al. Equivalence and efficiency of image alignment algorithms. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2001.

[27] Selim Benhimane, et al. Real-time image-based tracking of planes using efficient second-order minimization. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2004.

[28] Jihad El-Sana, et al. Shape recognition and pose estimation for mobile augmented reality. ISMAR, 2009.

[29] Matthew Turk, et al. Multisensory embedded pose estimation. IEEE Workshop on Applications of Computer Vision (WACV), 2011.

[30] Oliver Bimber, et al. Video see-through and optical tracking with consumer cell phones. SIGGRAPH, 2004.

[31] Didier Stricker, et al. Advanced tracking through efficient image processing and visual-inertial sensor fusion. IEEE Virtual Reality Conference, 2008.

[32] K. Satoh, et al. A hybrid and linear registration method utilizing inclination constraint. Fourth IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR), 2005.

[33] Florian Baumann, et al. Ego-motion compensated face detection on a mobile device. CVPR Workshops, 2011.

[34] Jan-Michael Frahm, et al. 3D model matching with Viewpoint-Invariant Patches (VIP). IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2008.

[35] Tobias Höllerer, et al. Evaluation of Interest Point Detectors and Feature Descriptors for Visual Tracking. International Journal of Computer Vision, 2011.

[36] Selim Benhimane, et al. Inertial sensor-aligned visual feature descriptors. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011.

[37] Vincent Lepetit, et al. Learning Real-Time Perspective Patch Rectification. International Journal of Computer Vision, 2011.