The MPEG7 Visual Search Solution for image recognition based positioning using 3D models

This paper describes a location algorithm for mobile phones based on image recognition. Image recognition based (IRB) positioning in mobile applications is characterized by the availability of a single camera. Under this constraint, estimating the camera position and orientation requires prior knowledge of the 3D environment in the form of a database of images with associated spatial information. This database can be built by projecting a 3D model, acquired for instance with LiDAR (Light Detection And Ranging), onto a set of synthetic images. The proposed procedure to locate the camera has two steps: first, the database image most similar to the query image is selected; second, the position and orientation of the camera are estimated from the 3D information available for that reference image. In designing the procedure, we have reused the MPEG standard Compact Descriptors for Visual Search as much as possible. To reduce the processing load, and similarly to the retrieval procedure defined by MPEG, we have also introduced in the position estimation step a preliminary statistical geometric check for coarse rejection of wrong matches (where a match is a pair of views, one in each image, of the same point). We present position and orientation accuracy results for indoor and outdoor environments; the method reaches a precision of a few decimeters in position and tenths of a radian in orientation.
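The two-step structure described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the procedure of the paper: the function names are hypothetical, a nearest-neighbour search over toy global descriptors stands in for the MPEG retrieval step, and a plain Direct Linear Transform (DLT) on noise-free synthetic 2D-3D matches stands in for the pose estimation step. The statistical geometric check and any other outlier rejection are omitted.

```python
import numpy as np

def retrieve_reference(query_desc, db_descs):
    """Step 1 (toy stand-in): index of the database image whose
    global descriptor is closest to the query descriptor."""
    dists = np.linalg.norm(db_descs - query_desc, axis=1)
    return int(np.argmin(dists))

def estimate_projection_dlt(points_3d, points_2d):
    """Step 2 (toy stand-in): 3x4 projection matrix, up to scale,
    from at least six non-coplanar 2D-3D matches via the DLT."""
    rows = []
    for (X, Y, Z), (u, v) in zip(points_3d, points_2d):
        Xh = [X, Y, Z, 1.0]
        # Each match gives two linear equations in the 12 entries of P.
        rows.append(Xh + [0.0] * 4 + [-u * x for x in Xh])
        rows.append([0.0] * 4 + Xh + [-v * x for x in Xh])
    A = np.asarray(rows, dtype=float)
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 4)

def camera_center(P):
    """Camera position: the homogeneous point C with P @ C = 0."""
    _, _, vt = np.linalg.svd(P)
    C = vt[-1]
    return C[:3] / C[3]

# Synthetic demo: all numbers below are made up for illustration.
rng = np.random.default_rng(0)

# Step 1: pick the reference image for a slightly perturbed descriptor.
db_descs = rng.normal(size=(5, 16))
query_desc = db_descs[3] + 0.01 * rng.normal(size=16)
best = retrieve_reference(query_desc, db_descs)

# Step 2: recover the camera position from 3D points tied to that image.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])          # assumed pinhole intrinsics
R = np.eye(3)                            # assumed orientation
C_true = np.array([0.5, -0.2, -4.0])     # assumed camera position
P_true = K @ np.hstack([R, (-R @ C_true).reshape(3, 1)])

pts_3d = rng.uniform(-1.0, 1.0, size=(12, 3))
proj = (P_true @ np.hstack([pts_3d, np.ones((12, 1))]).T).T
pts_2d = proj[:, :2] / proj[:, 2:3]

P_est = estimate_projection_dlt(pts_3d, pts_2d)
print("selected reference image:", best)
print("estimated camera position:", camera_center(P_est))
```

With real data the matches contain outliers and noise, so a robust estimator would normally wrap the pose step, which is where a cheap preliminary geometric check can save work.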
