Efficient Content-based Image Retrieval for Position Estimation on GPU

We propose an efficient content-based image retrieval (CBIR) method for estimating the position of mobile devices. The idea is to use first-person-view videos annotated with geographical position information as the database. When a user sends an image of their current first-person view, the system estimates the position using CBIR. Because features extracted from images are generally high-dimensional vectors, and thousands of them are extracted from even a single image, the processing cost is high. To tackle this problem, we previously proposed a method in which features are compressed using locality-sensitive hashing (LSH) and a GPU is used to accelerate CBIR. However, that method suffered from performance degradation due to write conflicts among threads. This paper presents an improved method that avoids write conflicts and modifies the LSH algorithm to improve accuracy. We demonstrate the efficiency and accuracy of the proposed scheme through experiments on a video dataset.
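To make the pipeline concrete, the following is a minimal CPU-only sketch of LSH-compressed retrieval with per-frame voting. It is an illustration under stated assumptions, not the method evaluated in the paper: the abstract does not specify the LSH variant, so random-hyperplane (sign-of-projection) hashing is assumed here, and the names `make_hyperplanes`, `lsh_code`, `hamming`, and `estimate_position`, as well as the `max_distance` threshold, are hypothetical.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplanes in dim dimensions (assumed LSH family)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_code(vector, hyperplanes):
    """Compress a feature vector into an integer bit code, one sign bit per hyperplane."""
    code = 0
    for plane in hyperplanes:
        dot = sum(p * v for p, v in zip(plane, vector))
        code = (code << 1) | (1 if dot >= 0.0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two codes."""
    return bin(a ^ b).count("1")


def estimate_position(query_codes, database, max_distance):
    """Return (position, votes) of the database frame that matches the most query features.

    database is a list of (position, frame_codes) pairs, one per video frame.
    A query feature votes for a frame if any code of that frame lies within
    max_distance bits of the query code.
    """
    best_position, best_votes = None, -1
    for position, frame_codes in database:
        # Votes are accumulated in a local counter per frame; this sketch is
        # sequential and does not model GPU threads or write conflicts.
        votes = 0
        for q in query_codes:
            if any(hamming(q, c) <= max_distance for c in frame_codes):
                votes += 1
        if votes > best_votes:
            best_position, best_votes = position, votes
    return best_position, best_votes
```

In this sketch each database frame would be hashed once offline with `lsh_code`, and a query image would be hashed with the same hyperplanes before calling `estimate_position`. Comparing short bit codes by Hamming distance stands in for comparing the original high-dimensional descriptors, which is where the cost saving would come from.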
