Hybrid Indexes to Expedite Spatial-Visual Search

Due to the growth of geo-tagged images, recent web and mobile applications provide search capabilities for images that are similar to a given query image and simultaneously within a given geographical area. In this paper, we focus on designing index structures to expedite these spatial-visual searches. We start with baseline indexes that are straightforward extensions of the currently popular spatial (R*-tree) and visual (LSH) index structures. Subsequently, we propose hybrid index structures that evaluate both spatial and visual features in tandem. The unique challenge of this type of query is that there are inaccuracies in both spatial and visual features. Therefore, different traversals of the index structures may produce different images as output, some of which are more relevant to the query than others. We compare our hybrid structures with a set of baseline indexes in terms of both performance and result accuracy using three real-world datasets from Flickr, Google Street View, and GeoUGV.
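
To make the query type concrete, the following is a minimal brute-force sketch of a spatial-visual range search: keep the images that fall inside a geographic bounding box and whose visual feature vector lies within a distance threshold of the query's. It is not one of the index structures studied in the paper; it only illustrates the result that the R*-tree, LSH, and hybrid indexes are meant to compute faster, and approximately in the case of LSH. The `Image` record, the Euclidean feature distance, the `spatial_visual_search` function, and all parameter names are assumptions made for illustration.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Image:
    # Hypothetical record: an identifier, a geo-tag, and a visual feature vector.
    image_id: str
    lon: float
    lat: float
    feature: tuple


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def spatial_visual_search(images, query_feature, box, max_distance):
    """Linear scan: spatial filter first, then visual filter.

    box is (min_lon, min_lat, max_lon, max_lat); max_distance is the
    visual similarity threshold. Returns image ids ordered from most
    to least visually similar.
    """
    min_lon, min_lat, max_lon, max_lat = box
    hits = []
    for img in images:
        # Spatial predicate: the geo-tag must lie inside the query rectangle.
        if not (min_lon <= img.lon <= max_lon and min_lat <= img.lat <= max_lat):
            continue
        # Visual predicate: the feature must be close enough to the query's.
        d = euclidean(img.feature, query_feature)
        if d <= max_distance:
            hits.append((d, img.image_id))
    hits.sort()
    return [image_id for _, image_id in hits]


images = [
    Image("a", 0.5, 0.5, (1.0, 0.0)),
    Image("b", 0.6, 0.4, (0.0, 1.0)),  # inside the box, visually dissimilar
    Image("c", 5.0, 5.0, (1.0, 0.0)),  # visually identical, outside the box
    Image("d", 0.1, 0.1, (0.9, 0.0)),
]
result = spatial_visual_search(images, (1.0, 0.0), (0.0, 0.0, 1.0, 1.0), 0.5)
```

A scan like this touches every image, which is the cost the index structures aim to avoid by pruning on the spatial side, the visual side, or both at once.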
