Keyword Search on Spatial Databases

Many applications require finding the objects closest to a specified location that contain a set of keywords. For example, online yellow pages allow users to specify an address and a set of keywords. In return, the user obtains a list of businesses whose descriptions contain these keywords, ordered by their distance from the specified address. The problems of nearest neighbor search on spatial data and keyword search on text data have each been studied extensively, but separately. However, to the best of our knowledge, there is no efficient method to answer spatial keyword queries, that is, queries that specify both a location and a set of keywords. In this work, we present an efficient method to answer top-k spatial keyword queries. To do so, we introduce an indexing structure called the IR2-Tree (Information Retrieval R-Tree), which combines an R-Tree with superimposed text signatures. We present algorithms that construct and maintain an IR2-Tree and use it to answer top-k spatial keyword queries. Our algorithms are experimentally compared to current methods and are shown to have superior performance and excellent scalability.
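
To make the idea of a top-k spatial keyword query with superimposed text signatures more concrete, here is a minimal Python sketch. It is not the IR2-Tree itself: it scans a flat list instead of traversing an R-Tree, and every name and parameter in it (SIG_BITS, BITS_PER_WORD, build_index, top_k, the sample businesses) is a hypothetical choice made for illustration. It only shows how a signature can cheaply rule out objects that cannot contain all query keywords before the remaining candidates are verified and ranked by distance.

```python
import hashlib
import heapq
import math

SIG_BITS = 64       # signature length in bits (assumed value)
BITS_PER_WORD = 3   # bits set per keyword (assumed value)


def word_signature(word):
    """Hash one keyword to a bit mask with up to BITS_PER_WORD bits set."""
    sig = 0
    for i in range(BITS_PER_WORD):
        digest = hashlib.sha256(f"{i}:{word.lower()}".encode()).digest()
        sig |= 1 << (int.from_bytes(digest[:4], "big") % SIG_BITS)
    return sig


def text_signature(words):
    """Superimpose (bitwise OR) the signatures of all words."""
    sig = 0
    for word in words:
        sig |= word_signature(word)
    return sig


def build_index(objects):
    """Precompute a signature per object: (name, point, signature, word set)."""
    index = []
    for name, point, text in objects:
        words = set(text.lower().split())
        index.append((name, point, text_signature(words), words))
    return index


def top_k(index, location, keywords, k):
    """Return the k nearest objects whose text contains every keyword."""
    wanted = {w.lower() for w in keywords}
    query_sig = text_signature(wanted)
    hits = []
    for name, point, sig, words in index:
        # Signature test: a missing query bit proves a keyword is absent.
        if sig & query_sig != query_sig:
            continue
        # Signatures can give false positives, so verify against the text.
        if not wanted <= words:
            continue
        hits.append((math.dist(location, point), name))
    return heapq.nsmallest(k, hits)


businesses = [
    ("Cafe A", (0.0, 0.0), "coffee wifi pastry"),
    ("Cafe B", (1.0, 1.0), "coffee tea"),
    ("Shop C", (0.5, 0.0), "wifi books coffee"),
    ("Bar D", (0.1, 0.1), "beer wine"),
]
index = build_index(businesses)
result = top_k(index, (0.0, 0.0), ["coffee", "wifi"], 2)
print(result)
```

In this sketch the signature check is only a filter; the final verification against the stored words removes any false positives. An index in the spirit of the abstract would instead attach signatures to the nodes of an R-Tree so that whole subtrees can be skipped during a nearest neighbor traversal.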
