Topical and Spatio-Temporal Search over Distributed Online Databases

In this chapter, the authors propose a framework that supports multi-faceted search over distributed, Web-accessible databases. To this end, they introduce a method that analyzes a sample of a database's contents in order to infer the topical, geographic, and temporal orientation of the database as a whole. To extract the database topics, the authors apply techniques drawn from natural language processing (NLP). To identify the geographic footprint of a database, they first use geographic ontologies to extract toponyms from the content samples and then apply geo-spatial similarity metrics to estimate the geographic coverage of the identified toponyms. Finally, to determine the temporal aspects associated with the database entities, they extract temporal expressions from the entities' contextual elements and use a time ontology to estimate the temporal similarity between the identified entities.

DOI: 10.4018/978-1-61692-868-1.ch013
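As a rough illustration of the geographic step described in the abstract, the following Python sketch looks up toponyms from sampled text in a small gazetteer, merges their bounding boxes into a footprint, and compares two footprints by area overlap. It is not the authors' implementation: the `GAZETTEER` dictionary stands in for a geographic ontology, its coordinates are approximate placeholder values, and the names `BBox`, `extract_toponyms`, `footprint`, and `overlap_ratio` are all hypothetical. The overlap ratio is only one simple choice of geo-spatial similarity metric; the chapter may use a different one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class BBox:
    """Axis-aligned bounding box in degrees (hypothetical helper type)."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def area(self) -> float:
        return max(0.0, self.max_lon - self.min_lon) * max(0.0, self.max_lat - self.min_lat)

    def union(self, other: "BBox") -> "BBox":
        # Smallest box that encloses both boxes.
        return BBox(min(self.min_lon, other.min_lon), min(self.min_lat, other.min_lat),
                    max(self.max_lon, other.max_lon), max(self.max_lat, other.max_lat))

    def intersection_area(self, other: "BBox") -> float:
        w = min(self.max_lon, other.max_lon) - max(self.min_lon, other.min_lon)
        h = min(self.max_lat, other.max_lat) - max(self.min_lat, other.min_lat)
        return w * h if w > 0 and h > 0 else 0.0


# Toy gazetteer standing in for a geographic ontology.
# Assumption: coordinates are rough placeholders for illustration only.
GAZETTEER = {
    "athens": BBox(23.6, 37.9, 23.9, 38.1),
    "patras": BBox(21.6, 38.2, 21.8, 38.3),
    "greece": BBox(19.4, 34.8, 28.3, 41.8),
}


def extract_toponyms(sample_docs: List[str]) -> List[str]:
    """Return gazetteer entries found in the sampled documents (exact token match)."""
    found = []
    for doc in sample_docs:
        for token in doc.lower().split():
            word = token.strip(".,;:!?()\"'")
            if word in GAZETTEER:
                found.append(word)
    return found


def footprint(toponyms: List[str]) -> Optional[BBox]:
    """Merge the boxes of all toponyms into one enclosing box, or None if empty."""
    boxes = [GAZETTEER[t] for t in toponyms]
    if not boxes:
        return None
    result = boxes[0]
    for box in boxes[1:]:
        result = result.union(box)
    return result


def overlap_ratio(a: BBox, b: BBox) -> float:
    """Intersection area divided by union area of two boxes, in [0, 1]."""
    inter = a.intersection_area(b)
    total = a.area() + b.area() - inter
    return inter / total if total > 0 else 0.0


sample = ["Hotels in Athens, and ferries from Patras."]
db_footprint = footprint(extract_toponyms(sample))
query_region = GAZETTEER["greece"]
print(db_footprint, overlap_ratio(db_footprint, query_region))
```

Exact token matching is the simplest possible lookup and ignores ambiguous or multi-word place names, which an ontology-based extractor would need to handle.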
