W-Ray: A Strategy to Publish Deep Web Geographic Data

This paper introduces an approach to the problem of accessing conventional and geographic data from the Deep Web. The approach describes the relevant data through well-structured sentences and publishes those sentences as Web pages that follow W3C and Google recommendations. For conventional data, the sentences are generated with the help of database views. For vector data, the topological relationships between the represented objects are generated first, and sentences are then synthesized to describe the objects and their topological relationships. Finally, for raster data, the geographic objects that overlap the bounding box of the data are first identified with the help of a gazetteer, and sentences describing those objects are then synthesized. The Web pages thus generated are easily indexed by traditional search engines, and they also facilitate the task of more sophisticated engines that support semantic search based on natural language features.
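The raster case described above can be illustrated with a minimal sketch. It is not the paper's implementation: the `BBox` class, the in-memory `GAZETTEER` list, the approximate coordinates, the sentence template, and the `to_html` helper are all assumptions made for illustration. The sketch finds the gazetteer entries whose bounding boxes overlap that of a raster image, synthesizes one sentence per entry, and wraps the sentences in a plain HTML page.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class BBox:
    """Axis-aligned bounding box in decimal degrees (hypothetical helper)."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def overlaps(self, other: "BBox") -> bool:
        # Two boxes overlap when their ranges intersect on both axes.
        return (self.min_lon <= other.max_lon and other.min_lon <= self.max_lon
                and self.min_lat <= other.max_lat and other.min_lat <= self.max_lat)


# Toy gazetteer: (name, feature type, bounding box).
# Coordinates are rough and for illustration only.
GAZETTEER = [
    ("Guanabara Bay", "bay", BBox(-43.30, -22.95, -43.00, -22.65)),
    ("Sugarloaf Mountain", "mountain", BBox(-43.16, -22.96, -43.15, -22.94)),
    ("Lake Titicaca", "lake", BBox(-70.10, -16.60, -68.50, -15.20)),
]


def describe_raster(title, bbox, gazetteer=GAZETTEER):
    """Return one sentence per gazetteer entry overlapping the raster's box."""
    hits = [(name, kind) for name, kind, box in gazetteer if bbox.overlaps(box)]
    return [f"The image {title} overlaps the {kind} {name}." for name, kind in hits]


def to_html(title, sentences):
    """Wrap the sentences in a minimal HTML page that a crawler can index."""
    body = "\n".join(f"<p>{escape(s)}</p>" for s in sentences)
    return (f"<html><head><title>{escape(title)}</title></head>\n"
            f"<body>\n{body}\n</body></html>")


if __name__ == "__main__":
    scene = BBox(-43.25, -23.00, -43.10, -22.80)
    print(to_html("Scene 42", describe_raster("Scene 42", scene)))
```

A real gazetteer would be an external service or database, and the vector case would additionally need a geometry library to compute topological relationships such as "touches" or "contains" before the sentences are generated.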
