Query evaluation with asymmetric web services

KnowItAll [4] and others have successfully constructed large-scale semantic knowledge bases. Factual knowledge is typically represented in RDF, the W3C standard for Semantic Web content. RDF data can be seen as a graph whose nodes are entities and whose edges are relationships between them. These knowledge bases can be queried using the W3C-endorsed SPARQL language [33]. Yet a knowledge base about entities can never be fully complete or always up to date. With the ANGIE system [23], we have shown that Web services can step in to fill this gap. Web services lend themselves to the extension of knowledge bases because they deliver structured data, which eliminates the need for noisy information extraction techniques. Furthermore, some Web services offer a wide repertoire of high-quality, well-maintained, up-to-date data. This makes Web services an attractive means of complementing knowledge bases. The ANGIE system incorporates Web services as follows: when a user asks a query, ANGIE tries to find the answer in the local knowledge base and resorts to Web services whenever the local knowledge base is not sufficient. ANGIE composes Web services and data from the local knowledge base on the fly, so that the user does not notice that some of the data was not present in the knowledge base before. For example, assume that the user asks for all songs by Canadian singers:
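The fallback behaviour described above can be illustrated with a minimal sketch. It is not ANGIE's implementation: the `LocalKB` class, the `fake_song_service` function, the predicate names `citizenOf` and `sang`, and the toy data are all assumptions made for illustration, and "the local knowledge base is not sufficient" is simplified here to "the local lookup returns no songs".

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class LocalKB:
    """Toy in-memory triple store standing in for the local RDF knowledge base."""

    def __init__(self, triples: Iterable[Triple]) -> None:
        self.triples: Set[Triple] = set(triples)

    def subjects(self, predicate: str, obj: str) -> Set[str]:
        # All subjects s such that (s, predicate, obj) is in the store.
        return {s for (s, p, o) in self.triples if p == predicate and o == obj}

    def objects(self, subject: str, predicate: str) -> Set[str]:
        # All objects o such that (subject, predicate, o) is in the store.
        return {o for (s, p, o) in self.triples if s == subject and p == predicate}

    def add(self, triples: Iterable[Triple]) -> None:
        self.triples.update(triples)


def fake_song_service(singer: str) -> List[Triple]:
    """Hypothetical stand-in for a Web service call; returns canned triples."""
    canned = {"Celine_Dion": ["My_Heart_Will_Go_On"]}
    return [(singer, "sang", song) for song in canned.get(singer, [])]


def songs_by_canadian_singers(
    kb: LocalKB, service: Callable[[str], List[Triple]]
) -> Dict[str, Set[str]]:
    """Answer from the local store first; call the service only on a local miss."""
    answers: Dict[str, Set[str]] = {}
    for singer in sorted(kb.subjects("citizenOf", "Canada")):
        songs = kb.objects(singer, "sang")
        if not songs:
            # Local data is missing: fetch from the service and merge it in,
            # so the caller sees one uniform result.
            kb.add(service(singer))
            songs = kb.objects(singer, "sang")
        answers[singer] = songs
    return answers


# Toy data: one singer has songs stored locally, the other does not.
kb = LocalKB([
    ("Leonard_Cohen", "citizenOf", "Canada"),
    ("Celine_Dion", "citizenOf", "Canada"),
    ("Leonard_Cohen", "sang", "Suzanne"),
])
result = songs_by_canadian_singers(kb, fake_song_service)
```

In this sketch the service is called only for the singer whose songs are absent locally, and the fetched triples are merged into the store, so the result looks the same to the caller regardless of where each fact came from.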

[1] Laura M. Haas, et al. Clio: Schema Mapping Creation and Data Exchange, 2009, Conceptual Modeling: Foundations and Applications.

[2] Anand Rajaraman, et al. Answering queries using templates with binding patterns (extended abstract), 1995, PODS.

[3] Alin Deutsch, et al. Rewriting queries using views with access patterns under integrity constraints, 2005, Theor. Comput. Sci.

[4] Wolfgang Gatterbauer, et al. Towards domain-independent information extraction from web tables, 2007, WWW '07.

[5] Raghav Kaushik, et al. A grammar-based entity representation framework for data cleaning, 2009, SIGMOD Conference.

[6] Vagelis Hristidis, et al. Syntactic Rule Based Approach to Web Service Composition, 2006, 22nd International Conference on Data Engineering (ICDE'06).

[7] Nicholas Kushmerick, et al. Wrapper Induction for Information Extraction, 1997, IJCAI.

[8] Hamish Cunningham, et al. Software architecture for language engineering, 2000.

[9] Alon Y. Halevy, et al. Recursive Query Plans for Data Integration, 2000, J. Log. Program.

[10] Oren Etzioni, et al. Open Information Extraction from the Web, 2007, CACM.

[11] Hyoil Han, et al. A survey on ontology mapping, 2006, SIGMOD Rec.

[12] Wei-Ying Ma, et al. 2D Conditional Random Fields for Web information extraction, 2005, ICML.

[13] Andrew McCallum, et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, 2001, ICML.

[14] Gerhard Weikum, et al. NAGA: Searching and Ranking Knowledge, 2008, 2008 IEEE 24th International Conference on Data Engineering.

[15] Wei-Ying Ma, et al. Simultaneous record detection and attribute labeling in web data extraction, 2006, KDD '06.

[16] Daniel S. Weld, et al. Automatically refining the Wikipedia infobox ontology, 2008, WWW.

[17] Gerhard Weikum, et al. RDF-3X: a RISC-style engine for RDF, 2008, Proc. VLDB Endow.

[18] Michael R. Genesereth, et al. Answering recursive queries using views, 1997, PODS '97.

[19] Luis Gravano, et al. Snowball: a prototype system for extracting relations from large text collections, 2001, SIGMOD '01.

[20] Alin Deutsch, et al. Specification and verification of data-driven web services, 2004, PODS.

[21] Marios D. Dikaiakos, et al. MashQL: a query-by-diagram topping SPARQL, 2008, ONISW '08.

[22] Volker Markl, et al. Damia: data mashups for intranet applications, 2008, SIGMOD Conference.

[23] Daisy Zhe Wang, et al. Uncovering the Relational Web, 2008, WebDB.

[24] Wei-Kuan Shih, et al. Semantic search on Internet tabular information extraction for answering queries, 2000, CIKM '00.

[25] Sunita Sarawagi, et al. Information Extraction, 2008.

[26] Gerhard Weikum, et al. YAGO: A Core of Semantic Knowledge, 2007, WWW.

[27] Gerhard Weikum, et al. Active knowledge: dynamically enriching RDF knowledge bases by web services, 2010, SIGMOD Conference.

[28] Edward Fredkin, et al. Trie memory, 1960, Commun. ACM.

[29] Daniel S. Weld, et al. Planning to Gather Information, 1996, AAAI/IAAI, Vol. 1.

[30] Dan Brickley, et al. RDF Vocabulary Description Language 1.0: RDF Schema, 2004.

[31] Jens Lehmann, et al. DBpedia: A Nucleus for a Web of Open Data, 2007, ISWC/ASWC.

[32] Gerhard Weikum, et al. SOFIE: a self-organizing framework for information extraction, 2009, WWW '09.

[33] Subbarao Kambhampati, et al. Optimizing Recursive Information Gathering Plans in EMERAC, 2004, Journal of Intelligent Information Systems.

[34] Craig A. Knoblock, et al. Composing, optimizing, and executing plans for bioinformatics web services, 2005, The VLDB Journal.

[35] Jayant Madhavan, et al. Harvesting relational tables from lists on the web, 2009, The VLDB Journal.

[36] Daniel S. Weld, et al. Planning to gather information, 1996, AAAI 1996.