Anytime Query Answering in RDF through Evolutionary Algorithms

We present a technique for answering queries over RDF data with an evolutionary search algorithm, using fingerprinting and Bloom filters for rapid approximate evaluation of candidate solutions. Our evolutionary approach has several advantages over traditional database-style query answering. First, result quality increases monotonically and converges over successive generations, offering "anytime" behaviour with an arbitrary trade-off between computation time and the quality of the query results; in addition, the level of approximation can be tuned by varying the size of the Bloom filters. Second, Bloom filter compression lets us fit large graphs in main memory, reducing the need for disk I/O during query evaluation. Third, since the individuals evolve independently, parallel execution is straightforward. We present a prototype that evaluates basic SPARQL queries over arbitrary RDF graphs and show initial results over large datasets.
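To make the idea of approximate evaluation concrete, the following is a minimal sketch, not the authors' prototype. It assumes that each triple is stored as a single key in one Bloom filter and that a candidate solution is a dictionary of variable bindings scored by the fraction of triple patterns it satisfies. The names `BloomFilter`, `triple_key`, and `fitness`, the filter parameters, and the example data are all hypothetical and chosen only for illustration.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, tunable false-positive rate."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def triple_key(s, p, o):
    # Assumption: a triple is encoded as one string with a separator character.
    return "\x1f".join((s, p, o))


def fitness(patterns, binding, graph_filter):
    """Fraction of triple patterns that the binding appears to satisfy.

    Terms starting with '?' are variables; unbound variables stay as-is.
    Because the filter can return false positives, this score may
    overestimate the true fitness, but it never underestimates it.
    """
    satisfied = 0
    for pattern in patterns:
        grounded = tuple(binding.get(term, term) for term in pattern)
        if triple_key(*grounded) in graph_filter:
            satisfied += 1
    return satisfied / len(patterns)


# Hypothetical example graph and basic graph pattern.
graph_filter = BloomFilter()
for triple in [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:worksAt", "ex:acme"),
    ("ex:carol", "ex:knows", "ex:alice"),
]:
    graph_filter.add(triple_key(*triple))

patterns = [("?x", "ex:knows", "?y"), ("?y", "ex:worksAt", "?z")]
good = {"?x": "ex:alice", "?y": "ex:bob", "?z": "ex:acme"}
print(fitness(patterns, good, graph_filter))
```

In this sketch, an evolutionary loop would repeatedly mutate and recombine bindings and keep those with higher fitness. A smaller `num_bits` saves memory at the cost of more false positives, which is the tunable approximation the abstract refers to.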
