Query Processing in a Self-Organized Storage System

Current solutions for storing and retrieving data may not provide the scalability and robustness needed to support ever-larger web applications. Using swarm intelligence to route operations in a distributed storage system can overcome these limitations. However, few options exist for efficiently evaluating complex queries in this kind of system, and the problem has not yet been researched. As part of my PhD work, I present an approach for complex query processing based on a schema-less data model and the building blocks for complex queries on that model. In this approach, complex queries move through the distributed storage system while being continuously re-optimized using strictly local information. I describe the approach together with an evaluation methodology and a test protocol. My goal is to contribute complex query processing for a fully distributed storage system with unreliable basic read operations and probabilistic directional content routing.
