Query processing in a DHT-based universal storage: the world as a peer-to-peer database

With the growth of the Internet, novel applications scale significantly in both data volume and number of users. In parallel, the size and geographic extent of the underlying network infrastructures increase as well. A wide range of such applications provides access to huge data collections for a likewise huge number of participants. If these data are of public interest and open to the public, the corresponding applications are classified as Public Data Management. Often, the data are also open to extension and modification. Thus, data come into existence through collaboration and are maintained by the community. Structured data and corresponding access methods are often mandatory to make these systems work, and a high degree of heterogeneity is very likely. Features known from traditional database and data integration systems are often wanted, expected, or even required to handle such data collections meaningfully. Centralized approaches are limited in scale and cannot keep up with the ever-growing requirements. Distributing data and load is an intuitive and necessary consequence. Small-scale distribution, such as in clusters and locally encapsulated systems, reaches its limits as well. The “kill it with iron” strategy is not applicable, because often no institution or single user is willing and able to bear the immense costs of the resulting systems. In turn, Public Data Management applications also come with lowered requirements, such as relaxed guarantees on consistency and availability. This turns the idea of totally distributing all data and load into a promising and feasible approach. Although existing systems already provide some of these features, they are in general heavily simplified and strongly limited. A wide range of requirements of database-like distributed systems is still neither supported nor enabled. This work focuses on the requirements and unsolved challenges that occur specifically in the context of query processing.
Besides the pure efficiency of query processing, this also includes issues such as query expressiveness and the practicability of certain concepts. We first identify these and other challenges and motivate the contributions of this work. Afterwards, we propose corresponding techniques and approaches for efficient query processing in widely distributed systems. We suggest a layered architecture that exploits the advantages of decentralized Peer-to-Peer approaches. Based on this architecture, a vertical triple model, and a corresponding query algebra, we propose a physical query algebra. This algebra comprises operators that make extensive use of parallelism and other paradigms of distributed query processing. Further, we present techniques for processing complex query plans that represent user queries. The query framework is extended by an appropriate cost model to enable meaningful query planning. We also discuss guarantees and predictability. In this context, we introduce a lightweight and flexible approach for estimating the completeness of partial query answers. An extensive evaluation based on a reference implementation shows the applicability and correctness of the proposed concepts. This highlights their valuable contribution to the development of large-scale distributed data systems.

Marcel Karnstedt: Query Processing in a DHT-Based Universal Storage
