Supporting quantified queries in distributed databases

We show that some relational queries, which we call quantified queries, are not well supported in distributed environments. We give a formal definition of quantified queries, propose a language in which to express them, and provide a procedure for computing answers in this language over distributed databases. The proposed language is made up of high-level, declarative operators (called generalised quantifiers), so it can be used in combination with several distributed frameworks. Our approach is designed to be as general as possible: it assumes only that relations are horizontally partitioned, and makes no use of data placement or replication. We present algorithms for the new language and an implementation of them, propose some basic optimisations, and give experimental results showing that the approach is efficient and scales well.
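To illustrate why a quantified query is awkward over horizontally partitioned data, the following is a minimal sketch of a universally quantified query ("students who take all required courses") evaluated over two fragments. It is not the procedure proposed in the paper: the relation layout, the function names `local_matches` and `students_taking_all`, and the sample data are all assumptions made for illustration, and the required set is assumed to be non-empty.

```python
from collections import defaultdict

def local_matches(fragment, required):
    """Per-site step (hypothetical): for each student, collect the
    required courses that appear in this fragment only."""
    seen = defaultdict(set)
    for student, course in fragment:
        if course in required:
            seen[student].add(course)
    return seen

def students_taking_all(fragments, required):
    """Coordinator step (hypothetical): merge the per-site partial results
    and keep the students whose merged set covers every required course."""
    merged = defaultdict(set)
    for fragment in fragments:
        for student, courses in local_matches(fragment, required).items():
            merged[student] |= courses
    return sorted(s for s, courses in merged.items() if courses == required)

# Illustrative data: one relation takes(student, course), split by rows.
site_a = [("ann", "db"), ("bob", "db"), ("bob", "os")]
site_b = [("ann", "os"), ("cat", "db")]
required = {"db", "os"}

print(students_taking_all([site_a, site_b], required))
```

In this sample data, "ann" satisfies the query only once the partial results of both sites are combined, so no single site can decide the universal condition from its own rows.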
