Topology Dependent Bounds For FAQs

In this paper, we prove topology dependent bounds on the number of rounds needed to compute Functional Aggregate Queries (FAQs) studied by Abo Khamis et al. [PODS 2016] in a synchronous distributed network under the model considered by Chattopadhyay et al. [FOCS 2014, SODA 2017]. Unlike the recent work on computing database queries in the Massively Parallel Computation model, in the model of Chattopadhyay et al., nodes can communicate only via private point-to-point channels and we are interested in bounds that work over an arbitrary communication topology. This model, which is closer to the well-studied CONGEST model in distributed computing and generalizes Yao's two party communication complexity model, has so far only been studied for problems that are common in the two-party communication complexity literature. This is the first work to consider more practically motivated problems in this distributed model. For the sake of exposition, we focus on two specific problems in this paper: Boolean Conjunctive Query (BCQ) and computing variable/factor marginals in Probabilistic Graphical Models (PGMs). We obtain tight bounds on the number of rounds needed to compute such queries as long as the underlying hypergraph of the query is $O(1)$-degenerate and has $O(1)$-arity. In particular, the $O(1)$-degeneracy condition covers most well-studied queries that are efficiently computable in the centralized computation model like queries with constant treewidth. These tight bounds depend on a new notion of 'width' (namely, internal-node-width) for Generalized Hypertree Decompositions (GHDs) of acyclic hypergraphs, which minimizes the number of internal nodes in a sub-class of GHDs. To the best of our knowledge, this width has not been studied explicitly in the theoretical database literature.
Finally, we consider the problem of computing the product of a vector with a chain of matrices and prove tight bounds on its round complexity (over the finite field with two elements) using a novel min-entropy based argument.
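
As a concrete illustration of the last problem, the following is a minimal centralized sketch of multiplying a vector by a chain of matrices over the field with two elements, where addition is XOR and multiplication is AND. The function names `vec_mat_gf2` and `chain_product_gf2` and the sample inputs are hypothetical, chosen only for this illustration. The sketch shows what is being computed; it says nothing about how the inputs are distributed over the network or about the round complexity.

```python
def vec_mat_gf2(x, A):
    """Return the row vector x * A over GF(2).

    x is a list of n bits; A is a list of n rows, each a list of m bits.
    (Hypothetical helper, not taken from the paper.)
    """
    out = [0] * len(A[0])
    for xi, row in zip(x, A):
        if xi:
            # Over GF(2), adding a row to the accumulator is a bitwise XOR.
            out = [o ^ a for o, a in zip(out, row)]
    return out


def chain_product_gf2(x, matrices):
    """Return x * A_1 * A_2 * ... * A_k over GF(2), evaluated left to right."""
    for A in matrices:
        x = vec_mat_gf2(x, A)
    return x


# Illustrative inputs (made up for this sketch).
x = [1, 0, 1]
A1 = [[1, 1, 0],
      [0, 1, 1],
      [1, 0, 1]]
A2 = [[0, 1, 0],
      [1, 0, 0],
      [1, 1, 1]]

result = chain_product_gf2(x, [A1, A2])
print(result)  # [0, 1, 1]
```

Evaluating left to right keeps every intermediate result a vector, so each step is a vector-matrix product and no matrix-matrix product is ever formed.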
