FAQ: Questions Asked Frequently

We define and study the Functional Aggregate Query (FAQ) problem, which encompasses many frequently asked questions in constraint satisfaction, databases, matrix operations, probabilistic graphical models, and logic. This is our main conceptual contribution. We then present a simple algorithm called "InsideOut" to solve this general problem. InsideOut is a variation of the traditional dynamic programming approach for constraint programming based on variable elimination. Our variation adds a couple of simple twists to basic variable elimination in order to deal with the generality of FAQ and to take full advantage of Grohe and Marx's fractional edge cover framework and of the analysis of recent worst-case optimal relational join algorithms. As is the case with constraint programming and graphical model inference, to make InsideOut run efficiently we need to solve an optimization problem to compute an appropriate variable ordering. The main technical contribution of this work is a precise characterization of when a variable ordering is "semantically equivalent" to the variable ordering given by the input FAQ expression. We then design an approximation algorithm to find an equivalent variable ordering that has the best "fractional FAQ-width". Our results imply a host of known and a few new results in graphical model inference, matrix operations, relational joins, and logic. We also briefly explain how recent algorithms for beyond-worst-case analysis of joins, and those for solving SAT and #SAT, can be viewed as variable elimination to solve FAQ over compactly represented input functions.
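
To make the variable-elimination idea concrete, here is a minimal Python sketch. It is not the InsideOut algorithm itself: it implements only basic sum-product variable elimination over the ordinary (+, ×) semiring, with no worst-case optimal joins and no search for a good variable ordering. The factor representation (a tuple of variable names plus a dict holding only the nonzero entries) and the names `multiply`, `sum_out`, and `eliminate` are assumptions made for illustration. The example counts triangles, one instance of an aggregate over a product of input functions.

```python
def multiply(f, g):
    """Pointwise product of two factors.

    A factor is (vars, table): vars is a tuple of variable names and
    table maps value tuples (aligned with vars) to nonzero numbers.
    """
    fv, ft = f
    gv, gt = g
    out_vars = tuple(dict.fromkeys(fv + gv))  # ordered union of variables
    table = {}
    for fa, fval in ft.items():
        fassign = dict(zip(fv, fa))
        for ga, gval in gt.items():
            gassign = dict(zip(gv, ga))
            # Keep only pairs of entries that agree on shared variables.
            if all(fassign[v] == gassign[v] for v in fv if v in gassign):
                merged = {**fassign, **gassign}
                table[tuple(merged[v] for v in out_vars)] = fval * gval
    return out_vars, table


def sum_out(f, var):
    """Aggregate (sum) a factor over one of its variables."""
    fv, ft = f
    idx = fv.index(var)
    out_vars = fv[:idx] + fv[idx + 1:]
    table = {}
    for assignment, val in ft.items():
        key = assignment[:idx] + assignment[idx + 1:]
        table[key] = table.get(key, 0) + val
    return out_vars, table


def eliminate(factors, order):
    """Eliminate variables in the given order; `order` must list every variable."""
    factors = list(factors)
    for var in order:
        touching = [f for f in factors if var in f[0]]
        if not touching:
            continue
        rest = [f for f in factors if var not in f[0]]
        prod = touching[0]
        for f in touching[1:]:
            prod = multiply(prod, f)
        factors = rest + [sum_out(prod, var)]
    result = 1
    for _, table in factors:
        result *= table.get((), 0)  # every remaining factor is a scalar
    return result


# Hypothetical input: a triangle on {1, 2, 3} plus a pendant edge (3, 4),
# with each edge stored once as (smaller, larger).
edges = {(1, 2): 1, (1, 3): 1, (2, 3): 1, (3, 4): 1}
factors = [(("a", "b"), edges), (("b", "c"), edges), (("a", "c"), edges)]
triangles = eliminate(factors, ["c", "b", "a"])
print(triangles)  # prints 1
```

Any elimination order yields the same count here, but the sizes of the intermediate factors depend on the order, which is why choosing a good variable ordering matters for efficiency.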
