Local Computation of Answers to Table Queries on Summary Databases

We address the problem of evaluating table queries on a summary database formed by a collection of pre-computed tables on certain measure variables. We assume that every table query asks for the distribution of a measure variable of interest, and that the summary database contains tables on the variable of interest as well as on other measure variables. If the requested distribution is not one of the base tables and cannot be derived exactly from any of them, then the answer to the query is the result of an estimation procedure, which may involve another measure variable that is correlated with the measure variable of interest. We give an estimation procedure that combines the “divide-and-conquer” principle with tree computations.
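The abstract does not spell out the estimation procedure, so the following is only an illustrative sketch and not the authors' method. It assumes a two-dimensional case in which the row and column totals of the variable of interest are known from base tables, and a table on a correlated measure variable serves as the starting estimate. The technique used here is plain iterative proportional fitting; the function name `ipf` and all numbers are hypothetical.

```python
def ipf(seed, row_targets, col_targets, iterations=100, tol=1e-9):
    """Iterative proportional fitting on a 2-D table (illustrative sketch).

    seed        -- table on a correlated variable, used as the starting estimate
    row_targets -- known row totals for the variable of interest
    col_targets -- known column totals for the variable of interest
    """
    table = [row[:] for row in seed]  # work on a copy of the seed
    for _ in range(iterations):
        # Scale each row so that it sums to its target.
        for i, target in enumerate(row_targets):
            s = sum(table[i])
            if s > 0:
                table[i] = [v * target / s for v in table[i]]
        # Scale each column so that it sums to its target.
        for j, target in enumerate(col_targets):
            s = sum(row[j] for row in table)
            if s > 0:
                for row in table:
                    row[j] *= target / s
        # Columns now match; stop once the rows match as well.
        if all(abs(sum(table[i]) - t) <= tol
               for i, t in enumerate(row_targets)):
            break
    return table


# Hypothetical data: seed from a correlated variable, margins from base tables.
estimate = ipf([[1.0, 2.0], [3.0, 4.0]], [40.0, 60.0], [30.0, 70.0])
```

The fitted table reproduces the known margins while keeping the association pattern (cross-product ratio) of the seed, which is why a table on a correlated variable is a natural starting point in this kind of sketch.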
