Paper information - Aggregate Evaluability in Statistical Databases

Aggregate Evaluability in Statistical Databases

Usually a statistical database contains many summary tables representing the distribution of the same statistical variable over the classes of as many partitions of a certain universe of objects. Existing query systems allow only queries on single tables. In most cases, however, additional queries can be evaluated by combining the information contained in similar tables in a suitable way. In order to improve the responsiveness of the database and allow an integrated use of the stored information, we propose to inform the database system of the relationship among the partitions adopted in the tables. Such a relationship, called intersection dependency, states which classes of the partitions have a nonempty intersection and can be represented by a uniform multipartite hypergraph, called intersection hypergraph. On the grounds of the algebraic properties of the intersection hypergraph and under the assumption of data additivity, we shall provide a characterization of evaluable queries, which allows us to define polynomial-time procedures both for testing evaluability and for evaluating queries.

[...] attribute” [14, 20]) related to a given universe of objects or individuals, partitioned according to a set of (category) attributes, referred to as the scheme of the table.

Example 1. Universe: Soviet people in the year 1959. Variable: Population (1000 individuals). Scheme: {Sex, Schooling, Party-Membership} (the data is obtained by processing data from Bishop et al. [4]).

Table: Distribution of the Soviet population by schooling, sex and party (1000 individuals), 1959. Rows: Sex / Schooling. Columns: Party-Membership (Yes, No).
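To make the roles of intersection dependency and data additivity more concrete, the following is a minimal sketch for the simplest case of two tables over two partitions A and B of the same universe. It is not the characterization given in the paper; it only checks one sufficient condition: if the queried A-classes intersect a set T of B-classes, and the classes in T intersect no A-class outside the query, then both sets of classes cover the same objects, so by additivity the query total equals the sum of the B-table over T. The function name `evaluate_from_other_table`, the class labels, and all figures are hypothetical.

```python
def evaluate_from_other_table(query, edges, table_b):
    """Try to answer a total over A-classes using only the table on partition B.

    query   -- set of A-class labels whose combined total is requested
    edges   -- set of (a, b) pairs: A-class a and B-class b have a nonempty
               intersection (a two-partition intersection dependency)
    table_b -- dict mapping each B-class to its additive summary value

    Returns the total, or None when this simple sufficient test fails
    (the query may still be evaluable by more general means).
    """
    query = set(query)
    # B-classes that meet at least one queried A-class.
    touched_b = {b for (a, b) in edges if a in query}
    # A-classes that meet any of those B-classes.
    back_a = {a for (a, b) in edges if b in touched_b}
    # If the touched B-classes spill into A-classes outside the query,
    # their sum would overcount, so this test cannot answer the query.
    if not back_a <= query:
        return None
    # Both class sets cover the same objects; additivity gives the total.
    return sum(table_b[b] for b in touched_b)


# Hypothetical intersection dependency and B-table (illustrative values only).
EDGES = {("a1", "b1"), ("a1", "b2"), ("a2", "b1"), ("a2", "b2"), ("a3", "b3")}
TABLE_B = {"b1": 10, "b2": 5, "b3": 7}

total_a1_a2 = evaluate_from_other_table({"a1", "a2"}, EDGES, TABLE_B)
total_a1 = evaluate_from_other_table({"a1"}, EDGES, TABLE_B)
```

Here the total over `{"a1", "a2"}` can be read off the B-table because those two classes and `{"b1", "b2"}` cover the same objects, whereas `{"a1"}` alone fails the test because `b1` and `b2` also intersect `a2`.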

Marina Moscarini | Francesco M. Malvestuto