Evaluating Aggregate Operations Over Imprecise Data

Imprecise data in databases were originally represented by null values, which mean "value unknown at present." More generally, a partial value is a finite set of possible values for an attribute, exactly one of which is the "true" value. We define a set of extended aggregate operations, namely sum, average, count, maximum, and minimum, that can be applied to an attribute containing partial values. Two types of aggregate operators are considered: scalar aggregates and aggregate functions. We study the properties of these aggregate operations and develop efficient algorithms for count, maximum, and minimum. For sum and average, however, we point out that the computation in general takes exponential time.
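As a rough illustration of why maximum and minimum are cheap while sum is not, the following sketch models a partial value as a Python set of candidate numbers. It is not the algorithm from the paper: the function names (`possible_sums`, `max_bounds`, `min_bounds`) and the sample column are hypothetical, and the sketch assumes that each partial value takes its true value independently of the others. The sum is obtained by brute-force enumeration of every combination of candidates, whose count is the product of the set sizes, whereas the range of the maximum or minimum follows from one pass over the column.

```python
from itertools import product

def possible_sums(column):
    """Brute force: enumerate every combination of candidate values.

    The number of combinations is the product of the set sizes, so this
    grows exponentially with the number of non-singleton partial values.
    """
    return {sum(world) for world in product(*column)}

def max_bounds(column):
    """Smallest and largest value the maximum can take (one linear pass)."""
    lowest = max(min(pv) for pv in column)   # every row is at least its own minimum
    highest = max(max(pv) for pv in column)  # some row can reach the overall maximum
    return lowest, highest

def min_bounds(column):
    """Smallest and largest value the minimum can take (one linear pass)."""
    lowest = min(min(pv) for pv in column)
    highest = min(max(pv) for pv in column)
    return lowest, highest

# Hypothetical column: one definite value and two partial values.
column = [{10}, {20, 30}, {5, 40}]

print(sorted(possible_sums(column)))  # [35, 45, 70, 80]
print(max_bounds(column))             # (20, 40)
print(min_bounds(column))             # (5, 10)
```

The bounds returned for maximum and minimum only delimit a range; deciding exactly which values inside that range are attainable, or handling dependencies between partial values, is outside this sketch.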
