OLAP over uncertain and imprecise data

We extend the OLAP data model to represent data ambiguity, specifically imprecision and uncertainty, and introduce an allocation-based approach to the semantics of aggregation queries over such data. We identify three natural query properties and use them to shed light on alternative query semantics. While there is much work on representing and querying ambiguous data, to our knowledge this is the first paper to handle both imprecision and uncertainty in an OLAP setting.
