Towards Approximating Incomplete Queries over Partially Complete Databases (Extended Abstract)

Building reliable systems over partially complete data poses significant challenges because the queries these systems issue against the available data may retrieve answers that differ significantly from the real answers. This may lead to a wrong understanding of the data and of the events and processes it describes. The problem is especially critical for analytical systems that aggregate retrieved data, since missing answers may significantly change the results of analytical computations; e.g., the computation of minimal or average values is sensitive to missing values. One way to ensure the reliability of (analytical) systems over partially complete data is to guarantee that whatever data they touch is complete w.r.t. the real data.
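
To make the sensitivity of aggregates concrete, the following minimal sketch compares the minimum and the average over a complete relation with the same aggregates over an available copy that lacks one tuple. The values and the names `real_prices` and `available_prices` are hypothetical and chosen only for illustration; they do not come from the paper.

```python
# Hypothetical data: the "real" (complete) values versus the values
# actually stored in the partially complete database.
real_prices = [12.0, 7.5, 30.0, 3.0]   # ideal, complete data
available_prices = [12.0, 7.5, 30.0]   # the tuple with value 3.0 is missing


def average(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Aggregates over the real data.
real_min = min(real_prices)
real_avg = average(real_prices)

# The same aggregates over the available data.
available_min = min(available_prices)
available_avg = average(available_prices)

print("min:", real_min, "vs", available_min)
print("avg:", real_avg, "vs", available_avg)
```

In this toy instance a single missing tuple changes both aggregates, which is the kind of deviation that a completeness guarantee on the touched data would rule out.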