Approximate Processing for Medical Record Linking and Multidatabase Analysis

In this article we investigate how approximate query processing (AQP) can be used in medical multidatabase systems. We identify two areas where this estimation technique is useful. First, approximate query processing can be used as a preprocessing step for medical record linking in the multidatabase. Second, it can provide approximate answers to aggregate queries. In multidatabase systems used to link health and health-related data sources, preprocessing can find records that relate to the same patient, which may be the first step in the linking strategy. If the aim is to gather aggregate statistics, approximate answers may be sufficient; at the least, they may provide initial answers that encourage further investigation. The estimation may also be used for general query planning and optimization, which are important in multidatabase systems. We propose two estimation techniques. These techniques enable synopses of the component local databases to be precalculated and then used to obtain approximate results for linking records and for aggregate queries. The synopses are constructed subject to restrictions on storage space. We report on experiments that show that good approximate results can be obtained in much less time than executing the exact query.
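To make the idea of a precalculated, space-bounded synopsis concrete, the following is a minimal sketch of one generic approach: a fixed-size uniform sample of a local database, built with reservoir sampling and then used to estimate aggregate answers. It is an illustration only and is not claimed to be either of the two techniques proposed in this article. The names SampleSynopsis, build_synopsis, estimate_count and estimate_mean, the record layout, and the choice of sampling as the synopsis are all assumptions made for the example.

```python
import random
from dataclasses import dataclass, field


@dataclass
class SampleSynopsis:
    """Fixed-size uniform sample of one local database (hypothetical structure)."""
    capacity: int                  # storage budget, expressed in records
    seen: int = 0                  # number of records scanned so far
    sample: list = field(default_factory=list)

    def add(self, record, rng):
        """Reservoir sampling: every scanned record is kept with equal probability."""
        self.seen += 1
        if len(self.sample) < self.capacity:
            self.sample.append(record)
        else:
            j = rng.randrange(self.seen)
            if j < self.capacity:
                self.sample[j] = record

    def estimate_count(self, predicate):
        """Approximate COUNT(*) WHERE predicate, scaled up from the sample."""
        if not self.sample:
            return 0.0
        hits = sum(1 for r in self.sample if predicate(r))
        return hits * self.seen / len(self.sample)

    def estimate_mean(self, value, predicate=lambda r: True):
        """Approximate AVG(value) WHERE predicate; None if no sampled record matches."""
        vals = [value(r) for r in self.sample if predicate(r)]
        return sum(vals) / len(vals) if vals else None


def build_synopsis(records, capacity, seed=0):
    """Precalculate the synopsis in a single pass over a local database."""
    rng = random.Random(seed)
    synopsis = SampleSynopsis(capacity)
    for record in records:
        synopsis.add(record, rng)
    return synopsis


if __name__ == "__main__":
    # Synthetic records, invented purely for illustration.
    local_db = [{"patient_id": i, "age": i % 100} for i in range(10_000)]
    synopsis = build_synopsis(local_db, capacity=500)
    print(synopsis.estimate_count(lambda r: r["age"] >= 50))
    print(synopsis.estimate_mean(lambda r: r["age"]))
```

In this sketch the aggregate query touches only the 500 sampled records rather than the full local database, which is the source of the time saving. The capacity parameter plays the role of the storage restriction, and a larger budget would be expected to give tighter estimates.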