A Referential Integrity Browser for Distributed Databases

We demonstrate a program that inspects a distributed relational database over the Internet to discover and quantify referential integrity issues for integration purposes. The program computes data quality metrics for referential integrity at four granularity levels: database, table, column and value. These levels move from a global to a detailed view and expose specific evidence of referential errors. Two orthogonal data quality dimensions are considered: completeness and consistency. Each table is stored at one primary site and can be replicated at multiple sites, with foreign key references to tables at the same site or at different sites. The user can choose among alternative query evaluation strategies to compute referential error metrics efficiently. Our proposal can be used in data integration, data warehousing and data quality assurance.
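
To make the column-level and value-level metrics concrete, the following is a minimal sketch rather than the program's actual implementation. It assumes that a completeness error is a null foreign key and that a consistency error is a non-null foreign key with no matching primary key; the abstract does not define the metrics, so both definitions are assumptions. The tables `city` and `customer`, the function names, and the use of a single in-memory SQLite database in place of multiple sites are all hypothetical.

```python
import sqlite3


def referential_errors(conn, child, fk, parent, pk):
    """Column-level counts for the foreign key child.fk -> parent.pk.

    Assumed definitions (not taken from the source):
      null_fk     = rows whose foreign key is NULL (completeness)
      dangling_fk = rows whose non-null foreign key matches no parent row (consistency)
    Identifiers are interpolated directly, so names must be trusted.
    """
    total, nulls = conn.execute(
        f"SELECT COUNT(*), COUNT(*) - COUNT({fk}) FROM {child}"
    ).fetchone()
    (dangling,) = conn.execute(
        f"SELECT COUNT(*) FROM {child} c "
        f"LEFT JOIN {parent} p ON c.{fk} = p.{pk} "
        f"WHERE c.{fk} IS NOT NULL AND p.{pk} IS NULL"
    ).fetchone()
    return {"rows": total, "null_fk": nulls, "dangling_fk": dangling}


def invalid_values(conn, child, fk, parent, pk):
    """Value-level view: each dangling foreign key value with its frequency."""
    return conn.execute(
        f"SELECT c.{fk}, COUNT(*) FROM {child} c "
        f"LEFT JOIN {parent} p ON c.{fk} = p.{pk} "
        f"WHERE c.{fk} IS NOT NULL AND p.{pk} IS NULL "
        f"GROUP BY c.{fk} ORDER BY c.{fk}"
    ).fetchall()


# Hypothetical sample data: one parent table and one referencing table.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE city(id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customer(id INTEGER PRIMARY KEY, city_id INTEGER);
    INSERT INTO city VALUES (1, 'A'), (2, 'B');
    INSERT INTO customer VALUES (1, 1), (2, 2), (3, NULL), (4, 9), (5, 1);
    """
)

metrics = referential_errors(conn, "customer", "city_id", "city", "id")
bad_values = invalid_values(conn, "customer", "city_id", "city", "id")
print(metrics)
print(bad_values)
```

Table-level and database-level figures could then be obtained by aggregating these counts over all foreign key columns of a table and over all tables. In a distributed setting the parent table may reside at another site, so where the join is evaluated becomes one of the choices a query evaluation strategy has to make; this sketch does not model that.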