A systematic approach to reliability assessment in integrated databases

We present a novel framework for the systematic treatment of data inconsistency and the related concept of data reliability in integrated databases. Our main contribution is a formalization of reliability assessment for historical data, where redundancy and inconsistency are common. We discover data inconsistency by analyzing the relationships between existing reports in the integrated database. Our approach defines properties (rules) that a good measure of reliability should satisfy. We then propose such measures and show which of these properties each one satisfies. We also report on a simulation-based study of the framework.
