IncompFuse: a logical framework for historical information fusion with inaccurate data sources

We propose a novel framework, called IncompFuse, that significantly improves the accuracy of existing methods for reconstructing aggregated historical data from inaccurate historical reports. IncompFuse supports efficient data reliability assessment using the incompatibility probability of historical reports. We provide a systematic approach to defining this probability based on properties of the data and relationships between the reports. Our experimental study demonstrates the high utility of the proposed framework; in particular, it detected noisy historical reports with very high accuracy.
