Truth Discovery from Conflicting Multi-Valued Objects

Truth discovery is a fundamental research topic that aims to identify the true value(s) of objects of interest from conflicting multi-sourced data. Although considerable research effort has been devoted to this topic, two significant issues remain unsolved: i) the single-valued assumption, i.e., current methods assume only one true value for each object, while in reality objects with multiple true values are common; ii) sparse ground truth, i.e., current works evaluate and compare existing truth discovery methods on datasets with limited ground truth. As a result, the empirical studies might be biased and cannot legitimately validate the existing methods. In this PhD project, we propose a full-fledged graph-based model, SmartMTD (Smart Multi-valued Truth Discovery), which incorporates four important implications to conduct truth discovery for multi-valued objects. Two graphs are constructed and then used to derive two aspects of source reliability via random walk computations. We also present a general approach, which uses Markov chain models with Bayesian inference, for comparing existing truth discovery methods and validating our approach without ground truth. Initial empirical studies on two real-world datasets show the effectiveness of SmartMTD.
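To make the idea of deriving source reliability from a random walk more concrete, the following is a minimal sketch and not the SmartMTD model itself. The abstract does not specify how the two graphs are built, so the example assumes a single hypothetical graph whose nodes are sources and whose edge weights count how often two sources agree. The function name `random_walk_scores`, the `damping` parameter, and the toy `agreement` matrix are all illustrative assumptions.

```python
def random_walk_scores(weights, damping=0.85, tol=1e-10, max_iter=1000):
    """Stationary scores of a random walk with uniform restart.

    weights[i][j] is a non-negative edge weight from source i to source j
    (here: a hypothetical agreement count, not SmartMTD's actual graph).
    """
    n = len(weights)

    # Row-normalize the weights into transition probabilities.
    # A source with no outgoing weight jumps uniformly to any source.
    transition = []
    for row in weights:
        total = sum(row)
        if total == 0:
            transition.append([1.0 / n] * n)
        else:
            transition.append([w / total for w in row])

    # Power iteration, starting from the uniform distribution.
    scores = [1.0 / n] * n
    for _ in range(max_iter):
        new_scores = [
            (1.0 - damping) / n
            + damping * sum(scores[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(new_scores, scores))
        scores = new_scores
        if delta < tol:
            break
    return scores


# Toy example (assumed data): sources 0 and 1 agree often, source 2 rarely.
agreement = [
    [0, 3, 1],
    [3, 0, 1],
    [1, 1, 0],
]
scores = random_walk_scores(agreement)
```

In this toy graph the walk spends more time on sources 0 and 1 than on source 2, so a score of this kind rewards sources that are corroborated by others. How such scores would map onto the two aspects of source reliability mentioned above is left open here.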
