From Truth Discovery to Trustworthy Opinion Discovery: An Uncertainty-Aware Quantitative Modeling Approach

In this era of information explosion, conflicts often arise when information is provided by multiple sources. The traditional truth discovery task aims to identify the truth, i.e., the most trustworthy information, from conflicting sources in different scenarios. In this kind of task, the truth is regarded as a fixed value or a set of fixed values. However, in a number of real-world cases, the existence of an objective truth cannot be guaranteed, and we can only identify one or more reliable facts from opinions. Unlike the traditional truth discovery task, we address this uncertainty by introducing the concept of the trustworthy opinion of an entity, treating it as a random variable, and using its distribution to describe consistency or controversy. Doing so is particularly difficult for data that can be measured numerically, i.e., quantitative information. In this study, we focus on quantitative opinions, propose an uncertainty-aware approach called Kernel Density Estimation from Multiple Sources (KDEm) to estimate the probability distribution of the trustworthy opinion, and summarize trustworthy information based on this distribution. Experiments indicate that KDEm not only performs strongly on the classical numeric truth discovery task, but also performs well on multi-modality detection and anomaly detection in the uncertain-opinion setting.
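To make the idea of describing an opinion by its distribution concrete, the following is a minimal sketch and not the KDEm algorithm itself: it is a plain weighted Gaussian kernel density estimate over the claims made about one entity. The source weights are fixed by hand here, whereas the abstract does not specify how KDEm obtains source reliability. The function names, the bandwidth, the half-of-peak threshold, and the toy claims are all assumptions chosen for illustration.

```python
import numpy as np

def weighted_kde(claims, weights, points, bandwidth):
    """Weighted Gaussian KDE of `claims`, evaluated at `points`."""
    claims = np.asarray(claims, dtype=float)
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the density integrates to 1
    z = (points[:, None] - claims[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return kernel @ weights

def find_modes(grid, density, min_height):
    """Grid points that are local maxima of the density and at least min_height."""
    is_peak = (density[1:-1] > density[:-2]) & (density[1:-1] >= density[2:])
    idx = np.where(is_peak)[0] + 1
    return [float(grid[i]) for i in idx if density[i] >= min_height]

# Hypothetical claims from six sources about one quantity (illustrative only).
claims = [9.9, 10.0, 10.2, 19.8, 20.2, 55.0]
weights = [1.0] * len(claims)   # assumption: all sources equally reliable
bandwidth = 1.0                 # assumption: hand-picked kernel width

grid = np.linspace(0.0, 60.0, 601)
density = weighted_kde(claims, weights, grid, bandwidth)

# Assumption: a mode or claim is "supported" if its density is at least
# half of the highest density found on the grid.
threshold = 0.5 * density.max()
modes = find_modes(grid, density, threshold)

# Claims sitting in low-density regions are flagged as possible anomalies.
claim_density = weighted_kde(claims, weights, claims, bandwidth)
outliers = [c for c, d in zip(claims, claim_density) if d < threshold]

print("modes:", modes)
print("low-density claims:", outliers)
```

In this toy setting, two well-supported modes would indicate a controversial opinion, a single mode would indicate consensus, and an isolated claim in a low-density region would be a candidate anomaly. A reliability-aware method would replace the fixed weights with estimated ones.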
