Optimizing Source Selection in Social Sensing in the Presence of Influence Graphs

This paper addresses the problem of choosing which sources to solicit data from in sensing applications that involve broadcast channels, such as crowdsensing applications in which sources share their observations on social media. The goal is to select sources such that the expected fusion error is minimized. We assume that soliciting data from a source incurs a cost and that the cost budget is limited. Unlike prior formulations of this problem, we focus on the case where some sources influence others. Hence, asking a source to make a claim also affects the behavior of other sources, according to an influence model. The paper makes two contributions. First, we develop an analytic model for estimating the expected fusion error, given a particular influence graph and a candidate solution to the source selection problem. Second, we use that model to search for a solution that minimizes the expected fusion error, formulating the search as a zero-one integer non-linear programming (INLP) problem. To scale the approach, the paper further proposes a novel reliability-based pruning heuristic (RPH) and a similarity-based lossy estimation (SLE) algorithm, which significantly reduce the complexity of solving the INLP problem at the cost of a modest approximation. The analytically computed expected fusion error is validated using both simulations and real-world data from Twitter, demonstrating a good match between analytic predictions and empirical measurements. Our method also achieves lower fusion error than the baselines.
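To make the shape of the selection problem concrete, the following is a minimal sketch of a budget-constrained zero-one search, not the paper's method. It assumes binary claims, independent sources, and a known per-source reliability, and it ignores the influence graph entirely; `bayes_error`, `select_sources`, `costs`, `reliability`, and `budget` are hypothetical names introduced only for illustration. In the paper's setting, the analytic expected-fusion-error model would take the place of the toy `bayes_error` function, and the exhaustive loop would be replaced by the INLP solver with RPH and SLE.

```python
from itertools import product


def bayes_error(selected, reliability, prior=0.5):
    """Toy error model (an assumption, not the paper's analytic model).

    Returns the probability that a Bayes-optimal decision over independent
    binary claims from the selected sources gets the true value wrong.
    """
    err = 0.0
    # Enumerate every possible combination of claims from the selected sources.
    for claims in product([0, 1], repeat=len(selected)):
        p_true = prior          # joint probability of (claims, truth = 1)
        p_false = 1.0 - prior   # joint probability of (claims, truth = 0)
        for i, c in zip(selected, claims):
            r = reliability[i]  # probability that source i reports the truth
            p_true *= r if c == 1 else 1.0 - r
            p_false *= 1.0 - r if c == 1 else r
        # The optimal decision keeps the larger term; the smaller is the error.
        err += min(p_true, p_false)
    return err


def select_sources(costs, reliability, budget):
    """Exhaustive zero-one search: minimize error subject to total cost <= budget."""
    n = len(costs)
    best_x, best_err = None, float("inf")
    for x in product([0, 1], repeat=n):
        total_cost = sum(c * xi for c, xi in zip(costs, x))
        if total_cost > budget:
            continue  # infeasible under the budget
        chosen = [i for i in range(n) if x[i]]
        err = bayes_error(chosen, reliability)
        if err < best_err:
            best_x, best_err = x, err
    return best_x, best_err


if __name__ == "__main__":
    x, err = select_sources(costs=[1, 1, 1], reliability=[0.9, 0.6, 0.6], budget=1)
    print(x, err)
```

The exhaustive loop visits all 2^n selection vectors and the toy error model is itself exponential in the number of selected sources, so this sketch is only usable for a handful of sources. That cost is the kind of blow-up that motivates pruning and approximate estimation at scale.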
