Detecting Statistically Significant Events in Large Heterogeneous Attribute Graphs via Densest Subgraphs

With the widespread adoption of social platforms, event detection has become an important problem in social media. The large volume of content accumulated on these platforms makes it challenging, and the difficulty is compounded because the content is usually informal, lacks semantics, and spreads rapidly through dynamic networks. Existing approaches, including content-based detection and network structure-based detection, each rely on a single, limited source of information from the platform, which limits the accuracy and completeness of event detection. In this paper, (1) we propose to model the entire social platform as a heterogeneous attribute graph (HAG), including types, entities, relations and their attributes; (2) we exploit non-parametric scan statistics to measure the statistical significance of subgraphs in the HAG by taking historical information into account; (3) we transform event detection in the HAG into a densest subgraph discovery problem in a statistically weighted network. Because this problem is NP-hard, we propose an efficient approximate method, based on the (k, \(\varPsi\))-core, that finds the densest subgraphs while guaranteeing their statistical significance. We conduct comprehensive empirical evaluations on Weibo data to demonstrate the effectiveness and efficiency of the proposed approaches.
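To make the densest-subgraph step more concrete, the following is a minimal sketch of the classic greedy peeling heuristic on a weighted graph. It is not the (k, \(\varPsi\))-core method of the paper, whose details the abstract does not give. It assumes that density is defined as total edge weight divided by the number of vertices, and that edge weights are non-negative and stand in for statistical significance scores. The function name `greedy_densest_subgraph` and the toy edge list are hypothetical.

```python
import heapq
from collections import defaultdict


def greedy_densest_subgraph(edges):
    """Greedy peeling for the densest subgraph of a weighted graph.

    edges: iterable of (u, v, weight) with non-negative weights.
    Vertex labels must be mutually comparable (e.g. all strings).
    Density is assumed to be (total edge weight) / (number of vertices).
    Returns (vertex_set, density) for the densest set seen while peeling.
    """
    adj = defaultdict(dict)
    total = 0.0
    for u, v, w in edges:
        if u == v:
            continue  # ignore self-loops in this sketch
        adj[u][v] = adj[u].get(v, 0.0) + w
        adj[v][u] = adj[v].get(u, 0.0) + w
        total += w

    # Weighted degree of every vertex within the currently alive subgraph.
    degree = {u: sum(nbrs.values()) for u, nbrs in adj.items()}
    heap = [(d, u) for u, d in degree.items()]
    heapq.heapify(heap)

    alive = set(adj)
    removal_order = []
    best_density = total / len(alive) if alive else 0.0
    best_removed = 0  # number of removals that produced the best set

    while alive:
        d, u = heapq.heappop(heap)
        if u not in alive or d != degree[u]:
            continue  # stale heap entry, skip it
        # Peel the vertex with the smallest weighted degree.
        alive.remove(u)
        removal_order.append(u)
        total -= degree[u]
        for v, w in adj[u].items():
            if v in alive:
                degree[v] -= w
                heapq.heappush(heap, (degree[v], v))
        if alive:
            density = total / len(alive)
            if density > best_density:
                best_density = density
                best_removed = len(removal_order)

    best_set = set(adj) - set(removal_order[:best_removed])
    return best_set, best_density


# Toy example: a 4-clique {a, b, c, d} with one pendant vertex e.
toy_edges = [
    ("a", "b", 1.0), ("a", "c", 1.0), ("a", "d", 1.0),
    ("b", "c", 1.0), ("b", "d", 1.0), ("c", "d", 1.0),
    ("a", "e", 1.0),
]
dense_set, dense_value = greedy_densest_subgraph(toy_edges)
```

On the toy graph, peeling the pendant vertex raises the density from 7/5 to 6/4, so the sketch returns the 4-clique. For this density definition, greedy peeling is known to give a 1/2-approximation; a core-based method such as the one the abstract describes would use a different pruning rule.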
