Streaming Social Event Detection and Evolution Discovery in Heterogeneous Information Networks

Events happen in the real world and in real time, and they can be planned and organized for occasions such as social gatherings, festival celebrations, influential meetings, or sports activities. Social media platforms generate large volumes of real-time text about public events on different topics. However, mining social events is challenging because events typically exhibit heterogeneous texture and their metadata are often ambiguous. In this article, we first design a novel event-based meta-schema to characterize the semantic relatedness of social events and then build an event-based heterogeneous information network (HIN) that integrates information from an external knowledge base. Second, we propose a novel Pairwise Popularity Graph Convolutional Network, named PP-GCN, which takes weighted meta-path instance similarity and textual semantic representations as inputs, to perform fine-grained social event categorization and to learn the optimal weights of meta-paths for different tasks. Third, we propose a streaming social event detection and evolution discovery framework for HINs based on meta-path similarity search, historical information about meta-paths, and a heterogeneous DBSCAN clustering method. We conduct comprehensive experiments on real-world streaming social text data to compare various social event detection and evolution discovery algorithms. The experimental results demonstrate that our proposed framework outperforms the alternative social event detection and evolution discovery techniques.
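As a rough illustration of what a weighted meta-path similarity between events could look like, the following minimal sketch combines per-meta-path similarities into one event-to-event score. It is not the formulation used by PP-GCN, which the abstract does not spell out. The sketch assumes a PathSim-style normalization over symmetric meta-paths of the form event-X-event, hand-picked weights (the article learns them), and toy incidence matrices; the names `pathsim`, `weighted_similarity`, `incidences`, and `weights` are hypothetical.

```python
from typing import Dict, List

Matrix = List[List[float]]


def pathsim(incidence: Matrix) -> Matrix:
    """PathSim-style similarity for a symmetric meta-path event-X-event.

    `incidence[i][k]` counts how often event i links to object k of type X
    (an assumed toy encoding, e.g. X = keyword or X = entity).
    """
    n = len(incidence)
    # Commuting matrix M = A * A^T: number of path instances between events.
    commuting = [
        [sum(a * b for a, b in zip(incidence[i], incidence[j])) for j in range(n)]
        for i in range(n)
    ]
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            denom = commuting[i][i] + commuting[j][j]
            # Normalize by self-path counts; isolated events get similarity 0.
            sim[i][j] = 2.0 * commuting[i][j] / denom if denom else 0.0
    return sim


def weighted_similarity(incidences: Dict[str, Matrix],
                        weights: Dict[str, float]) -> Matrix:
    """Weighted average of per-meta-path similarities (weights are assumed given)."""
    total = sum(weights[name] for name in incidences)
    n = len(next(iter(incidences.values())))
    combined = [[0.0] * n for _ in range(n)]
    for name, incidence in incidences.items():
        sim = pathsim(incidence)
        for i in range(n):
            for j in range(n):
                combined[i][j] += weights[name] * sim[i][j] / total
    return combined


# Toy data: three events, linked to three keywords and two entities.
incidences = {
    "event-keyword-event": [[1, 1, 0], [1, 0, 0], [0, 0, 1]],
    "event-entity-event": [[1, 0], [1, 0], [0, 1]],
}
weights = {"event-keyword-event": 0.5, "event-entity-event": 0.5}
sim = weighted_similarity(incidences, weights)
```

A similarity matrix of this kind could then be turned into a distance (for example, one minus the similarity) and passed to a density-based clustering step; the heterogeneous DBSCAN variant described in the abstract is not reproduced here.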
