A Distance-Dependent Chinese Restaurant Process Based Method for Event Detection on Social Media

In this paper, we propose a method for event detection on social media that clusters media items into events based on their textual information and any available metadata. Our approach is based on the distance-dependent Chinese Restaurant Process (ddCRP), a clustering model closely related to the Dirichlet process. We also examine how effectively a series of pre-processing steps improves detection performance. We evaluate our method experimentally on the Social Event Detection (SED) dataset of the MediaEval 2013 benchmarking workshop, whose task is to discover social events and group the associated media items into event-specific clusters. The results indicate that the proposed method performs very well compared with existing approaches.
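To make the clustering idea concrete, the following is a minimal sketch of drawing a partition from a ddCRP prior in its standard formulation: each item links to another item with weight given by a decay function of their distance, or to itself with weight alpha, and clusters are the connected components of the resulting links. It is an illustration only, not the method evaluated in this paper. The function names, the toy timestamps, the exponential decay, and the value of alpha are all assumptions chosen for the example.

```python
import math
import random


def sample_ddcrp_links(dist, alpha, decay, rng):
    """Draw one link per item from a ddCRP prior (illustrative sketch)."""
    n = len(dist)
    links = []
    for i in range(n):
        # Self-link gets weight alpha; a link to item j gets decay(d_ij).
        weights = [alpha if j == i else decay(dist[i][j]) for j in range(n)]
        links.append(rng.choices(range(n), weights=weights, k=1)[0])
    return links


def links_to_clusters(links):
    """Label items by connected component of the undirected link graph."""
    parent = list(range(len(links)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in enumerate(links):
        parent[find(i)] = find(j)

    labels = {}
    # Number clusters 0, 1, 2, ... in order of first appearance.
    return [labels.setdefault(find(i), len(labels)) for i in range(len(links))]


# Hypothetical upload times in hours: three items close together, two much later.
timestamps = [0.0, 0.5, 1.0, 48.0, 48.5]
dist = [[abs(a - b) for b in timestamps] for a in timestamps]

rng = random.Random(0)
links = sample_ddcrp_links(dist, alpha=1.0, decay=lambda d: math.exp(-d), rng=rng)
labels = links_to_clusters(links)
print(links, labels)
```

Because the decay weight between the early and late items is negligible compared with alpha, draws from this toy prior essentially never merge the two groups, although items within a group may still end up in separate clusters through self-links. A full method would combine such a prior with a likelihood over item features and run posterior inference, which this sketch omits.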
