ComStreamClust: A communicative text clustering approach to topic detection in streaming data

Topic detection is the task of identifying and tracking hot topics in social media. Twitter is arguably the most popular platform for people to share their ideas about different issues. One such prevalent issue is the COVID-19 pandemic. Detecting and tracking topics related to issues of this kind would help governments and healthcare companies deal with them. In this paper, we propose a novel communicative clustering approach, called ComStreamClust, for clustering sub-topics within a broader topic, e.g., COVID-19. The proposed approach was evaluated on two datasets: COVID-19 and FA Cup. The results obtained with ComStreamClust confirm the effectiveness of the proposed approach when compared to existing methods such as LDA.
