Leveraging Phase Transition of Topics for Event Detection in Social Media

With the advancement of technology, many processes in our world have been reformulated, updated, and digitized. Interpersonal relationships have followed this trend, and social networks have become increasingly present in our lives. In this context, social network users create and share a large amount of data, ranging from content about their daily lives and funny facts to information about traffic, weather, and various other subjects. The problem of event detection in social media, such as Twitter, is related to identifying the first story on a topic of interest. In this work, we propose a novel approach based on the observation that tweets undergo a continuous phase transition when an event takes place, i.e., their underlying dynamics change. Our proposal consists of a formal characterization of this phase transition and the use of that characterization to devise a new method for detecting events on Twitter, based on calculating the entropy of the keywords extracted from the content of tweets (regardless of the language used). We evaluated our approach on seven data sets, where it outperformed nine different techniques from the literature. Unlike previous work, we present a theoretical rationale for the existence of phase transitions: we characterize an existing model of phase transitions described by differential equations and find a correspondence between this model and the real data. The experimental results show that our proposal significantly improves performance on the metrics used.
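As a rough illustration of the entropy-based idea, the following minimal Python sketch computes the Shannon entropy of the keywords in consecutive time windows and flags windows where the entropy changes abruptly. It is not the method evaluated in this work: the function names, the fixed `threshold`, and the assumption that keywords have already been extracted and grouped into windows are all hypothetical choices made for the example.

```python
import math
from collections import Counter


def keyword_entropy(keywords):
    """Shannon entropy (in bits) of the empirical keyword distribution."""
    counts = Counter(keywords)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # H = sum_k p_k * log2(1 / p_k), with p_k = count_k / total
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def entropy_series(windows):
    """Entropy of each time window; each window is a list of keywords."""
    return [keyword_entropy(w) for w in windows]


def flag_transitions(entropies, threshold=1.0):
    """Indices of windows whose entropy differs from the previous window
    by more than `threshold` bits (a hypothetical, illustrative criterion)."""
    return [
        i
        for i in range(1, len(entropies))
        if abs(entropies[i] - entropies[i - 1]) > threshold
    ]


# Toy data: two windows of varied chatter, then one dominated by a single keyword.
windows = [
    ["traffic", "weather", "lunch", "music"],
    ["movie", "weather", "coffee", "game"],
    ["quake", "quake", "quake", "quake"],
]
entropies = entropy_series(windows)
flagged = flag_transitions(entropies)
```

In this toy input, the third window collapses onto one keyword, so its entropy drops sharply relative to the preceding windows and it is the one flagged. A real stream would need keyword extraction, a principled window size, and a change criterion grounded in the phase-transition characterization rather than a fixed threshold.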
