Application of fuzzy semantic similarity measures to event detection within tweets

This paper examines the suitability of fuzzy semantic similarity measures (FSSM) for detecting potential future events using a group of prototypical event tweets. FSSM are well suited to analysing the semantic content of tweets because they handle not only nouns, verbs, adjectives and adverbs but also perception-based fuzzy words. The proposed methodology first creates a set of prototypical event-related tweets and a control group of tweets from a data source, then calculates their semantic similarity against an event dataset compiled from tweets issued during the 2011 London riots. This dataset includes a proportion of tweets, publicly released by The Guardian newspaper, that were attributed to 200 influential Twitter users during the riots. The effect of changing the semantic similarity threshold is investigated to evaluate whether tweets, used in conjunction with fuzzy short text similarity measures and prototypical event-related tweets, can indicate that an event is more likely to occur. The results show that a potential future event can be detected from an increase in the frequency of tweets in the dataset that exceed a given similarity threshold when matched against prototypical event tweets about riots.
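The threshold-and-frequency idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: a plain Jaccard token overlap stands in for an FSSM (it has no notion of fuzzy words or semantics), and the function names, time-window labels, prototype tweets and sample tweets are all hypothetical, invented purely for illustration.

```python
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and split it into a set of punctuation-stripped tokens."""
    return {w.strip(".,!?#@:;\"'").lower() for w in text.split()} - {""}


def jaccard(a, b):
    """Stand-in similarity in [0, 1]; an FSSM would be substituted here (assumption)."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def max_similarity(tweet, prototypes, sim=jaccard):
    """Best similarity between one tweet and any prototypical event tweet."""
    return max(sim(tweet, p) for p in prototypes)


def count_matches_per_window(tweets, prototypes, threshold, sim=jaccard):
    """Count, per time window, the tweets whose best match meets the threshold."""
    counts = Counter()
    for window, text in tweets:
        if max_similarity(text, prototypes, sim) >= threshold:
            counts[window] += 1
    return counts


# Hypothetical prototypical event tweets and (window, text) pairs.
prototypes = [
    "riot breaking out in the city centre",
    "shops being looted tonight",
]
tweets = [
    (0, "lovely weather in the park today"),
    (0, "riot breaking out in the city"),
    (1, "shops being looted tonight near me"),
    (1, "riot breaking out in the town centre"),
    (1, "watching football tonight"),
]

# Varying the threshold changes how many tweets count as event-related.
for threshold in (0.5, 0.8):
    counts = count_matches_per_window(tweets, prototypes, threshold)
    print(threshold, dict(counts))
```

In this sketch a rise in the per-window count at a fixed threshold plays the role of the frequency increase discussed above; passing a different `sim` callable is how a real fuzzy measure would be plugged in.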
