Analyzing the evolution of rare events via social media data and k-means clustering algorithm

Recently, many researchers have attempted to find relationships between rare events and social media activity. This work proposes a data processing method based on the k-means clustering algorithm and uses it to analyze the evolution of a rare event via social media data. We apply k-means twice: once in the spatial domain and once in the time domain. The effectiveness of the method is verified by analyzing the damage caused by Hurricane Sandy in 2012. The Sandy data set is obtained from the popular social media platform Twitter. The results show that our method can accurately predict the evolution of the hurricane, i.e., the affected places, the time, and the severity. In addition, two new concepts, growth ratio and DRR rate, are presented to analyze the data set in the time domain.
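
The two-pass use of k-means described above can be illustrated with a minimal sketch. The abstract does not state which features are clustered, how k is chosen, or how the two passes are chained, so everything here is an assumption: the first pass clusters posts by (latitude, longitude), the second pass clusters the timestamps inside each spatial cluster, and the records, function names and parameters (`kmeans`, `two_stage`, `k_space`, `k_time`) are hypothetical. It uses plain Lloyd's iterations with random initialization, not necessarily the variant used in the paper.

```python
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initial centers (assumption)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return labels, centers


def two_stage(posts, k_space, k_time):
    """Pass 1: cluster by location. Pass 2: cluster times within each region."""
    coords = [(lat, lon) for lat, lon, _ in posts]
    space_labels, space_centers = kmeans(coords, k_space)
    result = {}
    for region in range(k_space):
        times = [(t,) for (_, _, t), lab in zip(posts, space_labels)
                 if lab == region]
        if len(times) < k_time:
            continue  # too few posts in this region to split in time
        _, time_centers = kmeans(times, k_time)
        result[region] = {
            "center": space_centers[region],
            "time_centers": sorted(c[0] for c in time_centers),
        }
    return result


# Hypothetical records: (latitude, longitude, hours since the first post).
posts = [
    (40.7, -74.0, 1.0), (40.8, -74.1, 2.0),
    (40.6, -73.9, 30.0), (40.7, -74.2, 31.0),
    (25.8, -80.2, 3.0), (25.7, -80.3, 4.0),
    (25.9, -80.1, 50.0), (25.6, -80.4, 52.0),
]

if __name__ == "__main__":
    for region, info in two_stage(posts, k_space=2, k_time=2).items():
        print(region, info["center"], info["time_centers"])
```

In this sketch, each spatial cluster stands for an affected place and each temporal cluster center for a burst of activity at that place. Measures such as the growth ratio and DRR rate would be computed on top of the temporal clusters, but they are not defined in the abstract and are therefore not sketched here.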
