Geo-located community detection in Twitter with enhanced fast-greedy optimization of modularity: the case study of typhoon Haiyan

As they increase in popularity, social media are regarded as important sources of information on geographical phenomena. Studies have also shown that people rely on social media to communicate during disasters and emergency situations, and that the exchanged messages can be used to gain insight into the situation. Spatial data mining techniques are one way to extract relevant information from social media. In this article, our aim is to contribute to this field by investigating how graph clustering can be applied to support the detection of geo-located communities in Twitter in disaster situations. For this purpose, we have enhanced the fast-greedy optimization of modularity (FGM) clustering algorithm with semantic similarity so that it can deal with the complex social graphs extracted from Twitter. Then, we have coupled the enhanced FGM with the varied density-based spatial clustering of applications with noise (VDBSCAN) algorithm to obtain spatial clusters at different temporal snapshots. The method was tested in a case study on typhoon Haiyan in the Philippines, and Twitter’s different interaction modes were compared to create the graph of users and to detect communities. The experiments show that the method can extract communities that help identify areas where disaster-related incidents were reported, and that the enhanced algorithm outperforms the generic one in this task.
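As an illustration of the general idea only, the sketch below runs a naive greedy modularity optimization on a user graph whose edge weights combine interaction counts with message similarity. It is not the authors' enhanced FGM: the weighting formula `semantic_weight`, its `alpha` parameter, the use of Jaccard token overlap as a stand-in for semantic similarity, and the toy data are all assumptions made for this example. The merge loop is also simplified, since it re-scores every pair at each step and stops at the first merge that no longer raises modularity, rather than using the efficient data structures of the fast-greedy algorithm.

```python
from itertools import combinations


def jaccard(tokens_a, tokens_b):
    """Token-overlap similarity in [0, 1]; a stand-in for semantic similarity."""
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def semantic_weight(interactions, similarity, alpha=1.0):
    """Hypothetical blend: boost an interaction count by message similarity."""
    return interactions * (1.0 + alpha * similarity)


def modularity(edges, community_of):
    """Weighted modularity Q for undirected edges given as {(u, v): weight}."""
    total = sum(edges.values())   # total edge weight m
    internal = {}                 # edge weight inside each community
    degree = {}                   # summed weighted degree of each community
    for (u, v), w in edges.items():
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0.0) + w
        degree[cv] = degree.get(cv, 0.0) + w
        if cu == cv:
            internal[cu] = internal.get(cu, 0.0) + w
    return sum(internal.get(c, 0.0) / total - (d / (2 * total)) ** 2
               for c, d in degree.items())


def greedy_communities(edges):
    """Merge the pair of communities that most increases Q until none does."""
    nodes = {n for edge in edges for n in edge}
    community_of = {n: n for n in nodes}   # start with one community per node
    best_q = modularity(edges, community_of)
    while True:
        best_merge = None
        labels = sorted(set(community_of.values()))
        for a, b in combinations(labels, 2):
            # Relabel community b as a and score the resulting partition.
            trial = {n: (a if c == b else c) for n, c in community_of.items()}
            q = modularity(edges, trial)
            if q > best_q + 1e-12:
                best_q, best_merge = q, trial
        if best_merge is None:
            return community_of, best_q
        community_of = best_merge


# Toy data (invented): token sets per user and interaction counts per pair.
tweets = {
    "a": {"flood", "tacloban", "help"}, "b": {"flood", "tacloban"},
    "c": {"tacloban", "help"}, "d": {"power", "cebu"},
    "e": {"power", "cebu", "outage"}, "f": {"cebu", "outage"},
}
interactions = {("a", "b"): 1, ("a", "c"): 1, ("b", "c"): 1, ("c", "d"): 1,
                ("d", "e"): 1, ("d", "f"): 1, ("e", "f"): 1}
edges = {(u, v): semantic_weight(n, jaccard(tweets[u], tweets[v]))
         for (u, v), n in interactions.items()}
communities, q = greedy_communities(edges)
print(communities, round(q, 3))
```

Folding similarity into the edge weight leaves the modularity objective unchanged, which is one simple way to let message content influence which users are grouped together. In a pipeline like the one described above, the users of each detected community would then be passed with their coordinates to a density-based spatial clustering step.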
