A possibilistic framework for the detection of terrorism‐related Twitter communities in social media

Since the appearance of social networks, the volume of available data has grown at an unprecedented rate. Unfortunately, terrorists are taking advantage of the ease of access to social networks and have set up profiles to recruit, radicalize, and raise funds. Most of these profiles maintain pages where existing members and new recruits can join terrorist groups and view and share information. There is therefore a need to detect terrorist communities in social networks in order to search for key hints in posts that appear to promote the militants' cause. To address this problem, we first use a possibilistic‐clustering algorithm that allows more flexibility when assigning a social network profile to clusters (non‐terrorist, terrorist‐sympathizer, terrorist). Then, we introduce a new possibilistic flexible graph mining method that discovers similar subgraphs by applying possibilistic similarity rather than hard, exact structural similarity. We experimentally show the efficiency of our possibilistic approach through a detailed process of tweet extraction, semantic processing, and classification for community detection.
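To illustrate what "more flexibility" in cluster assignment can mean, the sketch below computes typicality degrees in the style of the standard possibilistic c-means formulation of Krishnapuram and Keller. The abstract does not state which variant or parameters are used, so the function name `typicalities`, the fuzzifier `m`, the scale parameters `etas`, and the two-dimensional feature vectors are all assumptions made for illustration. Unlike a hard or probabilistic assignment, the degrees of one profile across clusters are not forced to sum to 1, so an atypical profile can receive a low degree in every cluster.

```python
def typicalities(points, centers, etas, m=2.0):
    """Return t[i][j], the typicality of point j in cluster i.

    Uses the possibilistic c-means style update
        t_ij = 1 / (1 + (d_ij^2 / eta_i) ** (1 / (m - 1)))
    where d_ij is the Euclidean distance from point j to center i.
    All inputs are illustrative; they are not taken from the paper.
    """
    result = []
    for center, eta in zip(centers, etas):
        row = []
        for p in points:
            # Squared Euclidean distance between the point and the center.
            d2 = sum((a - b) ** 2 for a, b in zip(p, center))
            row.append(1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0))))
        result.append(row)
    return result


# Hypothetical 2-D feature vectors for three profiles.
profiles = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
# Hypothetical cluster centers (e.g. two of the three categories).
centers = [(0.0, 0.0), (1.0, 1.0)]
etas = [1.0, 1.0]

degrees = typicalities(profiles, centers, etas)
```

Here `degrees[i][j]` is read independently per cluster: the third profile lies far from both centers, so it gets a small degree in each rather than being forced into whichever cluster is nearest.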
