Topic-based social network analysis for virtual communities of interests in the Dark Web

The study of extremist groups and their interactions is crucial to maintaining homeland security and peace. Tools such as social network analysis and text mining have contributed to the understanding of these groups and to the development of counter-terrorism applications. This work addresses the problem of extracting topic-based key members of a community, using a method that combines text mining and social network analysis techniques. First, latent Dirichlet allocation is applied to build two topic-based social networks: one oriented towards the thread creators' point of view, and the other towards the repliers across the whole forum. Then, using different social network analysis measures, topic-based key members are evaluated against a benchmark social network built from the plain documents. Experiments were performed on an English-language forum available in the Dark Web portal.
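
As a rough illustration of the two network views described in the abstract, the following minimal Python sketch builds a creator-oriented and a replier-oriented network for a single topic and ranks members by weighted in-degree. Everything specific here is an assumption made for illustration, not the paper's actual procedure: the toy `POSTS` data, the names `build_networks` and `in_degree`, the idea that each post carries one dominant topic label (assumed to come from a topic model such as latent Dirichlet allocation), and the exact edge definitions.

```python
from collections import defaultdict

# Hypothetical toy posts in chronological order: (thread_id, author, dominant_topic).
# Topic labels are assumed to have been produced beforehand by a topic model such as LDA.
POSTS = [
    ("t1", "alice", 0),  # first post of a thread: its author is the thread creator
    ("t1", "bob", 0),
    ("t1", "carol", 0),
    ("t2", "bob", 1),
    ("t2", "alice", 1),
    ("t2", "carol", 0),
]


def build_networks(posts, topic):
    """Return (creator_net, replier_net) as {(source, target): weight} for one topic.

    Assumed edge definitions (illustrative only):
      creator_net: reply author -> creator of the thread
      replier_net: reply author -> each distinct earlier replier in the thread
    Only replies whose dominant topic equals `topic` create edges.
    """
    creator_net = defaultdict(int)
    replier_net = defaultdict(int)
    creators = {}                  # thread_id -> creator
    earlier = defaultdict(set)     # thread_id -> repliers seen so far
    for thread, author, post_topic in posts:
        if thread not in creators:
            creators[thread] = author
            continue
        if post_topic == topic:
            if author != creators[thread]:      # skip self-loops
                creator_net[(author, creators[thread])] += 1
            for other in earlier[thread]:
                if other != author:             # skip self-loops
                    replier_net[(author, other)] += 1
        if author != creators[thread]:
            earlier[thread].add(author)
    return dict(creator_net), dict(replier_net)


def in_degree(net):
    """Weighted in-degree per member: total weight of incoming edges."""
    scores = defaultdict(int)
    for (_, target), weight in net.items():
        scores[target] += weight
    return dict(scores)


creator_net, replier_net = build_networks(POSTS, topic=0)
creator_scores = in_degree(creator_net)
replier_scores = in_degree(replier_net)
print(sorted(creator_scores.items(), key=lambda kv: -kv[1]))
print(sorted(replier_scores.items(), key=lambda kv: -kv[1]))
```

Weighted in-degree is used only because it is the simplest centrality measure; other social network analysis measures could be swapped in on the same edge dictionaries, and the rankings from the two views can then be compared with those from a network built without topic filtering.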
