A multi-language approach towards the identification of suspicious users on social networks

The use of information technology to plan and carry out illegal activities has been gaining ground in recent years. Through the web and social media, it is now possible not only to advertise illicit activities, but also to carry out actions that in the past required people to be physically present at the place and time of the activity. This reduces criminals' exposure to the risk of being discovered. Furthermore, the technology tends to encourage international collaboration, which makes identifying illegal activities even more complex because adequate tools that operate effectively across multi-cultural contexts are lacking. Consequently, this evolution towards cyber-crime requires new models and analysis techniques. In this context, the paper proposes an approach based on a multi-language model that aims to support the identification of suspicious users on social networks. It exploits web translation services along with specific stand-alone libraries to normalize user profiles into a common language. In addition, different text analysis techniques are combined to support the evaluation of user profiles. The proposed approach is exemplified through a case study that analyzes Twitter user profiles, showing step by step the overall process and the related results.
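The two stages described above (normalizing profiles into a common language, then evaluating them with text analysis) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the word-level lookup table `TOY_DICTIONARY` stands in for a web translation service, TF-IDF weighting is only one possible text analysis technique, and the `watchlist` of terms and the function names are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Set

# Hypothetical stand-in for a web translation service: a tiny word-level
# lookup table. A real system would call an external translation API here.
TOY_DICTIONARY: Dict[str, Dict[str, str]] = {
    "it": {"vendo": "sell", "armi": "weapons", "oggi": "today"},
    "es": {"vendo": "sell", "armas": "weapons", "hoy": "today"},
}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def translate_to_english(text: str, lang: str) -> str:
    """Normalize text to English; unknown words and languages pass through."""
    table = TOY_DICTIONARY.get(lang, {})
    return " ".join(table.get(tok, tok) for tok in tokenize(text))


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one term -> weight mapping per document (smoothed IDF)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * (math.log(n / df[t]) + 1.0) for t, c in tf.items()}
        )
    return weights


def suspicion_scores(profiles: List[dict], watchlist: Set[str]) -> Dict[str, float]:
    """Score each user by the summed TF-IDF weight of watchlist terms."""
    docs = [tokenize(translate_to_english(p["text"], p["lang"])) for p in profiles]
    weights = tf_idf(docs)
    return {
        p["user"]: sum(w.get(term, 0.0) for term in watchlist)
        for p, w in zip(profiles, weights)
    }


# Illustrative, made-up profiles in three languages.
profiles = [
    {"user": "alice", "lang": "it", "text": "Vendo armi oggi"},
    {"user": "bob", "lang": "en", "text": "Lovely weather today"},
    {"user": "carlos", "lang": "es", "text": "Vendo armas hoy"},
]
scores = suspicion_scores(profiles, watchlist={"weapons"})
for user, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(user, round(score, 3))
```

Because both non-English profiles are normalized to the same English tokens before scoring, they receive the same score, which is the point of translating into a common language first: a single watchlist and a single weighting scheme can then be applied to every profile.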
