Generation of Automatic and Realistic Artificial Profiles

Online social networks (OSNs) are abused by cyber criminals for various malicious activities. One of the most effective approaches for detecting malicious activity in OSNs involves the use of social network honeypots: artificial profiles that are deliberately planted within OSNs in order to attract abusers. Honeypot profiles have been used to detect spammers, potential cyber attackers, and advanced attackers. There is therefore a growing need to reliably generate realistic artificial honeypot profiles in OSNs. In this research, we present 'ProfileGen', a method for the automated generation of profiles for professional social networks that gives particular attention to producing realistic education and employment records. 'ProfileGen' creates honeypot profiles that resemble actual data by extrapolating the characteristics and properties of real data items. An evaluation by 70 domain experts confirms the method's ability to generate realistic artificial profiles that are indistinguishable from real profiles, demonstrating that our method can be applied to generate realistic artificial profiles for a wide range of applications.
