Paper information - Analytics for characterising and measuring the naturalness of online personae

Introduction
Currently 40% of the world’s population, around 3 billion users, are online, using cyberspace for everything from work to pleasure. While this medium brings numerous benefits, the Internet is not without its perils. In this case study article, we focus specifically on the challenge of fake (or unnatural) online identities, such as those used to defraud people and organisations, with the aim of exploring an approach to detect them.

Case description
Through our method and case study, we outline and experiment with novel analytics for characterising and measuring the naturalness of an online persona or identity. Naturalness is defined as the extent to which a persona has features similar to those expected for comparable personae online. Our case scenario involves a participant set of two types of individuals, and our aim at this stage is to use our approach to correctly characterise, and then distinguish between, these two types.

Discussion and evaluation
To summarise our case study results, we found that our method to conceptualise an individual’s complete online presence was very successful. This was undoubtedly linked to its detailed consideration of how cyberspace is typically used, and to its basis in our existing model of identity, which has been used to aid law enforcement in identification tasks. In terms of developing effective analytics for naturalness, however, improvements in our approach (e.g., features selected and nuanced metrics) are required. Moreover, the study would benefit from a larger sample size to better identify common aspects between natural personae.

Conclusions
Overall, the case study allowed us to explore a novel technique to characterise naturalness and to examine its utility in detecting unnatural personae. Our goal now is to build on the study’s findings in several key ways. Specifically, we aim to conduct further assessments of the criteria through which naturalness is defined, and to refine our analytics and combinatorics for measuring a persona’s naturalness. We will also explore clustering approaches based on complete online personae, as a means to complement our identification of naturally occurring personae types in large datasets.
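
To make the definition of naturalness more concrete, the following is a minimal illustrative sketch, not the authors' analytics. It assumes a persona can be reduced to a handful of numeric features and scores it by how far those features deviate from a group of comparable personae. The feature names (`accounts`, `posts_per_week`, `account_age_months`), the function `naturalness`, and the scoring formula are all hypothetical choices made for illustration.

```python
from statistics import mean, pstdev


def naturalness(persona, reference_group):
    """Return a score in (0, 1]; 1.0 means the persona matches the
    reference group's mean on every feature.

    Hypothetical scoring rule (an assumption, not taken from the paper):
    1 / (1 + mean absolute z-score across features).
    """
    deviations = []
    for feature, value in persona.items():
        values = [p[feature] for p in reference_group]
        mu = mean(values)
        # Fall back to 1.0 when the group shows no variation,
        # purely to avoid dividing by zero.
        sigma = pstdev(values) or 1.0
        deviations.append(abs(value - mu) / sigma)
    return 1.0 / (1.0 + mean(deviations))


# Hypothetical feature vectors for comparable personae.
reference_group = [
    {"accounts": 3, "posts_per_week": 5, "account_age_months": 24},
    {"accounts": 4, "posts_per_week": 7, "account_age_months": 36},
    {"accounts": 5, "posts_per_week": 9, "account_age_months": 48},
]

typical = {"accounts": 4, "posts_per_week": 7, "account_age_months": 36}
suspicious = {"accounts": 1, "posts_per_week": 40, "account_age_months": 1}

typical_score = naturalness(typical, reference_group)
suspicious_score = naturalness(suspicious, reference_group)
```

Under these assumptions, a persona whose features sit at the group mean scores 1.0, and the score falls towards 0 as its features move away from what is expected for comparable personae. A real analytic would need carefully selected features and more nuanced metrics, which the abstract itself identifies as areas for improvement.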

[1] Michael Sirivianos,et al. Aiding the Detection of Fake Accounts in Large Scale Social Online Services , 2012, NSDI.

[2] Krishna P. Gummadi,et al. Towards Detecting Anomalous User Behavior in Online Social Networks , 2014, USENIX Security Symposium.

[3] P. Rousseeuw. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis , 1987 .

[4] D. Pham,et al. Selection of K in K-means clustering , 2005 .

[5] Jean Scholtz,et al. Pathways to identity: using visualization to aid law enforcement in identification tasks , 2014, Security Informatics.

[6] Sadie Creese,et al. A Data-Reachability Model for Elucidating Privacy and Security Risks Related to the Use of Online Social Networks , 2012, 2012 IEEE 11th International Conference on Trust, Security and Privacy in Computing and Communications.

[7] Simon Fong,et al. Not every friend on a social network can be trusted: Classifying imposters using decision trees , 2012, The First International Conference on Future Generation Communication Technologies.

[8] Uffe Kock Wiil,et al. Criminal network investigation , 2014, Security Informatics.

[9] Divya,et al. Techniques to Detect Spammers in Twitter- A Survey , 2014 .

[10] Qiang Cao,et al. Uncovering Large Groups of Active Malicious Accounts in Online Social Networks , 2014, CCS.

[11] Jason R. C. Nurse,et al. The anatomy of online deception: what makes automated text convincing? , 2016, SAC.

[12] Gianluca Stringhini,et al. Detecting spammers on social networks , 2010, ACSAC '10.

[13] J. Edward Jackson,et al. A User's Guide to Principal Components. , 1991 .