Your Style Your Identity: Leveraging Writing and Photography Styles for Drug Trafficker Identification in Darknet Markets over Attributed Heterogeneous Information Network

Owing to the anonymity that the darknet provides, underground drug markets hosted there (e.g., Dream Market and Valhalla) have grown dramatically. To combat drug trafficking (a.k.a. illicit drug trading) in cyberspace, there is an urgent need for automatic analysis of participants in darknet markets. A key challenge is that drug traffickers (i.e., vendors) may maintain multiple accounts, either across different markets or within the same market. To address this issue, we propose and develop an intelligent system named uStyle-uID, the first attempt to leverage both writing and photography styles for drug trafficker identification. At the core of uStyle-uID is an attributed heterogeneous information network (AHIN) that integrates writing and photography styles with the text and photo contents, other supporting attributes (i.e., trafficker and drug information), and various kinds of relations. To efficiently measure the relatedness between nodes (i.e., traffickers) in the constructed AHIN, we propose a new network embedding model, Vendor2Vec, which learns low-dimensional representations of the AHIN nodes by using the complementary attribute information attached to the nodes to guide a meta-path based random walk that samples path instances. We then devise a learning model named vIdentifier to classify whether a given pair of traffickers are the same individual. Comprehensive experiments on data collected from four different darknet markets validate the effectiveness of uStyle-uID, which integrates the proposed method, for drug trafficker identification in comparison with alternative approaches.
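
The idea of letting node attributes guide a meta-path based random walk can be illustrated with a minimal sketch. This is not the Vendor2Vec algorithm itself: the abstract does not specify the transition rule, so the weighting used here (one plus the cosine similarity between attribute vectors) is an assumption, as are the function name `guided_walk`, the toy graph, and the two-dimensional "style" vectors.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def guided_walk(start, neighbors, node_type, attrs, meta_path, length, rng):
    """Sample one path instance that follows meta_path cyclically.

    Assumed rule (not taken from the paper): among neighbors of the type the
    meta-path asks for, a candidate is drawn with weight 1 + cosine similarity
    to the most recently visited attributed node.
    """
    walk = [start]
    anchor = attrs.get(start)  # attributes of the last attributed node seen
    period = len(meta_path) - 1  # e.g. vendor-drug-vendor repeats every 2 steps
    for step in range(1, length):
        wanted = meta_path[step % period]
        candidates = [n for n in neighbors[walk[-1]] if node_type[n] == wanted]
        if not candidates:
            break
        weights = []
        for c in candidates:
            if anchor is not None and c in attrs:
                # Floor keeps the total weight positive for random.choices.
                weights.append(max(1.0 + cosine(anchor, attrs[c]), 1e-9))
            else:
                weights.append(1.0)
        nxt = rng.choices(candidates, weights=weights, k=1)[0]
        walk.append(nxt)
        if nxt in attrs:
            anchor = attrs[nxt]
    return walk


# Hypothetical toy network: three vendor accounts and two drug listings.
node_type = {"v1": "vendor", "v2": "vendor", "v3": "vendor",
             "d1": "drug", "d2": "drug"}
neighbors = {"v1": ["d1"], "v2": ["d1", "d2"], "v3": ["d1", "d2"],
             "d1": ["v1", "v2", "v3"], "d2": ["v2", "v3"]}
# Made-up style vectors; only vendor nodes carry attributes in this sketch.
attrs = {"v1": [0.9, 0.1], "v2": [0.8, 0.2], "v3": [0.1, 0.9]}

meta_path = ["vendor", "drug", "vendor"]
walk = guided_walk("v1", neighbors, node_type, attrs, meta_path,
                   length=7, rng=random.Random(0))
print(walk)
```

Walks sampled this way could then be fed to a skip-gram style model to obtain node embeddings, in the spirit of metapath2vec; that step is omitted here.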
