Classifying suspicious content in Tor darknet through Semantic Attention Keypoint Filtering

Abstract: One of the tasks of Law Enforcement Agencies is to find evidence of criminal activity on the Darknet. However, manually visiting thousands of domains to locate visual information depicting illicit acts requires considerable time and human resources. To support this task, we explore the automatic classification of images uploaded to the Tor darknet. Unfortunately, because of environmental conditions, the foreground objects in such images do not always appear in isolation, free of background. To address this challenge in the digital investigation of Tor darknet visual content, we propose to classify only the relevant parts of an image by combining saliency maps, which select the regions carrying the most salient information, with Bag of Visual Words (BoVW). We introduce Semantic Attention Keypoint Filtering (SAKF), a strategy that filters out, at the pixel level, non-significant features that mostly do not belong to the object of interest or foreground. We assessed SAKF on seven publicly available datasets, obtaining accuracies 1.64 to 15.73 points higher than the baseline, BoVW with dense SIFT (Scale-Invariant Feature Transform) descriptors. We also compared SAKF against deep features extracted from two well-known Convolutional Neural Network (CNN) architectures, MobileNet and ResNet50. Experimental results show the effectiveness of the proposed approach and suggest that automatic image classification could support the daily investigations of Law Enforcement Agencies on the Tor darknet.
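The filtering idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the saliency algorithm, the threshold, the sampling step, or the vocabulary, so the toy saliency map, the value `threshold=0.5`, the grid step, and the `toy_visual_word` function (a stand-in for quantising dense SIFT descriptors against a learned vocabulary) are all assumptions made for illustration.

```python
from collections import Counter


def dense_grid(height, width, step):
    """Return (row, col) keypoint locations on a regular grid (dense sampling)."""
    return [(r, c) for r in range(0, height, step) for c in range(0, width, step)]


def filter_keypoints(keypoints, saliency, threshold):
    """Keep only keypoints whose pixel saliency reaches the threshold."""
    return [(r, c) for (r, c) in keypoints if saliency[r][c] >= threshold]


def toy_visual_word(r, c, vocabulary_size):
    """Hypothetical stand-in for mapping a local descriptor to a visual word."""
    return (r + c) % vocabulary_size


def bovw_histogram(word_ids, vocabulary_size):
    """Build an L1-normalised Bag of Visual Words histogram."""
    if not word_ids:
        return [0.0] * vocabulary_size
    counts = Counter(word_ids)
    return [counts[w] / len(word_ids) for w in range(vocabulary_size)]


# Toy 4x4 saliency map (assumed values): the salient object sits bottom-right.
saliency = [
    [0.1, 0.1, 0.2, 0.1],
    [0.1, 0.2, 0.1, 0.2],
    [0.1, 0.1, 0.9, 0.8],
    [0.2, 0.1, 0.7, 0.9],
]

keypoints = dense_grid(4, 4, step=1)
kept = filter_keypoints(keypoints, saliency, threshold=0.5)
words = [toy_visual_word(r, c, 3) for (r, c) in kept]
hist = bovw_histogram(words, 3)
print(len(keypoints), len(kept), hist)
```

In this toy case only the four keypoints inside the salient region contribute to the histogram, so background features no longer dilute the image representation passed to the classifier.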
