Phishing Detection with Image Retrieval Based on Improved Texton Correlation Descriptor

Anti-detection is emerging as a challenge for anti-phishing. This paper addresses the threat of anti-detection from the perspective of the threshold setting condition: a sufficient number of webpages is taken into account when the threshold is set, which complicates the threshold setting condition. Because human visual attention is easily drawn to the salient region of a webpage, an image retrieval method based on the texton correlation descriptor (TCD) is improved to retrieve enough webpages whose images are similar in the salient region. TCD has the advantage of recognizing the salient region of images, and it is improved in two steps: (1) a weighted Euclidean distance based on neighborhood location (NLW-Euclidean distance) and double cross windows are proposed and combined to solve the problems in TCD; (2) a space structure is introduced to map the image set to Euclidean space, so that the similarity relations among images can be used to complicate the threshold setting condition. Experimental results show that the proposed method improves the effectiveness of anti-phishing, makes the system more stable, and significantly reduces the possibility of the system being hacked and used as a mining system for blockchain.
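
To make the retrieval idea concrete, the following is a minimal sketch of a generic weighted Euclidean distance and a threshold-based lookup over image feature vectors. It is not the NLW-Euclidean distance itself: the abstract does not give the neighborhood-location weighting, so the `weights` argument, the function names `weighted_euclidean` and `retrieve_similar`, and the toy feature vectors are all assumptions made for illustration.

```python
import math


def weighted_euclidean(x, y, weights):
    """Generic weighted Euclidean distance: sqrt(sum_i w_i * (x_i - y_i)^2).

    The weights are a placeholder (assumption); the paper derives its
    weights from neighborhood location, which is not specified here.
    """
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y and weights must have the same length")
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))


def retrieve_similar(query, gallery, weights, threshold):
    """Return (name, distance) pairs whose distance to the query is
    at most `threshold`, sorted from nearest to farthest."""
    hits = []
    for name, features in gallery.items():
        d = weighted_euclidean(query, features, weights)
        if d <= threshold:
            hits.append((name, d))
    return sorted(hits, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical 2-D feature vectors standing in for webpage image descriptors.
    gallery = {"page_a": [3.0, 4.0], "page_b": [0.0, 1.0], "page_c": [10.0, 10.0]}
    print(retrieve_similar([0.0, 0.0], gallery, [1.0, 1.0], threshold=5.0))
```

In this sketch, every gallery image within the threshold is returned rather than only the nearest one, which loosely mirrors the abstract's goal of collecting enough similar webpages before a threshold is fixed.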