A proactive discovery and filtering solution on phishing websites

Phishing websites are becoming a major threat to information security in social networks. These attacks not only erode users' trust but also harm the interests of the third parties that develop the platforms. To address the time lag inherent in passive phishing detection, this paper proposes a blacklist-based solution that discovers phishing websites proactively: the anomalies in the URLs and WHOIS information of blacklisted sites are analyzed, and heuristic rules for generating suspicious URLs are derived from them. To filter noise sites out of the suspicious set, a website filtering solution based on the layout of webpage images is presented. We first propose a Ray Scan Method that quickly generates the location features of webpage images, and then a method for calculating webpage layout similarity, which is compared against a preset threshold to decide whether a site is filtered out. The experimental results show that the solution detects some phishing websites before they spread widely, and that the webpage filtering method achieves both a high filtration ratio and a high phishing-website retention ratio.
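
The two stages described above can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not specify the heuristic rules or the Ray Scan Method, so the sketch assumes one hypothetical rule (varying digit runs in a blacklisted host name) and substitutes a plain bounding-box overlap score for the layout similarity. The names `candidate_hosts`, `layout_similarity`, `keep_candidate`, and the threshold value 0.5 are all assumptions made for illustration.

```python
import re
from typing import List, Tuple

# Hypothetical representation: (x, y, width, height) of one image on a page.
Box = Tuple[int, int, int, int]


def candidate_hosts(blacklisted_host: str, max_step: int = 2) -> List[str]:
    """Generate suspicious hosts by varying each digit run in a blacklisted host.

    This is an assumed example rule, not one taken from the paper.
    """
    candidates = []
    for match in re.finditer(r"\d+", blacklisted_host):
        value = int(match.group())
        for delta in range(-max_step, max_step + 1):
            new_value = value + delta
            if delta == 0 or new_value < 0:
                continue
            candidates.append(
                blacklisted_host[: match.start()]
                + str(new_value)
                + blacklisted_host[match.end():]
            )
    return candidates


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    overlap_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    overlap_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = overlap_w * overlap_h
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0


def layout_similarity(page: List[Box], target: List[Box]) -> float:
    """Stand-in score: mean best-match overlap of target images on the page."""
    if not page or not target:
        return 0.0
    return sum(max(box_iou(t, p) for p in page) for t in target) / len(target)


def keep_candidate(page: List[Box], target: List[Box], threshold: float = 0.5) -> bool:
    """Retain a candidate site only if its layout resembles the target's."""
    return layout_similarity(page, target) >= threshold
```

In this sketch, a candidate site whose image layout scores below the threshold is treated as noise and dropped, while one at or above the threshold stays in the suspicious set for further inspection.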