A PageRank based detection technique for phishing web sites

Phishing is an attempt to acquire a user's information without the user's knowledge by tricking the user with a website or email that imitates a legitimate one. Phishing is a social cyber threat that causes severe economic loss to users, and because of phishing attacks the number of online transaction users is declining. This paper aims to design and implement a new technique to detect phishing web sites using Google's PageRank. Google assigns a PageRank value to each site on the web. This work uses the PageRank value and other features to distinguish phishing sites from legitimate sites. We collected a dataset of 100 phishing sites and 100 legitimate sites. Using this PageRank-based technique, 98% of the sites are correctly classified, with a false positive rate of 0.02 and a false negative rate of 0.02.
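The reported figures can be checked against one another with a short calculation. The sketch below is illustrative only: the function name `rates` and the per-class counts are assumptions, inferred by applying the stated 0.02 rates to the 100 phishing and 100 legitimate sites, and it assumes the rates were measured over that full dataset with phishing as the positive class.

```python
def rates(tp, fn, tn, fp):
    """Return (accuracy, false_positive_rate, false_negative_rate).

    Phishing is treated as the positive class:
      tp = phishing sites flagged as phishing
      fn = phishing sites missed
      tn = legitimate sites passed as legitimate
      fp = legitimate sites wrongly flagged as phishing
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    false_positive_rate = fp / (fp + tn)
    false_negative_rate = fn / (fn + tp)
    return accuracy, false_positive_rate, false_negative_rate


# Hypothetical counts (not stated in the abstract): a 0.02 rate on
# 100 sites per class corresponds to 2 errors in each class.
accuracy, fpr, fnr = rates(tp=98, fn=2, tn=98, fp=2)
print(f"accuracy={accuracy:.2f} fpr={fpr:.2f} fnr={fnr:.2f}")
```

Under these assumed counts, 196 of 200 sites are classified correctly, which agrees with the stated 98% accuracy.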
