Detecting phishing web sites: A heuristic URL-based approach

With the growth of the Internet, e-commerce plays a vital role in society. As a result, phishing, the act of stealing personal user data used in e-commerce transactions, has become an urgent problem. Many techniques have been proposed to protect online users, e.g., blacklists and page ranking. However, the number of victims continues to increase because these protection techniques are inefficient, largely because phishers make the URLs of phishing sites look similar to those of the original sites. In this paper, we propose a new approach to detecting phishing sites using features of the URL. In particular, we derive different components from the URL and compute a metric for each component. The page ranking is then combined with the computed metrics to decide whether a website is a phishing website. The proposed technique was evaluated on a dataset containing 9,661 phishing websites and 1,000 legitimate websites. The results show that our technique can detect over 97% of phishing websites.
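The abstract does not specify the components, metrics, or combination rule, so the following Python sketch is only an illustration of the general idea: split a URL into components, score each one, and combine the scores with a page-rank value. The brand list, the string-similarity metric, the weights, and the names `split_components`, `component_metric`, and `phishing_score` are all assumptions made for this example and are not taken from the paper.

```python
from difflib import SequenceMatcher
from urllib.parse import urlsplit

# Hypothetical list of legitimate brand names (assumption, not from the paper).
KNOWN_BRANDS = ["paypal", "ebay", "amazon"]


def split_components(url):
    """Split a URL into primary domain, subdomain, and path (simplified)."""
    parts = urlsplit(url)
    labels = (parts.hostname or "").split(".")
    # Naive assumption: the registered domain is the second-to-last label.
    primary = labels[-2] if len(labels) >= 2 else labels[0]
    sub = ".".join(labels[:-2])
    return {"primary": primary, "sub": sub, "path": parts.path.strip("/")}


def component_metric(text):
    """Return a score in [0, 1]; lower means the text imitates a known brand.

    An empty component or an exact brand match scores 1.0. Otherwise the
    score drops as the text gets closer to (but not equal to) a brand name.
    """
    if not text or text in KNOWN_BRANDS:
        return 1.0
    closest = max(SequenceMatcher(None, text, b).ratio() for b in KNOWN_BRANDS)
    return 1.0 - closest


def phishing_score(url, page_rank, weights=(0.4, 0.2, 0.1, 0.3)):
    """Combine component metrics with a page-rank value into one score.

    page_rank is assumed to be normalised to [0, 1] and supplied by the
    caller; the weights are arbitrary illustrative values.
    Higher return values mean the URL looks more suspicious.
    """
    c = split_components(url)
    values = [
        component_metric(c["primary"]),
        component_metric(c["sub"]),
        component_metric(c["path"]),
        page_rank,
    ]
    legitimacy = sum(w * v for w, v in zip(weights, values))
    return 1.0 - legitimacy
```

Under these assumptions, a look-alike host such as `paypa1.com` with a low page rank receives a higher score than `www.paypal.com` with a high page rank. A real system would need a public-suffix-aware domain parser and empirically tuned metrics and thresholds.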