Comparative Evaluation of Techniques for Detection of Phishing URLs

Phishing is one of the most common cyberattacks today. It combines social engineering and online identity theft to trick Internet users into submitting their personal information to cybercriminals. Reports show a continuous increase in the number and sophistication of these attacks worldwide. A phishing Uniform Resource Locator (URL) is a malicious web address crafted to resemble a legitimate URL in order to deceive unsuspecting users. Many algorithms have been proposed to detect phishing URLs, that is, to classify a given URL as benign or phishing. Most of these detection algorithms are based on machine learning and rely on inherent characteristics of the URLs themselves. In this study, we examine the performance of a number of such techniques. The algorithms were tested on three publicly available datasets. Overall, our results show Random Forest to be the best-performing algorithm, achieving an accuracy of 97.3%.
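To make "inherent characteristics of the URLs" concrete, the following is a minimal sketch of lexical feature extraction in Python. The function name `extract_url_features` and the particular features are assumptions chosen for illustration; the abstract does not list the features or datasets used in the study, so this should not be read as the evaluated method. A feature dictionary of this kind could then be passed to a classifier such as Random Forest.

```python
import re
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Compute a few lexical features from a URL string.

    The feature set is hypothetical and illustrative only; it is not
    taken from the study summarized above.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        # Total number of characters in the URL.
        "url_length": len(url),
        # Number of dots in the host name (a rough subdomain count).
        "num_dots_in_host": host.count("."),
        # Number of hyphens in the host name.
        "num_hyphens_in_host": host.count("-"),
        # Whether the URL contains an '@' character anywhere.
        "has_at_symbol": "@" in url,
        # Whether the host looks like a dotted-quad IPv4 address.
        "has_ip_host": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        # Whether the scheme is HTTPS.
        "uses_https": parsed.scheme == "https",
        # Number of non-empty path segments.
        "path_depth": len([p for p in parsed.path.split("/") if p]),
    }


if __name__ == "__main__":
    for example in ("https://www.example.com/", "http://192.168.0.1/login/verify"):
        print(example, extract_url_features(example))
```

Only the Python standard library is used here, so the sketch runs without a machine learning dependency; training and evaluating a classifier on such features is a separate step.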
