Machine Learning-Based Phishing Attack Detection

This paper explores machine learning techniques and evaluates their performance when trained on datasets of features that distinguish phishing websites from legitimate ones. The ability to tell such sites apart is vital for modern internet use. As more of our resources move online, a single vulnerability or leak of sensitive information can compromise an entire connected network. Our objective is to identify the most effective technique for detecting one of the most common cyberattacks, enabling faster identification and blacklisting of phishing sites and thus a safer, more secure browsing experience for everyone. To this end, we describe each technique in detail and use several evaluation methods to present their performance visually. After comparing the techniques against one another, we conclude that the Random Forest classifier performs best for phishing website detection.
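As a rough illustration of how classifiers might be compared on a labelled phishing dataset, the sketch below computes accuracy, precision, recall, and F1 from true and predicted labels. The labels, the names `model_a` and `model_b`, and the choice of F1 as the ranking criterion are assumptions made for illustration; they are not the paper's data, models, or evaluation protocol.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = phishing, 0 = legitimate.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]

# Hypothetical predictions from two placeholder classifiers.
predictions = {
    "model_a": [1, 1, 0, 0, 0, 1, 1, 0],
    "model_b": [1, 1, 1, 0, 0, 0, 0, 0],
}

results = {name: binary_metrics(y_true, y_pred)
           for name, y_pred in predictions.items()}

# Rank the candidates by F1 (an assumed criterion for this sketch).
best = max(results, key=lambda name: results[name]["f1"])

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
print("best by F1:", best)
```

In phishing detection, recall on the phishing class is often weighted heavily, since a missed phishing site tends to cost more than a false alarm; the ranking criterion can be swapped accordingly.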
