Impact of Current Phishing Strategies in Machine Learning Models for Phishing Detection

Phishing is one of the most widespread social engineering attacks. Machine learning approaches to phishing detection are more robust than blacklist-based ones, which depend on regular reports and updates. However, the datasets currently used to train supervised learning models have a drawback: for legitimate domains they contain only the landing page and omit the websites' login forms, even though login forms are the most common scenario in real phishing cases. As a result, the performance of machine learning-based models drops, especially when they are tested on login pages.
