A Framework of New Hybrid Features for Intelligent Detection of Zero Hour Phishing Websites

Existing machine learning based approaches for detecting zero hour phishing websites have moderate accuracy and false alarm rates and rely heavily on limited types of features. Phishers are constantly learning their features and use sophisticated tools to adopt the features in phishing websites to evade detections. Therefore, there is a need for continuous discovery of new, robust and more diverse types of prediction features to improve resilience against detection evasions. This paper proposes a framework for predicting zero hour phishing websites by introducing new hybrid features with high prediction performances. Prediction performance of the features was investigated using eight machine learning algorithms in which Random Forest algorithm performed the best with accuracy and false negative rates of 98.45% and 0.73% respectively. It was found that domain registration information and webpage reputation types of features were strong predictors when compared to other feature types. On individual features, webpage reputation features were highly ranked in terms of feature importance weights. The prediction runtime per webpage measured at 7.63 s suggest that our approach has a potential for real time applications. Our framework is able to detect phishing websites hosted in either compromised or dedicated phishing domains.

[1]  Ali Selamat,et al.  New hybrid features for phish website prediction , 2016 .

[2]  Fang Feng,et al.  The application of a novel neural network in the detection of phishing websites , 2018, J. Ambient Intell. Humaniz. Comput..

[3]  Felix C. Freiling,et al.  Measuring and Detecting Fast-Flux Service Networks , 2008, NDSS.

[4]  T. L. McCluskey,et al.  Predicting phishing websites based on self-structuring neural network , 2013, Neural Computing and Applications.

[5]  M. S. Vijaya,et al.  Efficient prediction of phishing websites using supervised learning algorithms , 2012 .

[6]  Ilango Krishnamurthi,et al.  A comprehensive and efficacious architecture for detecting phishing webpages , 2014, Comput. Secur..

[7]  Banu Diri,et al.  Machine learning based phishing detection from URLs , 2019, Expert Syst. Appl..

[8]  Alwyn Roshan Pais,et al.  Detection of phishing websites using an efficient feature-based machine learning framework , 2018, Neural Computing and Applications.

[9]  Ankit Kumar Jain,et al.  Towards detection of phishing websites on client-side using machine learning based approach , 2017, Telecommunication Systems.

[10]  Carolyn Penstein Rosé,et al.  CANTINA+: A Feature-Rich Machine Learning Framework for Detecting Phishing Web Sites , 2011, TSEC.

[11]  Lorrie Faith Cranor,et al.  An Empirical Analysis of Phishing Blacklists , 2009, CEAS 2009.

[12]  Xu Chen,et al.  A stacking model using URL and HTML features for phishing webpage detection , 2019, Future Gener. Comput. Syst..

[13]  Dharma P. Agrawal,et al.  Fighting against phishing attacks: state of the art and future challenges , 2016, Neural Computing and Applications.