Hybrid intelligent phishing website prediction using deep neural networks with genetic algorithm-based feature selection and weighting

In recent years, web phishing has become one of the most serious web security problems: phishers steal sensitive financial information from internet users in order to commit financial theft. Conventional phishing website detection methods are typically blacklist-based. However, these methods fail to detect many phishing websites, because new phishing websites are constantly developed and launched on the Web. This study proposes hybrid intelligent phishing website prediction approaches that combine deep neural networks (DNNs) with evolutionary algorithm-based feature selection and weighting. In the proposed approaches, a genetic algorithm (GA) heuristically identifies the most influential website features and their optimal weights in order to increase prediction accuracy. The website features selected and weighted by the GA are then used to train DNNs to predict phishing websites. The experimental results showed that the proposed approaches achieved significantly higher classification accuracy, sensitivity, specificity, and geometric mean in phishing website prediction than approaches proposed in other studies.
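The GA-based selection and weighting step can be illustrated with a minimal sketch. It is not the authors' implementation: the chromosome encoding (one mask bit and one weight per feature), the GA operators and rates, the toy data, and all function names are assumptions made for illustration. A nearest-centroid classifier stands in for the DNN so that the example runs with the standard library only, and fitness is the geometric mean of sensitivity and specificity under its common definition.

```python
import math
import random

def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity (label 1 = phishing)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(y_true)
    neg = len(y_true) - pos
    sens = tp / pos if pos else 0.0
    spec = tn / neg if neg else 0.0
    return math.sqrt(sens * spec)

def apply_chromosome(X, chrom):
    """Scale each feature by mask * weight; deselected features become 0."""
    return [[x * m * w for x, (m, w) in zip(row, chrom)] for row in X]

def centroid_predict(X_train, y_train, X_test):
    """Stand-in classifier (assumption); the paper trains a DNN at this point."""
    cents = {}
    for label in (0, 1):
        rows = [r for r, y in zip(X_train, y_train) if y == label]
        cents[label] = [sum(col) / len(rows) for col in zip(*rows)]
    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [min((0, 1), key=lambda lab: dist(r, cents[lab])) for r in X_test]

def fitness(chrom, X, y):
    """Score a chromosome; a real setup would use a held-out validation split."""
    Xw = apply_chromosome(X, chrom)
    return g_mean(y, centroid_predict(Xw, y, Xw))

def mutate(chrom, rng, rate=0.1):
    """Flip mask bits and perturb weights, keeping weights in [0, 1]."""
    out = []
    for m, w in chrom:
        if rng.random() < rate:
            m = 1 - m
        if rng.random() < rate:
            w = min(1.0, max(0.0, w + rng.gauss(0, 0.2)))
        out.append((m, w))
    return out

def crossover(a, b, rng):
    """Uniform crossover: each gene comes from either parent."""
    return [ga if rng.random() < 0.5 else gb for ga, gb in zip(a, b)]

def run_ga(X, y, pop_size=20, generations=15, seed=0):
    """Return (best_chromosome, best_fitness) after a fixed number of generations."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[(rng.randint(0, 1), rng.random()) for _ in range(n)]
           for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(((fitness(c, X, y), c) for c in pop),
                        key=lambda t: t[0], reverse=True)
        elite = [c for _, c in scored[:2]]  # keep the two best unchanged
        children = []
        while len(children) < pop_size - len(elite):
            # Tournament selection: best of three random candidates
            p1 = max(rng.sample(scored, 3), key=lambda t: t[0])[1]
            p2 = max(rng.sample(scored, 3), key=lambda t: t[0])[1]
            children.append(mutate(crossover(p1, p2, rng), rng))
        pop = elite + children
    best_score, best = max(((fitness(c, X, y), c) for c in pop),
                           key=lambda t: t[0])
    return best, best_score

def make_toy_data(seed=1, n_rows=40):
    """Hypothetical data: feature 0 tracks the label, features 1-3 are noise."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n_rows)]
    X = [[lab + rng.gauss(0, 0.1)] + [rng.gauss(0, 1) for _ in range(3)]
         for lab in y]
    return X, y

if __name__ == "__main__":
    X, y = make_toy_data()
    best, score = run_ga(X, y)
    print("selected feature indices:", [i for i, (m, _) in enumerate(best) if m])
    print("fitness (G-mean):", round(score, 3))
```

In a fuller pipeline, the features selected and weighted by the best chromosome would be passed to a DNN for training, and the fitness function would evaluate that network on validation data rather than a centroid classifier on the training set.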
