Heuristic nonlinear regression strategy for detecting phishing websites

In this paper, we propose a phishing website detection method that combines a meta-heuristic-based nonlinear regression algorithm with a feature selection approach. To validate the proposed method, we used a dataset of 11,055 phishing and legitimate webpages and selected 20 features to be extracted from these websites. Two feature selection methods, decision tree and wrapper, were used to select the best feature subset; the wrapper method achieved a detection accuracy as high as 96.32%. After feature selection, two algorithms were implemented to predict and detect fraudulent websites: a nonlinear regression technique based on harmony search (HS), and a support vector machine (SVM). The nonlinear regression model was used to classify the websites, with its parameters obtained by the HS algorithm. The proposed HS algorithm uses a dynamic pitch adjustment rate in generating new harmonies. The HS-based nonlinear regression achieved accuracy rates of 94.13% and 92.80% on the training and test sets, respectively. The study finds that the HS-based nonlinear regression performs better than SVM.
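As a rough illustration of the idea summarized above, the following sketch fits the parameters of a nonlinear regression model with harmony search and classifies by thresholding the output. It is not the authors' implementation. The model form (tanh of a weighted sum), the linearly increasing pitch adjustment rate schedule, all hyperparameter values, and the synthetic toy data are assumptions made for the example; the abstract does not specify any of them.

```python
import math
import random


def predict(params, x):
    """Hypothetical nonlinear model: tanh(w . x + b), params = [w_1..w_d, b]."""
    *w, b = params
    return math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)


def mse(params, X, y):
    """Mean squared error between model outputs and labels in {-1, +1}."""
    return sum((predict(params, xi) - yi) ** 2 for xi, yi in zip(X, y)) / len(X)


def harmony_search(X, y, dim, hms=20, hmcr=0.9, par_min=0.1, par_max=0.9,
                   bandwidth=0.2, iters=2000, bound=5.0, seed=0):
    """Minimize mse over `dim` parameters using a basic harmony search.

    All default values are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    # Harmony memory: hms random candidate parameter vectors and their costs.
    memory = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(hms)]
    costs = [mse(h, X, y) for h in memory]
    for t in range(iters):
        # Dynamic pitch adjustment rate: assumed to grow linearly per iteration.
        par = par_min + (par_max - par_min) * t / max(iters - 1, 1)
        new = []
        for j in range(dim):
            if rng.random() < hmcr:
                # Memory consideration: reuse a stored value for this dimension.
                value = memory[rng.randrange(hms)][j]
                if rng.random() < par:
                    # Pitch adjustment: small random shift of the reused value.
                    value += rng.uniform(-bandwidth, bandwidth)
            else:
                # Random selection from the whole allowed range.
                value = rng.uniform(-bound, bound)
            new.append(min(bound, max(-bound, value)))
        cost = mse(new, X, y)
        worst = max(range(hms), key=costs.__getitem__)
        if cost < costs[worst]:
            # Replace the worst harmony when the new one is better.
            memory[worst], costs[worst] = new, cost
    best = min(range(hms), key=costs.__getitem__)
    return memory[best], costs[best]


def classify(params, x):
    """Label +1 or -1 by thresholding the regression output at zero."""
    return 1 if predict(params, x) >= 0 else -1


# Synthetic toy data (not the paper's dataset): two ternary features.
data_rng = random.Random(1)
X = [[data_rng.choice([-1, 0, 1]) for _ in range(2)] for _ in range(60)]
y = [1 if x[0] + x[1] >= 1 else -1 for x in X]

params, cost = harmony_search(X, y, dim=3)
accuracy = sum(classify(params, x) == t for x, t in zip(X, y)) / len(X)
print(f"training MSE={cost:.4f}, accuracy={accuracy:.2%}")
```

In this sketch the harmony memory only ever replaces its worst member with a better candidate, so the best stored cost never increases. A low pitch adjustment rate early on favors reuse of stored values, and a higher rate later favors local refinement; other schedules are equally plausible.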
