COMPARISON OF MACHINE LEARNING TECHNIQUES IN PHISHING WEBSITE CLASSIFICATION

Abstract: Phishing is one of the luring strategies that attackers use to abuse the personal details of unsuspecting users. A phishing website is a counterfeit website with a similar appearance to the genuine one but a different destination. Unsuspecting users submit their information believing that these websites belong to trusted financial institutions. New anti-phishing techniques emerge continuously, yet phishers respond with new strategies that break the existing anti-phishing mechanisms. Hence there is a need for an effective mechanism for predicting phishing websites. This paper compares different machine learning algorithms for the classification of phishing websites. Random Forest (RF), C4.5, REP Tree, Decision Stump, Hoeffding Tree, Rotation Forest and MLP were used to determine which method provides the best results in phishing website classification. All instances are categorized as 1 for “Legitimate”, 0 for “Suspicious” and -1 for “Phishy”. Results show that RF with REP Tree gives the best performance on this dataset for the classification of phishing websites.
