Phishing URL detection-based feature selection to classifiers

Phishing is an online fraud in which a malicious web page impersonates a legitimate one with the intention of stealing confidential information from the user. Phishing attacks continue to pose a serious risk to web users and a persistent threat to electronic commerce. Feature selection is the process of removing irrelevant features, thereby reducing the dimensionality of the feature set. This paper focuses on identifying the most important features that distinguish legitimate websites from phishing websites. In real-world settings, identifying phishing URLs with low computational time and high accuracy is very important, which is why feature selection is considered in this work. A comparative study is carried out on different data mining classifiers before and after feature selection, and their performance is evaluated in terms of accuracy and computation time. The results indicate that the proposed approach detects phishing websites with considerable accuracy.
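
To make the idea of feature selection concrete, the following is a minimal sketch of one common filter criterion, information gain, applied to a toy set of binary URL features. The abstract does not say which selection method the paper uses, so the criterion, the feature names (`has_ip_address`, `long_url`, `uses_https`), the six sample rows, and the 0.5 threshold are all illustrative assumptions rather than the authors' setup.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(column, labels):
    """Reduction in label entropy after splitting on one discrete feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(column):
        subset = [y for x, y in zip(column, labels) if x == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder


def rank_features(rows, labels, names):
    """Return (name, gain) pairs sorted from most to least informative."""
    gains = [
        (name, information_gain([row[i] for row in rows], labels))
        for i, name in enumerate(names)
    ]
    return sorted(gains, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: one row per URL, one binary value per feature.
FEATURES = ["has_ip_address", "long_url", "uses_https"]
ROWS = [
    (1, 1, 0),
    (1, 0, 1),
    (1, 1, 1),
    (0, 1, 0),
    (0, 0, 1),
    (0, 1, 1),
]
LABELS = ["phishing", "phishing", "phishing",
          "legitimate", "legitimate", "legitimate"]

ranking = rank_features(ROWS, LABELS, FEATURES)

# Keep only features whose gain exceeds an assumed threshold of 0.5 bits.
selected = [name for name, gain in ranking if gain > 0.5]

for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
print("selected:", selected)
```

In this toy data only `has_ip_address` separates the two classes, so it is the only feature kept. A classifier trained on the reduced feature set has fewer inputs to process, which is the source of the computation-time saving the abstract refers to; whether accuracy is preserved has to be checked empirically, as the comparative study does.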
