Jail-Phish: An improved search engine based phishing detection system

Abstract: Phishing is the theft of sensitive information (usernames, passwords, credit card details, social security numbers, etc.) through a fake webpage that imitates a trusted website. Recent techniques counter phishing attacks with a search engine based approach, which achieves promising detection accuracy. This approach has two limitations: it fails when the phishing page is hosted on a compromised server, and it yields a low true negative rate on newly registered or non-popular domains. In this paper, we propose an application named Jail-Phish, which improves the accuracy of search engine based techniques by detecting Phishing Sites Hosted on Compromised Servers (PSHCS) and by correctly identifying newly registered legitimate sites. Jail-Phish compares the suspicious site with the matched domain in the search results and calculates a similarity score between them. Pages of the same website share some degree of similarity in their logos, favicons, images, scripts, styles, and anchor links, whereas in PSHCS the phishing page is highly dissimilar from the other pages on the host. We therefore use the similarity score between the suspicious site and the matched domain as a parameter to detect PSHCS. Experimental results show that Jail-Phish achieves an accuracy of 98.61%, a true positive rate of 97.77%, and a false positive rate of less than 0.64%.
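
The similarity comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the score is computed, so everything here is an assumption made for illustration: the score is taken to be the Jaccard overlap between the sets of resources (images, favicons and styles via `link`, scripts, and anchor links) referenced by the suspicious page and by the matched domain's page. The names `ResourceCollector`, `extract_resources`, `similarity_score`, and `looks_like_pshcs`, and the threshold value, are hypothetical and are not taken from the paper.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ResourceCollector(HTMLParser):
    """Collects the resource URLs a page references (hypothetical helper)."""

    # Tag -> attribute that holds the resource URL.
    URL_ATTRS = {"img": "src", "script": "src", "link": "href", "a": "href"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = set()

    def handle_starttag(self, tag, attrs):
        attr_name = self.URL_ATTRS.get(tag)
        if attr_name is None:
            return
        value = dict(attrs).get(attr_name)
        if value:
            # Resolve relative URLs, then keep only host and path so that
            # scheme, query string, and fragment do not affect the match.
            parts = urlparse(urljoin(self.base_url, value))
            self.resources.add((parts.netloc.lower(), parts.path))


def extract_resources(html, base_url):
    """Return the set of (host, path) pairs referenced by an HTML page."""
    collector = ResourceCollector(base_url)
    collector.feed(html)
    return collector.resources


def similarity_score(suspicious, matched):
    """Jaccard overlap of two resource sets (assumed metric, not the paper's)."""
    if not suspicious and not matched:
        return 0.0
    return len(suspicious & matched) / len(suspicious | matched)


def looks_like_pshcs(score, threshold=0.2):
    """Flag a page whose overlap with the matched domain is below a threshold.

    The threshold of 0.2 is an arbitrary placeholder, not a value from the paper.
    """
    return score < threshold
```

Under these assumptions, a page that reuses its site's favicon, stylesheet, script, and logo scores high against the site's home page. A phishing page dropped into a subdirectory of a compromised host, pulling its logo from another domain, shares no resources with the home page and scores 0.0.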
