Catching Classical and Hijack-Based Phishing Attacks

Phishing is a social engineering strategy that cyber criminals use to obtain confidential information from Internet users. It continues to cost Internet users time and money each year, in addition to lost productivity. The trends and patterns in such attacks keep changing over time, so detection algorithms need to be robust and adaptive. Although many phishing attacks work by luring Internet users to a web site designed to trick them into revealing sensitive information, some recent phishing attacks work by either installing malware on a computer or hijacking a legitimate web site. In this paper, we present effective and comprehensive classifiers for both kinds of attacks, classical and hijack-based. To the best of our knowledge, our work is the first to consider hijack-based phishing attacks. Our techniques are also effective at zero-hour phishing web site detection. We focus on the fundamental characteristics of phishing web sites and decompose the classification task for a phishing web site into a URL classifier, a content-based classifier, and ways of combining the two. Both the URL classifier and the content-based classifier introduce new features and techniques. We present results of these classifiers and combination schemes on datasets extracted from several sources. We show that: (i) our URL classifier is highly accurate, (ii) our content-based classifier achieves good performance considering the difficulty of the problem and the small size of our white list, and (iii) one of our combination methods achieves superior detection of phishing web sites (over 99.97%) with a reasonable false positive rate of about 3.5%, and another achieves a false positive rate of just 0.22% with a true positive rate of more than 83%. Moreover, our content-based classifier does not need any periodic retraining. Our methods are also language independent.
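
As a rough illustration of the decomposition described above, the following Python sketch pairs a toy URL check with a toy content check and combines their votes in two ways. Everything in it is an assumption made for illustration: the function names, the URL features (IP-address host, many subdomains, an "@" in the URL), the password-form content test, and the OR/AND combination rules are hypothetical stand-ins, not the features or combination schemes evaluated in the paper. The sketch only shows why one combination can favor detection while another favors a low false positive rate.

```python
from urllib.parse import urlparse


def url_flags_phish(url: str) -> bool:
    """Toy URL check using hypothetical features (not the paper's)."""
    host = urlparse(url).hostname or ""
    has_ip_host = host.replace(".", "").isdigit()   # e.g. 192.168.0.1
    many_subdomains = host.count(".") >= 4          # long dotted host name
    has_at_sign = "@" in url                        # user-info trick in the URL
    return has_ip_host or many_subdomains or has_at_sign


def content_flags_phish(html: str, host: str, white_list: set) -> bool:
    """Toy content check: a password field on a host outside the white list."""
    asks_password = 'type="password"' in html.lower()
    return asks_password and host not in white_list


def combine_or(url_vote: bool, content_vote: bool) -> bool:
    """Flag if either classifier fires: more detections, more false alarms."""
    return url_vote or content_vote


def combine_and(url_vote: bool, content_vote: bool) -> bool:
    """Flag only if both classifiers fire: fewer false alarms, fewer detections."""
    return url_vote and content_vote


if __name__ == "__main__":
    white_list = {"example.com"}                    # hypothetical white list
    url = "http://192.168.0.1/login"
    html = '<form><input type="password" name="pw"></form>'
    host = urlparse(url).hostname or ""
    u = url_flags_phish(url)
    c = content_flags_phish(html, host, white_list)
    print("OR combination:", combine_or(u, c))
    print("AND combination:", combine_and(u, c))
```

In this sketch the OR rule corresponds loosely to a high-detection setting and the AND rule to a low-false-positive setting; real combination schemes could equally weight or threshold classifier scores instead of using Boolean votes.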
