To Get Lost is to Learn the Way: An Analysis of Multi-Step Social Engineering Attacks on the Web

Web-based social engineering (SE) attacks manipulate users into performing specific actions, such as downloading malware or exposing personal information. To lure users effectively, some SE attacks, which we call multi-step SE attacks, consist of a sequence of web pages that starts from a landing page and requires a browser interaction at each page. Moreover, different browser interactions on the same web page often branch into multiple sequences that redirect users to different SE attacks. Existing systems typically analyze only landing pages or perform browser interactions tailored to a specific attack, and little effort has been made to follow such sequences of web pages to collect multi-step SE attacks. We propose StraySheep, a system that automatically crawls sequences of web pages and detects diverse multi-step SE attacks. We evaluate the effectiveness of StraySheep’s three modules (landing-page-collection, web-crawling, and SE-detection) in terms of the rate of collected landing pages that lead to SE attacks, the efficiency of web crawling in reaching more SE attacks, and the accuracy of attack detection. Our experimental results indicate that StraySheep can lead to 20% more SE attacks than Alexa top sites and search results for trend words, crawl five times more efficiently than a simple crawling module, and detect SE attacks with 95.5% accuracy. We demonstrate that StraySheep can collect various SE attacks, not limited to a specific attack. We also clarify the techniques attackers use to trick users and the browser interactions that redirect users to attacks.

key words: social engineering attacks, browser automation, web crawler
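The branching structure described in the abstract, where each browser interaction on a page may lead to a different next page, can be illustrated with a minimal sketch. This is not StraySheep's implementation: the toy site, the page names, the interaction labels, and the function `enumerate_sequences` are all hypothetical, and a plain dictionary stands in for a real browser. The sketch only shows how a breadth-first traversal could enumerate every page sequence reachable from one landing page.

```python
from collections import deque

# Hypothetical toy site (an assumption for illustration only):
# each page maps a browser interaction to the page it leads to.
TOY_SITE = {
    "landing": {"click_download_button": "fake_player", "click_ad": "survey"},
    "fake_player": {"click_install": "malware_download"},
    "survey": {"submit_form": "prize_scam"},
    "malware_download": {},
    "prize_scam": {},
}


def enumerate_sequences(site, landing_page, max_depth=3):
    """Breadth-first enumeration of interaction sequences from a landing page.

    Each returned sequence is a list of (interaction, resulting_page) pairs.
    A sequence ends when a page offers no unvisited target or when
    max_depth interactions have been performed.
    """
    sequences = []
    queue = deque([(landing_page, [])])
    while queue:
        page, steps = queue.popleft()
        # Pages already seen on this particular path, to avoid loops.
        visited = {landing_page} | {target for _, target in steps}
        branches = [
            (action, target)
            for action, target in site.get(page, {}).items()
            if target not in visited
        ]
        if not branches or len(steps) >= max_depth:
            sequences.append(steps)
            continue
        # Every interaction on the page opens a separate branch.
        for action, target in branches:
            queue.append((target, steps + [(action, target)]))
    return sequences


if __name__ == "__main__":
    for sequence in enumerate_sequences(TOY_SITE, "landing"):
        print(" -> ".join(f"{action}:{page}" for action, page in sequence))
```

In this toy example the two interactions on the landing page produce two separate sequences, which mirrors the observation that analyzing only the landing page would miss the pages reached later in each branch.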
