The Long "Taile" of Typosquatting Domain Names

Typosquatting is a speculative behavior that leverages Internet naming and governance practices to extract profit from users' misspellings and typing errors. Simple and inexpensive domain registration motivates speculators to register domain names in bulk to profit from display advertisements, to redirect traffic to third-party pages, to deploy phishing sites, or to serve malware. While previous research has focused on typosquatting domains that target popular websites, speculators also appear to be typosquatting on the "long tail" of the popularity distribution: millions of registered domain names appear to be potential typos of other site names, and only 6.8% target the 10,000 most popular .com domains. Investigating the entire distribution can give a more complete understanding of the typosquatting phenomenon. In this paper, we perform a comprehensive study of typosquatting domain registrations within the .com TLD. Our methodology significantly improves upon existing solutions for identifying typosquatting domains and their monetization strategies, especially for less popular targets. We find that about half of the possible typo domains identified by lexical analysis are true typo domains. From our zone file analysis, we estimate that 20% of all .com domain registrations are true typo domains and that their number is increasing with the expansion of the .com domain space. This large number of typo registrations motivates us to review intervention attempts and to implement efficient user-side mitigation tools that diminish the financial benefit of typosquatting to miscreants.
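To make the notion of "possible typo domains identified by lexical analysis" concrete, the following is a minimal Python sketch. The abstract does not state which lexical criterion is used; the sketch assumes Damerau-Levenshtein distance one (a single insertion, deletion, substitution, or adjacent transposition), a common choice in the typosquatting literature. The function names `dl1_variants` and `candidate_typos`, the character set, and the in-memory set of registered names are all hypothetical and stand in for a real zone file.

```python
import string

# Characters allowed in a hostname label (letters, digits, hyphen).
ALPHABET = string.ascii_lowercase + string.digits + "-"


def dl1_variants(label: str) -> set[str]:
    """Return every string at Damerau-Levenshtein distance one from label."""
    variants = set()
    for i in range(len(label) + 1):
        head, tail = label[:i], label[i:]
        # Insertion of one character at position i.
        for c in ALPHABET:
            variants.add(head + c + tail)
        if tail:
            # Deletion of the character at position i.
            variants.add(head + tail[1:])
            # Substitution of the character at position i.
            for c in ALPHABET:
                variants.add(head + c + tail[1:])
        if len(tail) > 1:
            # Transposition of the two adjacent characters at i and i + 1.
            variants.add(head + tail[1] + tail[0] + tail[2:])
    variants.discard(label)  # the original name is not a typo of itself
    # Drop empty labels and labels that start or end with a hyphen.
    return {
        v for v in variants
        if v and not v.startswith("-") and not v.endswith("-")
    }


def candidate_typos(target: str, registered: set[str]) -> set[str]:
    """Return registered names that are lexically one edit away from target.

    Assumption: target has the form "label.tld" and only the label is varied.
    """
    label, _, tld = target.partition(".")
    return {f"{v}.{tld}" for v in dl1_variants(label)} & registered


if __name__ == "__main__":
    # Hypothetical stand-in for the set of names found in a zone file.
    zone = {"exmaple.com", "exampl.com", "ezample.com", "unrelated.com"}
    print(sorted(candidate_typos("example.com", zone)))
```

A lexical match of this kind only marks a name as a possible typo. As the abstract notes, only about half of such candidates turn out to be true typo domains, so a further classification step is needed beyond this sketch.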
