A Case of Identity: Detection of Suspicious IDN Homograph Domains Using Active DNS Measurements

The ability to include Unicode characters in domain names, introduced with Internationalized Domain Names (IDNs), allows users to work with domains in their regional languages. However, visually similar Unicode characters, called homoglyphs, are a potential security threat, as visually similar domain names are often used in phishing attacks. Timely detection of suspicious homograph domain names is an important step towards preventing sophisticated attacks, since it can prevent unaware users from accessing homograph domains that carry malicious content. We therefore propose a structured approach to identify suspicious homograph domain names based not on use, but on characteristics of the domain name itself and its associated DNS records. To achieve this, we leverage the OpenINTEL active DNS measurement platform, which takes a daily snapshot of more than 65% of the DNS namespace. In this paper, we first extend the existing Unicode homoglyph tables (confusion tables). This allows us to detect on average 2.97 times as many homograph domains as with the existing tables. Our proactive detection of suspicious IDN homograph domains provides an early alert that would help both domain owners and security researchers in preventing IDN homograph abuse.
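To illustrate the general idea of matching domains against a confusion table, the following is a minimal Python sketch, not the method used in this paper. The `CONFUSABLES` mapping is a tiny hypothetical subset chosen for illustration, and the function names `to_unicode`, `skeleton`, and `is_suspicious_homograph` are assumptions introduced here. The sketch decodes Punycode labels with Python's built-in `idna` codec, maps each character to its ASCII look-alike, and flags a candidate whose mapped form equals that of a reference domain.

```python
# Tiny illustrative confusion table (hypothetical subset, NOT the paper's
# extended tables): maps a homoglyph to the ASCII letter it resembles.
CONFUSABLES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0456": "i",  # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
}


def to_unicode(domain: str) -> str:
    """Decode Punycode (xn--) labels of an ASCII domain name to Unicode."""
    return domain.encode("ascii").decode("idna")


def skeleton(domain: str) -> str:
    """Replace every known homoglyph with its ASCII look-alike."""
    return "".join(CONFUSABLES.get(ch, ch) for ch in domain.lower())


def is_suspicious_homograph(candidate: str, reference: str) -> bool:
    """True if candidate differs from reference but looks the same
    after applying the confusion table."""
    decoded = to_unicode(candidate)
    return decoded != reference and skeleton(decoded) == skeleton(reference)


if __name__ == "__main__":
    # Build the ASCII (Punycode) form of "apple.com" with a Cyrillic "a".
    ace = "\u0430pple.com".encode("idna").decode("ascii")
    print(ace, is_suspicious_homograph(ace, "apple.com"))
```

In this sketch, the quality of detection depends entirely on the coverage of the confusion table, which is why extending the table increases the number of homograph domains that can be found.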
