Phishing through homographs: Letters that look alike but lead you astray in some browsers

When is an “a” not necessarily the “a” you think it is? When a browser shows it as part of the URL in the location or smart-search field. Because non-Roman characters came late to domain names, the backwards-compatible method of representing them aids phishing.

Unicode allows the representation of nearly all the glyphs (characters, symbols, ideograms, script elements, and more) that form the basis of language and other written subjects, like math and games, in use around the world. The Unicode Consortium started its work decades ago, but it’s only in the last few years that Unicode has permeated operating systems, browsers, and apps to the point where you can rely on it working almost everywhere.

But the Domain Name System (DNS), which operating systems use to turn human-readable location and resource names into the numeric and other data needed to make a connection, predates Unicode. And because of its ubiquity, making any change could break compatibility for hundreds of millions of people and devices, maybe more. This is why some sensible improvements, like having a cryptographic component to a domain name that prevents its being spoofed by a party that doesn’t own the domain, have still not been rolled out.

Domains were encoded in a very tiny subset of all possible glyphs: the 26 Roman letters (capitalization aside) plus the digits 0 to 9 and the hyphen. (The period or “dot” is used to separate elements of a domain name.) This rankled people who live or write or do commerce in scripts other than the Roman alphabet as used for English. Even folks who just needed an ñ or an é were left out. Since that’s the vast majority of people on the planet, something had to give.
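To make the restriction concrete, here is a minimal Python sketch of a check for that letters-digits-hyphen set. The function name `is_ldh_domain` is made up for illustration. The 63-character limit and the ban on a leading or trailing hyphen come from the general hostname rules, not from the description above.

```python
import re

# One DNS label: letters, digits, and hyphen only.
# Assumption beyond the text above: 1 to 63 characters, and no hyphen
# at the start or end, as the usual hostname rules require.
LDH_LABEL = re.compile(r"(?!-)[a-z0-9-]{1,63}(?<!-)")


def is_ldh_domain(domain: str) -> bool:
    """Return True if every dot-separated label uses only the classic set."""
    labels = domain.lower().split(".")
    return all(LDH_LABEL.fullmatch(label) is not None for label in labels)


print(is_ldh_domain("example.com"))  # True
print(is_ldh_domain("mañana.com"))   # False: ñ is outside the allowed set
```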

Hide one script in another

In 2003, the patch job was “punycode,” a funny name for using ASCII characters to represent glyphs outside of the DNS-allowed set. This would let me register a domain name written in a non-Roman script, which gets converted to an ASCII-only form for use in DNS. The codes are a compressed way to represent Unicode characters, which a browser can then interpret and display.
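As a rough illustration of that conversion, Python’s standard library includes an `idna` codec that applies punycode to each part of a domain and adds the `xn--` prefix that marks an encoded label. The helper names `to_ascii` and `to_unicode` are made up for this sketch, and the German word “bücher” is just a convenient sample, not a domain discussed in this article.

```python
def to_ascii(domain: str) -> str:
    """Convert a Unicode domain to the ASCII form that DNS actually stores."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Convert the ASCII form back to what a browser might display."""
    return domain.encode("ascii").decode("idna")


print(to_ascii("bücher.com"))           # xn--bcher-kva.com
print(to_unicode("xn--bcher-kva.com"))  # bücher.com
```

Plain ASCII names pass through unchanged, which is what keeps the scheme backwards compatible.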

But here’s the long-running problem: many scripts contain letters similar to those in other scripts. In fact, they may have exactly the same design in the font being used in a location/search bar; in this context, such a look-alike is called a homograph. A homograph usually means words spelled the same but having different meanings; here, it’s really words spelled differently (in different scripts) while having the same appearance.
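A small Python sketch shows why the two letters are different to a computer even when they look identical on screen. It compares the Latin “a” with the Cyrillic letter at code point U+0430, picked here as a well-known look-alike pair.

```python
import unicodedata

latin_a = "a"
cyrillic_a = "\u0430"  # renders like "a" in many fonts

for ch in (latin_a, cyrillic_a):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
# U+0061 LATIN SMALL LETTER A
# U+0430 CYRILLIC SMALL LETTER A

print(latin_a == cyrillic_a)  # False
```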

That has let phishers register domains that, when converted from punycode into Unicode and shown in a browser field, look like a well-known legitimate domain (or similar enough to it). These domains can also receive legitimate TLS certificates, so a savvy user looking for validation that it’s a “real” domain will spot the lock icon showing an HTTPS connection is in place.

Browsers have added limitations to punycode conversion over the years to reduce the opportunity for phishing, but the issue has reared its head again as researchers poked at the limits of how browsers handle cases like domains that use only Roman-identical letters from other scripts. One found that an all-Cyrillic look-alike of a well-known domain could be registered and would appear in Chrome and Firefox as rendered Unicode.
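The kind of check this calls for can be sketched in Python: decode a label, then see whether every character is a look-alike for a Roman letter. This is only an illustration under stated assumptions, not how Chrome or Firefox actually decide. The table `CYRILLIC_LOOKALIKES` is a deliberately tiny, hand-picked sample, the function names are made up, and the test label (Cyrillic letters resembling “cope”) is hypothetical.

```python
from typing import Optional

# Hypothetical, incomplete table: Cyrillic letters that resemble Latin ones.
CYRILLIC_LOOKALIKES = {
    "\u0430": "a", "\u0435": "e", "\u043e": "o", "\u0440": "p",
    "\u0441": "c", "\u0445": "x", "\u0443": "y",
}


def decode_label(label: str) -> str:
    """Turn an xn-- label back into Unicode; leave other labels alone."""
    if label.lower().startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label


def latin_skeleton(label: str) -> Optional[str]:
    """Return the Latin text a label imitates if every character is a
    known look-alike, otherwise None."""
    text = decode_label(label)
    if text and all(ch in CYRILLIC_LOOKALIKES for ch in text):
        return "".join(CYRILLIC_LOOKALIKES[ch] for ch in text)
    return None


# Four Cyrillic letters that render much like the Latin word "cope".
spoof = "\u0441\u043e\u0440\u0435".encode("idna").decode("ascii")
print(spoof.startswith("xn--"))  # True
print(latin_skeleton(spoof))     # cope
print(latin_skeleton("cope"))    # None: genuinely Latin
```

A label that returns a skeleton here is one a browser might reasonably show in its raw `xn--` form instead of as rendered Unicode.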