
Re: Thoughts and a possible solution on homograph attacks



At 00:48 09/03/2005, Nick FitzGerald wrote:
> Maybe it's better to attack this problem on the browser side and have a

Given IDN would seem to be here to stay, I'd say that will be the only
place we can attack it...

Personally, I think the best place to attack it is at the registry. Unfortunately, most registries are driven by commercial motives rather than by trying to be good citizens or to help Internet users in general, so it may be harder to change their behaviour than to change browser behaviour.

My proposal would be:

1) IDNs only allowed on ccTLDs (not gTLDs). After all, the whole point of IDNs is to have a domain name in the locally readable script to target people within your own region/nation/etc. gTLDs are for domains that target people globally. I see no purpose (other than vanity) in having an IDN in a gTLD.

2) IDNs should only be allowed to consist of a single character set - be that Latin, Western European, Japanese, Cyrillic etc.

3) A ccTLD should only allow IDNs in its local character set(s). So, you couldn't have a Cyrillic IDN on a .us domain, and you couldn't have a Greek IDN on a .ru domain.

4) A domain registry's DRS should take into account homograph/pseudograph attacks.

5) Possibly any domains containing only characters which are graphically equivalent to Latin characters should not be allowed, but I'm not sure of this one.

I think if IDNs followed these rules they would still keep most of their benefit, but also make it MUCH harder to have a homograph/pseudograph attack.
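To make rules (2) and (3) a bit more concrete, here is a rough sketch of the check a registry might run at registration time. The policy table and function names are made up for illustration, and taking the script from the first word of the Unicode character name is only an approximation (Python's standard library does not expose the Unicode Script property). It also assumes Latin stays allowed under every ccTLD.

```python
import unicodedata

# Hypothetical per-ccTLD policy table; a real registry would publish its own.
ALLOWED_SCRIPTS = {
    "ru": {"CYRILLIC", "LATIN"},
    "gr": {"GREEK", "LATIN"},
    "us": {"LATIN"},
}


def script_of(ch):
    """Rough script guess: the first word of the Unicode character name."""
    return unicodedata.name(ch).split()[0]


def scripts_in(label):
    """Scripts used by the letters of a label (digits and hyphens are ignored)."""
    return {script_of(ch) for ch in label if ch.isalpha()}


def label_allowed(label, cctld):
    """Apply rule (2), then rule (3), to one label under one ccTLD."""
    scripts = scripts_in(label)
    if len(scripts) > 1:
        return False  # rule (2): a label may use only one character set
    # rule (3): that character set must be local to the ccTLD
    return scripts <= ALLOWED_SCRIPTS.get(cctld, {"LATIN"})
```

With this table, an all-Cyrillic label passes under "ru" but is refused under "us", and a label mixing Latin and Cyrillic letters is refused everywhere.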

(1) is needed as it's still possible to make up certain words using characters from the Cyrillic and/or Greek (and possibly other) character sets that look like words from the Latin character set. Having this rule limits the scope of these possible attacks to people in Russia or Greece.

(2) is needed for (hopefully) obvious reasons

(3) is needed for the same reason as (1)

(4) Obvious

This leaves (AFAICS) the only possible attacks being those in Russia or Greece (and possibly other countries with similar character sets), against a very few domains. For instance, EBAY.RU could still be spelled with U+0415, U+0412, U+0410, U+0423 as well as with U+0045, U+0042, U+0041, U+0059.
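For anyone who wants to see the two spellings side by side, this small sketch builds both strings from the code points above and prints the official Unicode names. The variable names are mine; the "idna" codec is the IDNA 2003 implementation that ships with Python.

```python
import unicodedata

# The two spellings of "EBAY" quoted above, built from their code points.
cyrillic = "".join(chr(cp) for cp in (0x0415, 0x0412, 0x0410, 0x0423))
latin = "".join(chr(cp) for cp in (0x0045, 0x0042, 0x0041, 0x0059))

# Print each pair of look-alike letters with its Unicode name.
for cyr_ch, lat_ch in zip(cyrillic, latin):
    print(f"U+{ord(cyr_ch):04X} {unicodedata.name(cyr_ch):28} "
          f"U+{ord(lat_ch):04X} {unicodedata.name(lat_ch)}")

# The strings look alike on screen but do not compare equal.
print(cyrillic == latin)

# On the wire the Cyrillic name travels in its ASCII-compatible "xn--" form.
wire_name = (cyrillic + ".RU").encode("idna")
print(wire_name)
```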

But there is only this single combination of Cyrillic characters which could resemble the 'proper' EBAY.RU. If you don't have rule (2), then you could have lots of different combinations, e.g. U+0045, U+0412, U+0041, U+0059 etc. Each of the four letters can be either Latin or Cyrillic, giving 16 spellings in all, one of which is the genuine all-Latin name. So, having rule (2) means that eBay could register (or use DRS to regain) a single extra domain to protect themselves and their customers, instead of 15.
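The counting can be checked by brute force. This is just an illustrative sketch: each of the four positions takes either the Latin letter or its Cyrillic look-alike, and the script lists every resulting spelling.

```python
from itertools import product

LATIN = "EBAY"
CYRILLIC = "\u0415\u0412\u0410\u0423"  # the Cyrillic look-alikes, in the same order

# One (Latin, Cyrillic) pair per position; product() picks one from each pair.
spellings = ["".join(choice) for choice in product(*zip(LATIN, CYRILLIC))]

# Spellings that mix the two character sets, which rule (2) would forbid.
mixed = [s for s in spellings if s not in (LATIN, CYRILLIC)]

print(len(spellings))  # 16 spellings in total, including the genuine all-Latin one
print(len(mixed))      # 14 of them mix character sets
```

So without rule (2) there are 15 look-alikes of the genuine name; with it, only the all-Cyrillic one remains.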

Rule (5) would stop even this, but it could cause problems with legitimate Russian or Greek words which consist only of characters that look like Latin ones. This really needs behavioural investigation. For instance, if a Greek Internet user saw WWW.KAPPA.GR (or www.<another Greek word which can be written in either Greek or Latin letters>), would they type in 'KAPPA' in Greek letters, or in Latin letters? I suspect they'd type it in Latin, in which case rule (5) would be OK, but if they'd type it in Greek, then rule (5) would not be OK.
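A check for rule (5) might look something like the sketch below. The look-alike table is hypothetical and deliberately incomplete (capital letters only, Cyrillic and Greek only); a real registry would need a properly researched table.

```python
# Hypothetical, incomplete table: non-Latin capitals and the Latin letter they resemble.
LATIN_LOOKALIKES = {
    # Cyrillic
    "\u0410": "A", "\u0412": "B", "\u0415": "E", "\u041A": "K",
    "\u041C": "M", "\u041D": "H", "\u041E": "O", "\u0420": "P",
    "\u0421": "C", "\u0422": "T", "\u0423": "Y", "\u0425": "X",
    # Greek
    "\u0391": "A", "\u0392": "B", "\u0395": "E", "\u0396": "Z",
    "\u0397": "H", "\u0399": "I", "\u039A": "K", "\u039C": "M",
    "\u039D": "N", "\u039F": "O", "\u03A1": "P", "\u03A4": "T",
    "\u03A5": "Y", "\u03A7": "X",
}


def latin_skeleton(label):
    """Return the Latin look-alike spelling, or None if any character lacks one."""
    out = []
    for ch in label.upper():
        if ch not in LATIN_LOOKALIKES:
            return None  # at least one character is visibly non-Latin
        out.append(LATIN_LOOKALIKES[ch])
    return "".join(out)


def violates_rule_5(label):
    """True if every character of the label could pass for a Latin letter."""
    return latin_skeleton(label) is not None
```

With this table the Cyrillic spelling of EBAY is rejected, but so is the ordinary Russian word for Moscow (its capitals read as "MOCKBA" to a Latin-alphabet reader), which is exactly the kind of legitimate word that makes me unsure about this rule.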

I think limiting the protection to browsers has several problems:
- it takes away most of the usefulness of IDNs
- it makes browser development more complicated through no fault of the browser developers
- in countries where IDNs are widely used, users are still open to attack, as they'd *have to* enable IDNs to be able to use the Internet adequately
- it doesn't just affect browsers, it affects email clients, chat programs, ftp programs, news readers, etc etc etc

Even if my 'rules' were optional, well-behaved registries could adopt them and advertise that they do. Browsers could then warn about (or un-IDN-ise) IDNs from other registries, but show IDNs from the well-behaved registries in their proper character set.
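On the browser side, that policy could be as simple as the following sketch. The set of trusted TLDs and the function name are invented for illustration; how a browser would actually learn which registries follow the rules is left open. The "idna" codec is the one bundled with Python.

```python
# Hypothetical list of TLDs whose registries advertise that they follow the rules.
TRUSTED_IDN_TLDS = {"ru", "gr"}


def display_form(hostname):
    """Return what the address bar would show for a (possibly Unicode) hostname."""
    tld = hostname.rsplit(".", 1)[-1].lower()
    if tld in TRUSTED_IDN_TLDS:
        return hostname  # well-behaved registry: show the native character set
    # Any other registry: "un-IDN-ise" by showing the ASCII xn-- form instead.
    return hostname.encode("idna").decode("ascii")
```

A plain ASCII hostname is shown unchanged either way; only names containing non-ASCII labels under an untrusted TLD end up displayed in their xn-- form.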

Paul                            VPOP3 - Internet Email Server/Gateway

support@xxxxxxxxxx                      http://www.pscs.co.uk/