Re: [idn] process
If we do *not* allow these special local characters that function in the
same way as the hyphen does in the West, then people in other parts of
the world would not only claim that our spec is unfair, they might even
ignore it. If we *do* allow this Japanese example, then we have started
down a slippery slope that ends with a rather large extension of the LDH
rule (for the rest of the world), and the phishing problem would not be
alleviated as much as we had hoped when we started with just LDH. That
would be a lot of work for little gain.
So it's a lose-lose situation.
Sorry, I said that wrong. What I meant was, "Damned if you do, damned if
you don't."
However, one avenue that might be worth exploring some more is to check
each registry's character table (for those that have one) and see what
the Unicode category is for each character. The Japanese Katakana middle
dot U+30FB has the category "Pc", which means "punctuation, connector",
and LDH's hyphen U+002D has the category "Pd", which means "punctuation,
dash".
http://www.unicode.org/Public/UNIDATA/UnicodeData.txt
http://www.unicode.org/Public/UNIDATA/UCD.html#General_Category_Values
http://vanderpoel.org/networking/i/idn.html (see bottom)
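
For anyone who wants to check categories without reading
UnicodeData.txt by hand, here is a minimal sketch using Python's
standard unicodedata module. The helper name general_category is just
for illustration. Note that the module reports whatever Unicode version
ships with the interpreter, so the value printed for U+30FB may differ
from the one quoted above if the assignment differs between versions.

```python
import unicodedata


def general_category(ch: str) -> str:
    """Return the two-letter Unicode General_Category of one character."""
    return unicodedata.category(ch)


if __name__ == "__main__":
    # U+002D is LDH's hyphen; U+30FB is the Katakana middle dot.
    for cp in (0x002D, 0x30FB):
        ch = chr(cp)
        print(f"U+{cp:04X} {unicodedata.name(ch)}: {general_category(ch)}")
```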
If it turns out that all or most of the registries that have tables are
using characters with only a small number of Unicode categories, then we
may wish to consider moving IDNA to that set of categories (disallowing
all others). This would keep the registries happy while keeping *some*
of the phishy characters out of DNS.
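
A rough sketch of that survey, again in Python. The registry tables
below are made-up placeholders, not real registry data, and the function
names are hypothetical. The idea is only to show the shape of the
check: collect the categories the tables use, then accept a label only
if every character falls in one of those categories.

```python
import unicodedata
from collections import Counter

# Hypothetical registry tables: each maps a registry name to the
# characters it permits. Real tables would be loaded from the registries.
REGISTRY_TABLES = {
    "example-jp": "abc-\u3042\u30a2\u30fb",
    "example-de": "abc-\u00e4\u00f6\u00fc\u00df",
}


def category_counts(tables):
    """Count how many permitted characters fall in each General_Category."""
    counts = Counter()
    for chars in tables.values():
        counts.update(unicodedata.category(c) for c in set(chars))
    return counts


def allowed_categories(tables):
    """The set of categories used by at least one registry table."""
    return set(category_counts(tables))


def label_allowed(label, categories):
    """True if every character of the label is in an allowed category."""
    return all(unicodedata.category(c) in categories for c in label)


if __name__ == "__main__":
    allowed = allowed_categories(REGISTRY_TABLES)
    print(sorted(category_counts(REGISTRY_TABLES).items()))
    print(label_allowed("abc-d", allowed))
    print(label_allowed("a\u2044b", allowed))  # U+2044 FRACTION SLASH
```

A category filter like this is coarse: it admits every character in an
allowed category, including ones no registry asked for, which is why it
would keep out only *some* of the phishy characters.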
Erik