
Re: [idn] An experiment with UTF-8 domain names



At 01/01/05 20:37 +0100, Patrik Fältström wrote:
>At 18.41 +0000 01-01-05, D. J. Bernstein c/o James Seng wrote:
>>paf writes:
>>>  They have to be changed because of the nameprep stage,
>>
>>Ah, yes, the stage where af.mi1 is converted to af.mil.
>
>Not at all. There are other cases that don't look as stupid as the one 
>you just suggested.
>
>Several examples were brought up on this mailing list about a year 
>ago, so I will not repeat them. See the Unicode specification.

I think we should be realistic on both sides here.
We don't have conversions for ASCII, but we have
prohibited characters. Dan, what would happen if
you tried to register, e.g., +-+.cr.yp.to?
Also, we have case mapping, currently done on the
server but for many reasons better moved to the
client for UCS.
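
To make the ASCII case concrete, here is a rough sketch of client-side
preparation: case mapping followed by a check for prohibited characters.
It assumes the traditional letters-digits-hyphen rule for host names;
the function name prepare_ascii_name is made up for illustration, and
the sketch does not check hyphen placement or label length.

```python
import string

# Assumption: only letters, digits and hyphen are allowed in a label.
ALLOWED = set(string.ascii_lowercase + string.digits + "-")

def prepare_ascii_name(name):
    """Lower-case an ASCII name and reject labels with prohibited characters."""
    labels = name.lower().split(".")  # case mapping done on the client
    for label in labels:
        if not label or not set(label) <= ALLOWED:
            raise ValueError("prohibited character in label %r" % label)
    return ".".join(labels)
```

With this sketch, "CR.YP.TO" maps to "cr.yp.to", while "+-+.cr.yp.to"
is rejected because '+' is not in the allowed set.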

Also, for UCS, there are quite a few cases where the degree
of determinism and success can be significantly improved
for the end user by additional foldings. The best example
probably is the full-width/half-width folding for Japanese
and some other contexts.
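
As an illustration of width folding, the following sketch uses the
compatibility decompositions in the Unicode character database, as
exposed by Python's unicodedata module, and folds only characters
tagged <wide> or <narrow>. The function name fold_width is made up
here, and this is not a proposal for the actual nameprep rules; a
real profile would probably also need to recompose the result (for
example, half-width katakana followed by a voicing mark).

```python
import unicodedata

def fold_width(text):
    """Replace full-width and half-width characters with their ordinary forms."""
    out = []
    for ch in text:
        # decomposition() returns e.g. "<wide> 0065" or "" if there is none.
        parts = unicodedata.decomposition(ch).split()
        if parts and parts[0] in ("<wide>", "<narrow>"):
            out.append("".join(chr(int(cp, 16)) for cp in parts[1:]))
        else:
            out.append(ch)
    return "".join(out)

# Full-width Latin letters sit at a fixed offset from their ASCII forms.
wide = "".join(chr(ord(c) + 0xFEE0) for c in "example")
print(fold_width(wide))
```

A user who types the name with a Japanese input method left in
full-width mode then ends up with the same name as a user who types
plain ASCII.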

However, we should also be clear about the following:

- Even with the best name preparation, some ambiguities,
   hopefully not worse than mil vs. mi1, will remain.

- Trying to be too inclusive in folding will lead to
   work on cases that may occur once in a million or a
   billion. A rather typical example is the 'fi' ligature.
   [An additional argument for not folding that into 'f','i'
    is that it would change the behaviour for ASCII-only
    names, which is probably not something we want.]

- Expecting that every single software component that
   will deal with internationalized domain names will
   do name preparation on every processing step is not
   realistic (Patrik, I'm not saying that you have such
   expectations, I just want to make sure others don't).
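
To show what an inclusive folding does to the 'fi' ligature, here is
a small sketch using Python's unicodedata module. NFKC is used only
as an example of a broad compatibility folding and NFC as an example
of a narrower one; the name "fish.example" is hypothetical.

```python
import unicodedata

# The name starts with U+FB01 LATIN SMALL LIGATURE FI.
ligature_name = "\ufb01sh.example"

# Compatibility normalization splits the ligature into 'f' and 'i'.
folded = unicodedata.normalize("NFKC", ligature_name)

# Canonical normalization leaves the ligature untouched.
canonical = unicodedata.normalize("NFC", ligature_name)

print(folded == "fish.example")    # True
print(canonical == ligature_name)  # True
```

Whether a fold like this is worth specifying depends on how often
such characters actually turn up in names.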


Regards,   Martin.