Re: Just send UTF-8 with nameprep (was: RE: [idn] Reality Check)



Paul Hoffman wrote:

>
>Dan, it is not clear from the udns-02 document that the protocol 
>requires nameprep on the name server for every query. This is a 
>fairly important design choice of UDNS that should probably be 
>highlighted in the document. As you have seen, some people who 
>thought they were supporting udns-02 don't think this is such a great 
>idea, and it is quite controversial for name servers that run high 
>volumes of queries (like tens of thousands a second).

It would be interesting if somebody had some real data to show
whether it really is that CPU heavy.
In UDNS all character data is required to be normalised.
Because of this, a lot of the form-insensitive matching that nameprep
represents can be done "on the fly", just like the case-insensitive
matching of ASCII that is done today.
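
As a rough illustration of the "on the fly" idea, here is a minimal
Python sketch. It is not taken from udns-02 or from nameprep. It assumes
both labels are already normalised and treats Unicode case folding as the
only kind of form folding. The names folded and labels_match are
hypothetical. Real nameprep also maps and prohibits characters, and in
rare cases folding can produce text that is no longer normalised; the
sketch ignores both.

```python
from itertools import zip_longest


def folded(label):
    """Yield the case-folded characters of label one at a time."""
    for ch in label:
        # Folding one character may produce several (for example 'ß' -> 'ss'),
        # so the result is yielded character by character.
        yield from ch.casefold()


def labels_match(a, b):
    """Compare two already-normalised labels without building folded copies.

    Assumption: both inputs are normalised, as UDNS requires.
    Comparison stops at the first mismatch.
    """
    # zip_longest pads the shorter stream with None, which never equals a
    # character, so labels of different folded length do not match.
    return all(x == y for x, y in zip_longest(folded(a), folded(b)))
```

The point of the generator is that no folded copy of either label is
allocated, much as a server today can compare ASCII names byte by byte
while ignoring case.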

If we want to compare which operations are expensive (take a lot of CPU
or memory), we should try to get data on several things. For example:
- UTF-8 (UCS) to/from ACE
- normalisation of a text (Unicode form C or KC)
- full nameprep (with both normalisation and form folding)
- just form folding
- optimised name matching where the input strings are normalised
  strings and the form-insensitive matching according to nameprep
  is done in the most efficient manner.
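
One way to start collecting such numbers is a small timing harness. The
following Python sketch is only a starting point under stated
assumptions: Python's built-in "punycode" codec stands in for whichever
ACE is chosen, case folding followed by NFKC stands in for full nameprep
(which does more), and SAMPLE is a made-up label. It covers the first
four items; the optimised matcher would need its own implementation.
Timings from an interpreted language say little about a tuned name
server, so treat the results as relative at best.

```python
import timeit
import unicodedata

SAMPLE = "Bücher-Ärger-Straße"  # hypothetical test label, not real query data


def to_ace(label):
    # Assumption: Punycode as the ACE; other ACE proposals will cost differently.
    return label.encode("punycode")


def from_ace(data):
    return data.decode("punycode")


def normalise(label, form="NFC"):
    # form may be "NFC" (form C) or "NFKC" (form KC)
    return unicodedata.normalize(form, label)


def fold(label):
    # Just the folding step, approximated by Unicode case folding.
    return label.casefold()


def fold_and_normalise(label):
    # Rough stand-in for full nameprep: fold, then normalise to form KC.
    return unicodedata.normalize("NFKC", label.casefold())


def measure(n=10000):
    """Return seconds spent on n repetitions of each operation."""
    ace = to_ace(SAMPLE)
    cases = {
        "utf8_to_ace": lambda: to_ace(SAMPLE),
        "ace_to_utf8": lambda: from_ace(ace),
        "normalise_c": lambda: normalise(SAMPLE, "NFC"),
        "normalise_kc": lambda: normalise(SAMPLE, "NFKC"),
        "fold_only": lambda: fold(SAMPLE),
        "fold_and_normalise": lambda: fold_and_normalise(SAMPLE),
    }
    return {name: timeit.timeit(fn, number=n) for name, fn in cases.items()}


if __name__ == "__main__":
    for name, seconds in measure().items():
        print(f"{name:20s} {seconds:.4f} s")
```

A realistic measurement would replace SAMPLE with a representative set of
names and would also record memory use, which this sketch does not.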

I am sure there are more things we need data on in order to decide how
things should best be done.

    Dan