Re: [idn] UTF-8 / RACE
> > I don't think it's fair to characterize ACE as a halfway solution.
>
> It requires significant additional computations and presentation
> processing for it to be anywhere near seamless for the user.
Somehow I don't think I would characterize the additional computation
as 'significant' - it's a fairly trivial format conversion. The
presentation processing is only a significant issue for those applications
that don't already have to parse the input to distinguish components
that contain domain names from those that don't. Most applications
that I can think of offhand already need to do this. And even with
UTF-8, any application that accepts input of domain names would also
need to do this, because it would need to handle nameprep.
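
To give a feel for how small the conversion is, here is a minimal sketch in
Python. It is an illustration under stated assumptions, not the RACE
algorithm itself: Python's standard library has no RACE codec, so the sketch
substitutes the built-in "idna" codec, which implements a later
Punycode-based ACE and applies nameprep to non-ASCII labels as part of
encoding. The helper names to_ace and from_ace and the sample name are
hypothetical.

```python
# Sketch only: uses Python's built-in "idna" codec (a Punycode-based ACE),
# NOT RACE. The function names and the sample domain are illustrative.

def to_ace(name: str) -> str:
    """Convert a Unicode domain name to its ASCII-compatible encoding.

    The codec runs nameprep on each non-ASCII label (which is why the
    capital "B" below is folded to lowercase) before encoding it.
    """
    return name.encode("idna").decode("ascii")


def from_ace(name: str) -> str:
    """Convert an ACE domain name back to Unicode for presentation."""
    return name.encode("ascii").decode("idna")


if __name__ == "__main__":
    ace = to_ace("Bücher.example")
    print(ace)            # xn--bcher-kva.example
    print(from_ace(ace))  # bücher.example
```

Note that the round trip does not return the original capitalization; that
is nameprep at work, and an application taking UTF-8 input would have to
perform the same step.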
> UTF8 encoded data does not require multiple translations.
That depends on whether you consider nameprep a translation.
> Maybe "halfway" isn't proper
> descriptor for ACE (did I use it? no matter), but it's not a complete
> solution either.
Again, I don't see it. In either the ACE case or the UTF8 case you have
to separate domain names from other text for the purpose of input.
In the case of ACE you have to separate them for the purpose of presentation
also, but often you would have had to do it anyway. So ACE may need some
additional complexity for presentation, but this is more than offset by
avoiding the negotiation between components that the UTF8 case requires
before domain names can be exchanged.
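
As a rough sketch of the presentation step, the following Python picks
ACE-encoded host names out of surrounding text and decodes them for display.
The assumptions are the same as before and are mine, not part of any draft:
it relies on Python's built-in "idna" codec and its "xn--" label prefix
rather than RACE, and the regular expression and the function name present
are hypothetical and deliberately simplistic.

```python
import re

# Assumption: ACE labels carry the "xn--" prefix used by Python's "idna"
# codec. A dotted name is matched if at least one of its labels has it.
ACE_HOST = re.compile(
    r"\b(?:[a-z0-9-]+\.)*xn--[a-z0-9-]+(?:\.[a-z0-9-]+)*\b",
    re.IGNORECASE,
)


def present(text: str) -> str:
    """Replace ACE host names found in text with their Unicode form."""

    def decode(match: "re.Match[str]") -> str:
        host = match.group(0)
        try:
            return host.encode("ascii").decode("idna")
        except UnicodeError:
            # Leave anything that does not decode cleanly untouched.
            return host

    return ACE_HOST.sub(decode, text)


if __name__ == "__main__":
    print(present("mail from www.xn--bcher-kva.example today"))
```

An application that already parses its input to find domain names would call
the decoder on the names it has identified instead of scanning with a
regular expression.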
> > since the vast majority of applications will be using standard APIs
> > which convert between native character set and encoding and those
> > used on the wire. For this purpose, the application doesn't care
> > whether the format is UTF-8 or ACE.
>
> My guess is that over the next 10 years, the majority of the systems will
> be either native UTF8 or UTF32.
I'm not sure what you mean by "systems". Operating systems will certainly
support UTF8, but may use other encodings also. Applications will use
a variety of encodings internally. New applications will probably use
a combination of UTF-8 and 32-bit representations.
Keith