Re: [idn] time to move




> Roozbeh Pournader <roozbeh@sharif.edu> wrote:
> 
> > Would you describe why do you see UTF-8 bad for Arabic and Cyrillic?
> 
> Because for those scripts the ACE encoding is usually significantly
> smaller than the UTF-8 encoding.
> 
> Seen another way, the Cyrillic and Arabic alphabets are about the same
> size as the Latin alphabet, so the amount of information per character
> is about the same for all three scripts, but UTF-8 uses almost double
> the number of octets per character for Cyrillic and Arabic as opposed to
> Latin.
> 
> AMC
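
For concreteness, the octet-count claim quoted above can be checked with
a short sketch. It assumes Python's built-in "punycode" codec as a
stand-in for an ACE (the thread does not say which ACE is meant), leaves
out any ACE prefix, and uses sample labels and a helper name
(octet_counts) that are purely illustrative.

```python
def octet_counts(label: str) -> dict:
    """Return the character count and encoded octet counts for one label."""
    return {
        "chars": len(label),
        # UTF-8: 1 octet per ASCII letter, 2 per Cyrillic or Arabic letter.
        "utf8": len(label.encode("utf-8")),
        # Assumption: Python's "punycode" codec stands in for the ACE;
        # no ACE prefix is included in this count.
        "ace": len(label.encode("punycode")),
    }


if __name__ == "__main__":
    # Hypothetical sample labels: Latin, Cyrillic, Arabic.
    for label in ("example", "пример", "مثال"):
        print(label, octet_counts(label))
```

Short labels like these only show the per-character cost; they say
nothing about whether that cost matters on the wire, which is the
question below.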

Is octet counting an argument you'd like to offer as controlling for
edge device functional requirements in the end2end-interest list?

If you'll assume for the moment that interoperation implies "over a wire",
then
	it (the octet-count argument) isn't compelling where intermediary
	hops are concerned,
and
	it (the octet-count argument) isn't compelling where host requirements
	are concerned,
and
	it (the octet-count argument) may be defensible only in last-mile
	bandwidth settings, which are XML-dominated where "interactive".

Just what does that assumption leave?

What assumption is better?

Starting about XPG/4.2, we (OS vendors) have been trudging down the UTF-8
road, from Redmond, Austin, Mtn. View, Cupertino, even Berkeley. Has it
been an error?

Tia,
Eric