Re: [idn] UTF-8 / RACE
Dear James,
It is good to hear that the WG as a whole is keeping in mind that most of the
world does not understand English well enough to type in that language. We
Arabs write from right to left, we have no notion of upper and lower case,
and Arabic is a phonetic language. This is why a short-term RACE solution
would not be good for us in the Middle East. If the members of this WG
already see the inability of applications to display RACE-encoded IDNs as a
problem, for us in the Middle East it is a very big problem. Arabic and
English are very different, and our population does not have a strong
command of English. Waiting for applications to display RACE properly will
set us back for a long while - that is the crux of the problem I am afraid of.
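To make the display problem concrete, here is a minimal sketch in Python of
the two forms an Arabic label can take. It is only an illustration under
stated assumptions: the label is the Arabic word for "example", the function
names are hypothetical, and because RACE is not available in Python's
standard library, the sketch substitutes the built-in "punycode" codec (a
different ACE) with an "xn--" prefix. The exact letters differ from what RACE
would produce, but the effect on a reader is the same: an application that
cannot decode the ACE shows a string of Roman letters and digits that carries
no meaning for an Arabic speaker.

```python
# Illustrative sketch only: Punycode stands in for RACE, which the Python
# standard library does not provide. All names here are hypothetical.

ACE_PREFIX = "xn--"  # prefix conventionally paired with Punycode labels

# The Arabic word for "example", written with escapes so the source stays ASCII.
LABEL = "\u0645\u062b\u0627\u0644"


def to_utf8_bytes(label: str) -> bytes:
    """Return the raw UTF-8 bytes an 8-bit-clean application would carry."""
    return label.encode("utf-8")


def to_ace(label: str) -> str:
    """Return an ASCII-compatible form of the label (Punycode, not RACE)."""
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def from_ace(ace: str) -> str:
    """Recover the native label, as an upgraded application would."""
    return ace[len(ACE_PREFIX):].encode("ascii").decode("punycode")


if __name__ == "__main__":
    ace = to_ace(LABEL)
    print("native label :", LABEL)
    print("UTF-8 bytes  :", to_utf8_bytes(LABEL).hex())
    # The next line is what an application without ACE display support shows.
    print("ACE form     :", ace)
    print("decoded again:", from_ace(ace))
```

The UTF-8 form is readable as Arabic in any application that already handles
UTF-8, while the ACE form is readable only after the application has been
upgraded to decode it.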
For example, the number one Internet application in the Middle East today is
probably Microsoft Internet Explorer. It already has support for UTF-8.
Judging from the emails in this group, Microsoft's claims of "just send
UTF-8" are not enough. However, for the entire Middle East, the little
support that comes with IE 5 for Arabic URLs is a definite positive step,
and there is momentum there.
Note that the assumption that most of the Arabic Internet uses CP1256 or
Sakhr is not entirely accurate. It is more accurate to ask which encoding
is more popular in which context. For Web content, you will find more
sites written using CP1256 but there are some significant implementations
using UTF-8. I am happy to see that Mr. Adonis Fakih is already on this
list since he represents Ayna.com. Ayna is the number one Arabic portal -
and it supports UTF-8.
In the context of Arabic URLs, however, there is no de facto standard,
though Microsoft has certainly given us a strong pathway to one via UTF-8.
Native Name appears to be exploiting that, and it seems to be the
front-runner at this stage. I believe it would be easy for the other
players to do the same thing, including i-DNS and Walid.
I hope this explains my worries, James. In a nutshell, we already have a
starting point in the Middle East for giving users of the Web portion of the
Internet access to Arabic domain names using UTF-8. It works
"out-of-the-box." Given that most of the new users of the Internet in the
Middle East equate the Internet with the Web, I would hate to see that
momentum die. Keep in mind that Internet Explorer has over 90% of the
market in the Middle East due to Netscape 4's inability to display Arabic
fonts. Netscape 6 solves this, but it may well be too late.
Sherin
>From: "James Seng/Personal" <James@Seng.cc>
>To: "Sherin Alsoudani" <sherinalsoudani@hotmail.com>,
><amc@cs.berkeley.edu>, <idn@ops.ietf.org>
>Subject: Re: [idn] UTF-8 / RACE
>Date: Mon, 28 May 2001 09:52:32 +0800
>
>Sherin,
>
>I agree with most of your comments here on user aspects and friendliness.
>It is certainly one of the considerations the Working Group needs to
>balance with interoperability and usefulness.
>
>However, I am curious about one point: You made an assumption that ACE
>would be unfriendly to end-users. Why is that so?
>
>And how would a UTF-8 solution be more friendly given that most of the
>Arabic Internet is using Sakhr encoding or CP1256?
>
>-James Seng
>
>----- Original Message -----
>From: "Sherin Alsoudani" <sherinalsoudani@hotmail.com>
>To: <amc@cs.berkeley.edu>; <idn@ops.ietf.org>
>Sent: Monday, May 28, 2001 6:25 AM
>Subject: Re: [idn] UTF-8 / RACE
>
>
> > Dear Adam,
> >
> > I have some very strong reservations about some of the points you have
> > raised. Though I do not consider myself a DNS expert by any means, I
> > believe my experience as a software developer gives me enough insight to
> > hopefully understand most of what you wrote.
> >
> > My comments are interspersed below.
> >
> > >From: "Adam M. Costello" <amc@cs.berkeley.edu>
> > >To: idn@ops.ietf.org
> > >Subject: Re: [idn] UTF-8 / RACE
> > >Date: Sun, 27 May 2001 21:30:52 +0000
> > >
> > >Some people seem to be arguing that using ACE requires no less (or even
> > >more) upgrading of software than using UTF-8 without ACE. While it may
> > >be true that ACE-fully-working-everywhere requires as much upgrading as
> > >UTF-8-working-fully-everywhere, that comparison overlooks an important
> > >point.
> > >
> > >ACE affords incremental deployment much better than no-ACE. Suppose I
> > >am considering getting an IDN for my domain. With ACE, this will make
> > >things better for some users (who have upgraded their clients to decode
> > >the ACE) and worse for others (who have old clients and will see ugly
> > >ACEs) but nothing will actually break (mail will get through, web pages
> > >will load, etc).
> >
> > This is completely wrong because it disregards the human face of the
> > Internet. This working group should not just focus on "clients",
> > "applications", and "protocols". We should keep in mind the bottom line:
> > the Internet is basically a medium for communication. If converting to
> > ACE makes things easier for the client/application/protocol developers but
> > makes it harder for the *average* person to get on the Internet, then I
> > have strong reservations against ACE. In my part of the world, the Middle
> > East, the overwhelming majority of the population is not fluent in English
> > at all. To say that "nothing will actually break" is wrong - this is true
> > only at the protocol etc. level. However, at the human level, if the IDN is
> > an ACE, and my application does not support the proper display of that IDN,
> > then something most definitely WILL break: the human user of that
> > application will simply not type in that IDN.
> >
> > To me, the final result is the same: even if ACE is inherently friendlier
> > to the existing infrastructure, most people will not be able to utilize
> > IDNs because they will not understand the ACE-encoded name whatsoever.
> > People will have to upgrade their applications.
> >
> > >But without ACE, if I get an IDN for my domain, this will make things
> > >better for some users (who have upgraded their clients, or who are lucky
> > >enough to already be using UTF-8 clients) and will *completely* *break*
> > >things for other users (mail will not get through, web pages will not
> > >load, etc). There may be nothing those users can do to fix it, because
> > >the breakage might be happening in their provider's software. The
> > >provider might be very slow to upgrade, because 99% of their customers
> > >might be English speakers, and the other 1% are just screwed.
> >
> > This is not true. In the Middle East, there are at least two or three
> > providers of IDNs currently: Walid, i-DNS, and Native Name. As far as I
> > understand, Native Name provides a UTF-8 solution; I do not know about the
> > others. The response has been very strong from what I know: providers ARE
> > switching to UTF-8; there is a good chance that my ISP will upgrade to
> > UTF-8 compatible software, assuming they have not done so already. I
> > believe it is a later version of BIND that is necessary.
> >
> > I also read on this list that CNNIC has chosen UTF-8 as their current
> > encoding. This is another indication that providers ARE switching.
> >
> > Another thing: assuming 99% English and 1% non-English in the Middle East
> > is completely wrong. I, and many other IT professionals here, are committed
> > to bringing our people online. Most of our people have little understanding
> > of the written Arabic language, let alone English.
> >
> > >With ACE, people might have to put visible ACEs in some config
> > >files, which is annoying, but at least it will work. Eventually
> > >the application might get upgraded to support native characters in
> > >the config files and then things will get easier. Without ACE, the
> > >application will simply be unusable with IDNs until it is upgraded.
> > >
> > >Conclusion: There should be ACE.
> >
> > This conclusion is premature to say the least.
> >
> > >Next question: Given that there will be ACE, should DNS support 8-bit
> > >queries in addition to ACE queries? I don't know. No matter what
> > >is recommended or discouraged, some DNS servers will probably try to
> > >guess the encoding of 8-bit queries. This will help old clients, but
> > >increases the risk of spoofing, and allows some applications to be
> > >lazier about upgrading. I haven't yet formed an opinion on this.
> > >
> > >Next question: Should ACE ever be phased out? I don't think so. Very
> > >few systems will ever support all Unicode characters, so applications
> > >will sometimes try to display IDNs containing unsupported characters.
> > >What should they display? They should display the ACE, because that's
> > >no uglier than anything else they might display, and has the added bonus
> > >that you can copy it into any other application and it will still work.
> > >
> > >Furthermore, domain names are first and foremost *global* identifiers
> > >intended to be used by *humans* *anywhere* to refer to network objects.
> > >For people who know the Unicode characters in the domain name, the
> > >native representation is easiest, but for the billions of people who
> > >don't know those characters, the ACE is much easier to type, write,
> > >speak, and visually compare. (The Roman alphabet is one of the smallest
> > >alphabets in existence, and is already widely recognized. If you have
> > >to choose a fallback character set for everyone to learn, that's the best
> > >choice.)
> >
> > If you wish to come to Egypt or any other place in the Middle East and
> > convince people here that English is a good choice even as a "fallback
> > character set for everyone to learn", I do not envy your task, as you will
> > not be taken seriously. I do not understand your comment about "billions of
> > people" who do not know "those characters", where those characters are the
> > native language of the people. This is so patently wrong to me that I
> > believe I must have misunderstood you. I personally am not interested in
> > seeing English propagated as the default language, as a half solution for
> > multilingual domains. It is impossible to ask us to learn the Roman language
> > simply to satisfy the protocol writers. Instead we are happy to go with a
> > tougher solution that will ultimately give more access to the average
> > non-English speaker. If we have to wait a little or a lot longer for that
> > to occur, that is worth it compared to a short-term solution that requires
> > me to send my mother to English spelling school.
> >
> > >I'm not saying that various protocols shouldn't allow 8-bit encoding in
> > >addition to ACE (I have no opinion on that yet), I'm just saying that
> > >ACE will always serve a useful function, and should never be deprecated.
> > >
> > >AMC
> > >
> >
> > Sherin
> >
> > _________________________________________________________________
> > Get your FREE download of MSN Explorer at http://explorer.msn.com
> >
> >
>