Re: [idn] opting out of SC/TC equivalence
Hi, Harald,
I would like to question the original ONE CHARACTER
set charter goal. Unicode is like a character library. Even
Chinese users have had plenty of trouble using a 7,000-character
set through a keyboard. I know the Japanese have complained
that Japanese input is chaos. How would the much larger UCS
be used by a wider variety of users worldwide? Don't
be offended if I say it is like putting a library on the side of
a freeway and expecting drivers to stop and buy a book.
However, if a sign says "Next Rest Area, FREE books!", someone
will stop and look. That is what the Script-Tag is for. If
the sign says "Next Rest Area, Chinese books!", a non-Chinese
reader would thank you for that too.
As to TC/SC, I think the problem can be divided into two
levels. One level is the mechanical, one-to-one mappings. They are
the majority, and they are what bothers readers. The minority, as many
have said, are the n-to-1 cases; normally one of the candidates is
the major one and the others are minor ones. There are
always exceptions! So to benefit the majority, the way to deal
with it is to come up with an arbitrary one-to-one mapping, and let
the Chinese fight over which one is the major one! The same principle
applies to the decomposition of Hanja and Kanji. When there
is a base, the rest will follow: the application will have
something to conform to.
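
The two levels described above can be sketched roughly as follows. This is only a minimal illustration in Python: the four table entries are a hand-picked sample, and the names TC_TO_SC, PRIMARY_TC, fold_to_sc and build_one_to_one are hypothetical. It is not the [nameprep] mechanism or any official table.

```python
# Illustrative only: a tiny hand-picked sample, not an official TC/SC table.
TC_TO_SC = {
    "\u570b": "\u56fd",  # 國 -> 国, mechanical one-to-one
    "\u5b78": "\u5b66",  # 學 -> 学, mechanical one-to-one
    "\u767c": "\u53d1",  # 發 -> 发, n-to-1
    "\u9aee": "\u53d1",  # 髮 -> 发, n-to-1
}

# Assumption: for each n-to-1 case, one TC form is arbitrarily declared major.
PRIMARY_TC = {"\u53d1": "\u767c"}  # 发 -> 發


def fold_to_sc(label: str) -> str:
    """Fold every TC character found in the sample table to its SC form."""
    return "".join(TC_TO_SC.get(ch, ch) for ch in label)


def build_one_to_one() -> dict:
    """Derive an SC -> TC table with exactly one TC form per SC character."""
    candidates = {}
    for tc, sc in TC_TO_SC.items():
        candidates.setdefault(sc, []).append(tc)
    table = {}
    for sc, tcs in candidates.items():
        if len(tcs) == 1:
            table[sc] = tcs[0]          # mechanical case
        else:
            table[sc] = PRIMARY_TC[sc]  # n-to-1 case: use the declared major form
    return table
```

With such a base, an application only has to agree on the contents of PRIMARY_TC; the mechanical entries need no argument at all.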
I don't care how much information is collected in a BCP; the
solution lies in IDN coming up with a base for applications
to work from. If IDN is not the right level to argue this solution,
then the name is MISLEADING. However, from [RFC 2825],
I think this is the charter goal of this group. That is the
reason I am arguing for a one-to-one TC/SC and GB, JIS ...
in [nameprep] here and not anywhere else. You are asking
for a TC/SC equivalence table? It has been there for more than
a decade already, in the form of GB<=>BIG5. Why do we need
a printout in the Unicode "U+3b33 version U+4550" form to
conform with the current [nameprep] .txt specification?
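
To make the two forms concrete, here is a small Python sketch that takes one equivalence pair, checks that each side is encodable in its legacy character set, and prints the same pair as a row of Unicode code points. The pair 國/国, the helper name code_point_row, and the row layout are illustrative assumptions, not the actual [nameprep] table format.

```python
def code_point_row(tc: str, sc: str) -> str:
    """Format one TC/SC pair as a 'U+XXXX; U+XXXX' row (hypothetical layout)."""
    return f"U+{ord(tc):04X}; U+{ord(sc):04X}"


tc, sc = "\u570b", "\u56fd"  # 國 (traditional), 国 (simplified)

# Python's standard library ships "big5" and "gb2312" codecs, so the
# legacy-encoded form of each character is available directly.
big5_bytes = tc.encode("big5")
gb_bytes = sc.encode("gb2312")

print(code_point_row(tc, sc))
```

The information in the row is the same as in the legacy pair; only the notation differs.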
Speaking of other groups of linguistic experts: the UCS
Kangxi radical set is a standard that was set 300 - 400 years ago,
in Shakespeare's time. I would have thought they should at least
hide that name to show it is more current. They are quite
modest about it, I guess. We are discussing something like
[UAX15], but much larger. Where are the experts? Does a
linguist have to be an expert on GB<=>BIG5? All
the TC/SC, phonetics, and radicals are common knowledge
in a grade-school dictionary in China. Only when this group is
ready to listen and to give good critiques of proposals, given
the expertise this group already has, can a solution be reached.
Liana
On Thu, 30 Aug 2001 08:41:52 +0200 Harald Tveit Alvestrand
<harald@alvestrand.no> writes:
> apologies if this was not clearly communicated....
> I speak here as an (opinionated) participant, not as "the IETF".
>
> --On 29. august 2001 18:17 -0700 liana.ydisg@juno.com wrote:
>
> > That is not the case. If the IETF does not want to put TC/SC
> > folding in [nameprep], then it has no good reason to
> > agree to a versioning table that includes a GB, Big5, KSC, JIS to
> > transliterated ACE map. In that case, I have no motivation
> > to push for Unicode to accept the long list of radicals.
> > I can sit back and see how long this will go on, as I had
> > assumed that the TC/SC folding would have been in there
> > a long time ago; James has proved that I
> > was wrong.
..... informative and interesting stuff deleted ...
>
> > If the IETF has no architecture to accommodate these types
> > of script requirements, and is not planning to use a complete
> > list of radicals, please give me a reason to push it for the
> > Unicode standard.
>
> The IETF is the creation of its members.
> We have created an architecture that is dependent on outside parties for
> character set and linguistic expertise as captured in character set
> standards.
> When those standards are not available, we cannot use them.
>
> > Another option is that the IETF can still go ahead and give the world a
> > simple listing of 4128 TC/SC equivalents somewhere
> > else, catch up with your product delivery schedule, wave its
> > hands and say: take care of it, but out of my sight. The end result
> > is just like Unicode imposing the misconception that TC/SC are
> > two different languages.
>
> Delivering broken solutions is abhorrent to me.
> So this option is not an option.
>
> > I hope my explanation can shed a little
> > light on CNNIC's feelings about the TC/SC arguments. You can
> > also tell me that my input is out of the scope of this group, and
> > I am ready to leave too.
>
> I think the group will have a better chance of delivering output that is
> useful to both you and me if we declare that the TC/SC issue is not going
> to be solved inside the IDN work.
>
> I believe this problem is hard enough that it has to be solved at another
> level.
> Sorry about that.
>
> Harald
>