Re: [idn] Should we add U+FF0E FULLWIDTH FULL STOP to section 5.10 of Nameprep?
FULLWIDTH FULL STOP (U+FF0E) is already compatibility-decomposed into
FULL STOP (U+002E) by the KC normalization (NFKC) step in Nameprep.
You can confirm this on page 4 of
http://www.unicode.org/charts/PDF/UFF00.pdf .
If a label contains U+FF0E, the KC normalization step in Nameprep maps it
to U+002E, and the label is then treated as an error in the following
prohibition step, since U+002E is prohibited.
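
To illustrate the two steps described above, here is a minimal sketch in Python. It relies only on the standard `unicodedata` module for NFKC. The `check_label` helper is hypothetical and is not part of Nameprep or MDNkit. It simply assumes, as stated above, that U+002E is prohibited inside a label after normalization.

```python
import unicodedata

FULLWIDTH_FULL_STOP = "\uff0e"   # U+FF0E
IDEOGRAPHIC_FULL_STOP = "\u3002" # U+3002
FULL_STOP = "\u002e"             # U+002E


def nfkc(text: str) -> str:
    """Apply Unicode Normalization Form KC."""
    return unicodedata.normalize("NFKC", text)


def check_label(label: str) -> str:
    """Hypothetical helper: normalize a single label, then reject it
    if the result contains U+002E (assumed prohibited within a label)."""
    normalized = nfkc(label)
    if FULL_STOP in normalized:
        raise ValueError("label contains prohibited U+002E after NFKC")
    return normalized


if __name__ == "__main__":
    # U+FF0E has a compatibility decomposition to U+002E.
    print(nfkc(FULLWIDTH_FULL_STOP) == FULL_STOP)
    # U+3002 has no decomposition, so NFKC leaves it unchanged.
    print(nfkc(IDEOGRAPHIC_FULL_STOP) == IDEOGRAPHIC_FULL_STOP)
    try:
        check_label("example" + FULLWIDTH_FULL_STOP + "com")
    except ValueError as err:
        print("rejected:", err)
```

Note that NFKC alone does not change U+3002, which is why that character would need separate handling, whereas U+FF0E is already covered by the normalization step.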
MDNkit from JPNIC has huge tables for these compatibility/canonical
mappings, covering all Unicode code points (wc output: lines, words, bytes):
  1694  11778  73804 nameprepdata.c
  6806  38573 327222 unicodedata.c
  8500  50351 401026 total
Soobok Lee
----- Original Message -----
From: "Martin Duerst" <duerst@w3.org>
To: "Yves Arrouye" <yves@realnames.com>; <idn@ops.ietf.org>
Sent: Monday, July 16, 2001 2:39 PM
Subject: Re: [idn] Should we add U+FF0E FULLWIDTH FULL STOP to section 5.10 of Nameprep?
> I haven't checked the details on this, but 'full-width period'
> is definitely also used in Japanese, in particular for horizontal
> writing. But having it appear between half-width Latin characters
> is most probably an accident.
>
> Regards, Martin.
>
>
> At 22:13 01/07/15 -0700, Yves Arrouye wrote:
> >Hi,
> >
> >I am not a Chinese expert myself, but have been told that it is quite likely
> >that U+FF0E would be generated instead of U+002E FULL STOP using a Chinese
> >IME, in a context where both ideographs and Roman characters were in close
> >proximity. This sounds definitely likely to happen if you have <CHINESE>.COM
> >for example where <CHINESE> is made up of Han characters. If that is really
> >the case, it could be useful to put U+FF0E in the same bag as U+3002 is in
> >Nameprep.
> >
> >YA
> >
> >PS: here's what someone I work with sent me on this topic (along with an
> >explanation on how easy it is to generate U+FF0E rather than U+002E while
> >typing mixed Chinese and Roman on MS Windows):
> >
> >A plain Chinese user would use double wide roman characters in a context
> >containing both ideographs and roman characters in close proximity,
> >that's what my Chinese speaking coworkers tell me.
> >
> >I find evidence that double wide punctuation is used in ideograph
> >dominated context, strangely even when embedded between single-wide
> >roman:
> >http://surf.sina.com.cn/cgi-bin/newlogin/regentry.cgi, near "Altavista"
> >or
> >http://www.taiwan.com/info_10.htm, near "Cookies".
> >
> >You can see clearly on sina.com or any other Chinese site that double
> >wide punctuation is used exclusively in ideograph context.
> >
>
>