Re: [idn] IDN rechartering rev 3, and nameprep
-----BEGIN PGP SIGNED MESSAGE-----
James Seng/Personal wrote:
> David Hopwood wrote:
> > This could be fixed by using the "simple" versions of case foldings
> > for characters that include ypogegrammeni or prosgegrammeni, and
> > removing the folding for U+0345. (It isn't actually important to
> > consider ypogegrammeni to be equivalent to iota, but uniform treatment
> > of canonically equivalent strings is definitely important.)
>
> Nameprep doesn't "fix" NFKC. That is not the intention.
> Nameprep references NFKC and uses it as-is.
This has nothing to do with "fixing" NFKC. The problem is a result of
using the "full" case foldings for characters that include ypogegrammeni
or prosgegrammeni, before doing normalisation, when those foldings only
work as intended for already-normalised strings.
The solution I proposed above avoids the problem, and has no significant
disadvantages in the context of IDN. It is a straightforward change to
the nameprep folding table, that stays within the current model of a
stringprep profile. I'm certainly not suggesting redesigning NFC or NFKC.
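
To make the problem concrete, here is a minimal Python sketch. It assumes a
nameprep-like pipeline of "fold, then NFKC", and it uses two tiny hand-copied
folding tables (FULL_FOLD and SIMPLE_FOLD are illustrative names, not the real
nameprep table). The two inputs are canonically equivalent, yet the "full"
foldings send them to different results, while the proposed change does not.

```python
import unicodedata

# Illustrative excerpts only (assumed); the real tables are much larger.
# "Full" foldings: ypogegrammeni is turned into a spacing small iota.
FULL_FOLD = {
    "\u0345": "\u03b9",        # COMBINING GREEK YPOGEGRAMMENI -> iota
    "\u1f80": "\u1f00\u03b9",  # alpha+psili+ypogegrammeni -> alpha+psili, iota
}

# Proposed change: no folding for U+0345, and only the "simple" (one-to-one)
# foldings for precomposed letters. U+1F80 is already lower case, so it has
# no entry; its capital counterpart U+1F88 folds to it.
SIMPLE_FOLD = {
    "\u1f88": "\u1f80",
}

def fold_then_nfkc(s, table):
    """Apply a per-character folding table, then normalise with NFKC."""
    folded = "".join(table.get(ch, ch) for ch in s)
    return unicodedata.normalize("NFKC", folded)

# Two canonically equivalent spellings of the same text.
COMPOSED = "\u1f80"                # precomposed letter
REORDERED = "\u03b1\u0345\u0313"   # alpha, ypogegrammeni, psili (not in canonical order)

if __name__ == "__main__":
    for name, table in (("full", FULL_FOLD), ("simple", SIMPLE_FOLD)):
        a = fold_then_nfkc(COMPOSED, table)
        b = fold_then_nfkc(REORDERED, table)
        print(name, [hex(ord(c)) for c in a], [hex(ord(c)) for c in b], a == b)
```

With the full foldings, the iota produced from U+0345 in the second input is
a base character, so the following psili attaches to it rather than to the
alpha. Leaving U+0345 unfolded lets normalisation reorder and compose both
inputs to the same string.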
> If there is a problem with NFKC, then we have to fix it at UTC, not
> here.
This is specifically a problem with how case folding is used in nameprep.
Not fixing it would arguably mean that nameprep is nonconformant to the
Unicode Standard (conformance clause C9).
> > Other issues relating to normalisation are:
> > - whether to use NFKC or NFC
> > - what set of characters to disallow (the current spec arguably
> > allows too much)
> > - treatment of Hangul compatibility Jamo
> > - the point raised by Kent Karlsson about Jamo clusters
> > - the definition of "stored" and "request" strings in stringprep
>
> Most of these issues have been addressed in the report by the Nameprep
> team in San Diego.
Am I reading the same document:
<http://www.i-d-n.net/ietf49/idn-sandiego-nameprep-design-team-report.ppt>?
The only one of these issues that is even mentioned in the report is the
set of disallowed characters (and it is not discussed in any detail).
> Those that we didn't (such as Jamo), see the above.
The issue about Jamo clusters would benefit from input from the Unicode
mailing list, but it is this WG that will have to decide what, if anything,
to do about it. It *may* turn out that it is not a significant problem (for
example if input methods can be relied on to produce Jamo sequences with
clusters as single characters, or if sequences not in that form would be
sufficiently rare). It certainly should not simply be dismissed without
any discussion, though.
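
As a rough illustration, and assuming the Jamo cluster issue is about a
cluster typed as a sequence of simple Jamo versus the same cluster as a single
Jamo character, the following Python sketch shows that NFKC does not unify the
two forms. It also shows how NFKC treats Hangul compatibility Jamo. The
variable names are made up for this example.

```python
import unicodedata

def nfkc(s):
    """Normalise a string with NFKC."""
    return unicodedata.normalize("NFKC", s)

# The "same" syllable entered two ways (assumed reading of the issue):
two_jamo = "\u1100\u1100\u1161"  # CHOSEONG KIYEOK, CHOSEONG KIYEOK, JUNGSEONG A
cluster = "\u1101\u1161"         # CHOSEONG SSANGKIYEOK, JUNGSEONG A

# Hangul compatibility Jamo (U+3131, U+314F) have compatibility mappings
# to the conjoining Jamo, so NFKC can then compose them into a syllable.
compat = "\u3131\u314f"

if __name__ == "__main__":
    for label, s in (("two_jamo", two_jamo), ("cluster", cluster), ("compat", compat)):
        print(label, [hex(ord(c)) for c in nfkc(s)])
```

The sequence form and the single-character cluster remain distinct after
normalisation, so whether this matters in practice depends on what input
methods actually produce.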
> > I frankly don't see how normalisation can be finalised without knowing
> > which protocol proposal will be used, or why it's necessary to
> > finalise it before then.
>
> Actually they are independent. Whether the solution is IDNA or not,
> UTF-8 or ACE, we always need a normalization process.
Yes, we do, but certain details of normalisation do interact with the
choice of proposal. In particular, the definition of the contexts in
which unassigned characters are allowed does, and that is currently
specified in stringprep.
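
A minimal sketch of that interaction, assuming the rule as it later appeared
in the published stringprep specification (unassigned code points may appear
in queries but not in stored strings) and using the table A.1 lookup from
Python's standard stringprep module. The function name and its
allow_unassigned flag are hypothetical.

```python
import stringprep

def check_unassigned(s, allow_unassigned):
    """Reject code points unassigned in the stringprep Unicode version.

    allow_unassigned=True models a "request" (query) string;
    allow_unassigned=False models a "stored" string.
    """
    bad = [ch for ch in s if stringprep.in_table_a1(ch)]
    if bad and not allow_unassigned:
        raise ValueError(
            "unassigned code points: "
            + ", ".join("U+%04X" % ord(ch) for ch in bad)
        )
    return s

if __name__ == "__main__":
    label = "ex\u0221"  # U+0221 is listed as unassigned in table A.1
    print(check_unassigned(label, allow_unassigned=True) == label)
    try:
        check_unassigned(label, allow_unassigned=False)
    except ValueError as exc:
        print("rejected:", exc)
```

Which side of a protocol exchange counts as "stored" and which as "request"
is exactly the detail that depends on the protocol proposal chosen.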
> While we always related Nameprep to IDNA or ACE, there is nothing in
> Nameprep which enforces ACE or IDNA.
I didn't say that there was.
- --
David Hopwood <david.hopwood@zetnet.co.uk>
Home page & PGP public key: http://www.users.zetnet.co.uk/hopwood/
RSA 2048-bit; fingerprint 71 8E A6 23 0E D3 4C E5 0F 69 8C D4 FA 66 15 01
Nothing in this message is intended to be legally binding. If I revoke a
public key but refuse to specify why, it is because the private key has been
seized under the Regulation of Investigatory Powers Act; see www.fipr.org/rip
-----BEGIN PGP SIGNATURE-----
Version: 2.6.3i
Charset: noconv
iQEVAwUBO/Gf6jkCAxeYt5gVAQEx7ggAruKICdCbmCgper2OtvpkVc2uKdAR2DC6
7msfb9d5invXzZQGR9t7gPS7r3jY7OseR0xIBCzAK0kkm0PKJiGHpHzBkvavoqWN
nHvKxw6DgLbXcKaDvj+z7SiSPww0HpL76D91KnmVWoYlQvBS6jr4NM5gN1kjqaNN
l2gOb3G0ANYz2kYtP9J27WbcL52pRSdasKFalxZrEsGa1unVHaC6BJPyYymHsnBB
sHj8+ZLYZ0DZO4WO+rx87xNWUMoV2iHm+AgChkCJoHExt3x7SlOQCWqV+d577PZh
ZwWX4QE5WW5aUfZd4LbvzyQGTJ5STM5xPJyjFqcPS4Gic8mSort+Og==
=WfPa
-----END PGP SIGNATURE-----