
Re: Back to work (Nameprep) (was: Re: Just send UTF-8 with nameprep (was: RE: [idn] Reality Check))



> >What I'd really like to see us work on is the transcribability problem.
> >This is a problem that all of the proposals have in common - there
> >are still too many similar glyphs with different code points that are not
> >folded by nameprep.  I see this as the biggest remaining problem that
> >must be solved before we can standardize an ACE lookup scheme for IDNs.
> >(even if we standardize an alternate one later that uses UTF-8 or some
> >other encoding)
> 
> Very good point. Do you have any particular kind of similarities
> in mind, or can you give some examples? 

this is well outside of my area of expertise, but several people have cited 
examples that haven't been refuted.  


> Or do you have some kind
> of principles or tests in mind that should be applied? Or any
> particular kind of procedure that we should follow?

one idea (which I don't particularly like) is to assume that all characters
within a single label are from a single language, and if the same glyph
maps to different code points (indicating characters from different languages)
then you resolve the ambiguity by using the code point that creates the
fewest language changes.  I won't even begin to list the problems
with this; I mention it only because I think that this approximates the
behavior that is most natural for human beings.
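
a rough sketch of that heuristic, only to make the idea concrete.  the
look-alike table, the function names, and the use of the first word of a
character's Unicode name as a stand-in for "language" are all assumptions
made for illustration; none of this comes from nameprep or any draft.

```python
import unicodedata

# Hypothetical, tiny table of look-alike groups (illustration only).
# A real table would be far larger and would need expert review.
LOOKALIKES = [
    ("a", "\u0430"),            # Latin a, Cyrillic a
    ("o", "\u043e", "\u03bf"),  # Latin o, Cyrillic o, Greek omicron
    ("p", "\u0440"),            # Latin p, Cyrillic er
    ("c", "\u0441"),            # Latin c, Cyrillic es
    ("e", "\u0435"),            # Latin e, Cyrillic ie
]


def script_of(ch):
    # Crude stand-in for a script property: the first word of the
    # character's Unicode name, e.g. "LATIN" or "CYRILLIC".
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def candidates(ch):
    # All code points assumed to share a glyph with ch (including ch itself).
    for group in LOOKALIKES:
        if ch in group:
            return group
    return (ch,)


def fewest_script_changes(label):
    # Dynamic programming over the label: for each possible script of the
    # last character, keep the cheapest (changes, text) pair seen so far.
    paths = {None: (0, "")}
    for ch in label:
        new_paths = {}
        for cand in candidates(ch):
            script = script_of(cand)
            for prev_script, (cost, text) in paths.items():
                step = 0 if prev_script in (None, script) else 1
                total = (cost + step, text + cand)
                if script not in new_paths or total < new_paths[script]:
                    new_paths[script] = total
        paths = new_paths
    # Ties are broken arbitrarily (by string order), which is one of the
    # problems with the approach: an all-ambiguous label has no right answer.
    return min(paths.values())[1]
```

for example, a label typed as "p", Cyrillic "а", "ypal" comes back as the
all-Latin "paypal", since the unambiguous "y" and "l" pin the label to one
script.  digits and hyphens would count as script changes in this sketch,
which is wrong, and is the sort of thing a real proposal would have to
deal with.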

another idea (which I like only slightly better) is to have two kinds
of ACE - one (using nameprep) for name-to-whatever lookups and another
(not using nameprep) for IDNs returned in PTR records.  That way,
nameprep can be more aggressive about folding together code points with
similar glyphs, because it doesn't affect names *returned* from DNS.
(unfortunately, you still have to nameprep names returned in CNAME 
records, NS records, MX records, etc.)
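
to illustrate the split, here is a minimal sketch in which lookups go
through an aggressively folded key while the name handed back for a PTR
query is stored untouched.  the Zone class, the FOLD table, and the use of
NFKC plus case folding as a stand-in for nameprep are all assumptions for
illustration, and the ACE encoding step itself is left out.

```python
import unicodedata

# Hypothetical look-alike folding, standing in for a more aggressive nameprep.
FOLD = str.maketrans({
    "\u0430": "a",  # Cyrillic a  -> Latin a
    "\u043e": "o",  # Cyrillic o  -> Latin o
    "\u0440": "p",  # Cyrillic er -> Latin p
    "\u0441": "c",  # Cyrillic es -> Latin c
    "\u0435": "e",  # Cyrillic ie -> Latin e
})


def lookup_key(name):
    # Stand-in for nameprep (NOT the real profile): NFKC normalization and
    # case folding, followed by the look-alike folding above.
    return unicodedata.normalize("NFKC", name).casefold().translate(FOLD)


class Zone:
    """Toy name store: folded keys for lookups, original names for PTR."""

    def __init__(self):
        self.forward = {}  # folded key -> address
        self.reverse = {}  # address -> name exactly as registered

    def register(self, name, address):
        self.forward[lookup_key(name)] = address
        self.reverse[address] = name

    def resolve(self, name):
        # Name-to-address lookup: the query is folded before matching.
        return self.forward.get(lookup_key(name))

    def ptr(self, address):
        # Address-to-name lookup: the stored name is returned unfolded.
        return self.reverse.get(address)
```

with this, a name registered as "Café.example" is found by a query that
uses a Cyrillic "с" in place of the Latin "c", yet the PTR answer still
reads "Café.example" rather than its folded form.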

actually, the ability to do lookups without having to force the names
to suffer nameprep munging - in other words, the ability to do 
name comparisons without requiring that the representations be identical -
might be the most compelling reason to eventually deploy a native
IDN query interface to DNS.  

Keith