[idn] a few more comments on the last call documents
Some smaller comments (most, but not all, editorial):
IDNA:
clause 1:
"src attribute in an HTML <IMG> tag" --> "'src' attribute in an XHTML 'img' tag"
(promote XHTML instead of HTML, and quote attribute and tag names)
clause 6.1:
"and can display domain names in any charset"; apart from the fact that
"any" is a very strong assumption here, I cannot make sense of that sentence.
clause 6.2:
"not only domain names in Unicode, but also in local script"; that
does not make sense
clause 6.3:
"this memo", memo?
clause 6.4:
"displaying the name with the replacement character"? The replacement
character does not figure in displaying any other character (except
in Linux xterm, but that's twisted); a replacement glyph (or one of
several) is used, however, when a proper glyph cannot be found.
"...SHOULD show the name in ACE format... This is to make it
easier to transfer the name correctly to other programs"; cut and paste
works very well also for strings that contain characters not in the
font used at the moment, so using an ACE display will be more of a problem
than a help. Please remove that UNHELPFUL recommendation.
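
To make the readability point concrete, here is a small sketch of what an
ACE form looks like next to the Unicode form. It assumes Python's built-in
"idna" codec (which implements the IDNA 2003 ToASCII/ToUnicode operations)
as a stand-in for whatever the application uses; the name "bücher.example"
and the helper names are purely illustrative.

```python
# Sketch only: Python's built-in "idna" codec stands in for an
# application's ToASCII/ToUnicode implementation.

def to_ace(name: str) -> str:
    """Convert a Unicode domain name to its ACE (xn--) form."""
    return name.encode("idna").decode("ascii")

def from_ace(name: str) -> str:
    """Convert an ACE domain name back to its Unicode form."""
    return name.encode("ascii").decode("idna")

unicode_name = "b\u00fccher.example"   # illustrative name only
ace_name = to_ace(unicode_name)

print(ace_name)                        # the form the draft wants displayed
print(from_ace(ace_name) == unicode_name)
```

The ACE string carries the same information, but it is not something a
user can be expected to read or compare by eye.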
"show...replacement character"; same error; display is done via glyphs
not via characters (the mapping is non-trivial)
clause 6.4: "all Unicode text is stored in logical order". I wish,
but this does not hold for the "logical_order_exception" characters,
nor for Khmer ROBAT, and need not hold for the Arabic presentation
forms.
General: There should be an opening for a transition to a long-term
non-ACE solution (probably based on UTF-8). The ACE-based solution
should be explicitly declared to be short-term.
stringprep:
clause 1.1: "aepfel" is a *fallback* for "Äpfel", they are not equivalent
at all. They sometimes (only sometimes) collate the same in German.
(One could argue that combining diaeresis and combining latin small
letter e are "more equivalent" (than the given example); but small
letter e as a diacritic above has fallen out of use.)
clause 3: [case] "mapping"; nameprep does not do case mapping, it does
case folding (not quite the same).
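
A minimal sketch of the difference, using Python's str.lower() as an
example of a case mapping and str.casefold() as an example of a case
folding. These are only approximations of the tables nameprep actually
references; the sample strings are my own.

```python
# Case mapping (lowercasing) versus case folding, illustrated with
# Python's built-in string methods. Not the nameprep tables themselves.

def lower_and_fold(s: str) -> tuple[str, str]:
    """Return (lowercased, case-folded) versions of s."""
    return s.lower(), s.casefold()

sharp_s = lower_and_fold("Stra\u00dfe")   # "Straße"
final_sigma = lower_and_fold("\u03c2")    # GREEK SMALL LETTER FINAL SIGMA

print(sharp_s)       # lowercasing keeps U+00DF, folding expands it to "ss"
print(final_sigma)   # lowercasing keeps U+03C2, folding gives U+03C3
```

Folding is meant for caseless comparison, so it may change characters
that are already lowercase; a lowercase mapping does not.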
clause 4: "for KC" --> "form KC"
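
For reference, a short sketch of what normalization form KC does and does
not unify, tied to the clause 1.1 example above. It assumes Python's
unicodedata module; the helper name nfkc is hypothetical.

```python
import unicodedata

def nfkc(s: str) -> str:
    """Apply Unicode normalization form KC."""
    return unicodedata.normalize("NFKC", s)

decomposed = "A\u0308pfel"     # A + COMBINING DIAERESIS
precomposed = "\u00c4pfel"     # LATIN CAPITAL LETTER A WITH DIAERESIS
fallback = "Aepfel"            # fallback spelling, not an equivalent

print(nfkc(decomposed) == precomposed)   # canonical equivalents are unified
print(nfkc(precomposed) == fallback)     # the fallback spelling is not
print(nfkc("\ufb01"))                    # compatibility ligature U+FB01
```

Form KC unifies canonically and compatibility-equivalent sequences only;
spelling fallbacks such as "ae" for a-diaeresis are outside its scope.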
nameprep:
clause 1.2: "SYMBOLS" does not occur in the tables; though it should
[as I mentioned in another comment]...
clause 3.1: Khmer "INHERENT" vowels are invisible, and should in most
            cases (if not all cases) be ignored, in IDNs in particular
            (there may be a few mappings that should be done for Khmer for
            mistaken encodings as characters)
clause 5.3: the replacement character is not used for display; it works
            at the character level, not at the glyph level
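
A small sketch of the character-level role of U+FFFD, assuming a Python
UTF-8 decoder with errors="replace"; the byte string is an invented
example. The replacement character ends up in the decoded text itself,
independent of any font or glyph selection.

```python
# U+FFFD REPLACEMENT CHARACTER is inserted by a decoder into the
# character data; no font or rendering step is involved.

raw = b"caf\xe9"   # Latin-1 bytes for "café", not valid UTF-8
decoded = raw.decode("utf-8", errors="replace")

print([f"U+{ord(c):04X}" for c in decoded])
print("\ufffd" in decoded)
```

What a display shows for a character without a glyph in the current font
is a separate matter, decided by the renderer.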
clause 5.8: 10646 does not deprecate any characters; but Unicode does
deprecate those characters.
Kind regards
/kent k