RE: [idn] Some new ideas in my updated draft
- To: idn@ops.ietf.org
- Subject: RE: [idn] Some new ideas in my updated draft
- From: Karlsson Kent - keka <keka@im.se>
- Date: Mon, 14 Feb 2000 18:15:25 +0100
- Delivery-date: Mon, 14 Feb 2000 09:16:58 -0800
- Envelope-to: idn-data@psg.com
> -----Original Message-----
> From: Dan [mailto:Dan.Oscarsson@trab.se]
> Sent: Sunday, February 13, 2000 5:03 PM
Dan writes in his draft:
- If a character can be represented in the local character set,
map it from UCS to local character set.
- If a character cannot be represented in the local character set,
map the UTF-8 octet sequence for the character to a hyphen ("-")
followed by the hex code of each octet as two characters per octet.
- If it was needed to down code because not all characters could be
represented in the local character set, all original hyphens
must be replaced by two hyphens ("--") and the entire string
MUST end with a single hyphen.
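For reference, the quoted rules can be sketched roughly as follows. This is only my reading of the text above, not code from the draft: the function name down_code, the choice of Latin-1 as the example "local" character set, the lowercase hex digits, and the reading "one hyphen per character, then all of its octets" are all assumptions. The result is returned as a string rather than as octets in the local character set.

```python
def down_code(name: str, local_charset: str = "latin-1") -> str:
    """Sketch of the down-coding rules quoted above (assumptions noted in the text)."""

    def representable(ch: str) -> bool:
        # A character is "representable" if the local codec can encode it.
        try:
            ch.encode(local_charset)
            return True
        except UnicodeEncodeError:
            return False

    # Rule 1 only: everything fits, so no down-coding and hyphens are untouched.
    if all(representable(ch) for ch in name):
        return name

    out = []
    for ch in name:
        if ch == "-":
            # Rule 3: original hyphens are doubled when down-coding was needed.
            out.append("--")
        elif representable(ch):
            out.append(ch)
        else:
            # Rule 2: a hyphen, then two hex digits per UTF-8 octet
            # (lowercase hex is an assumption; the draft text does not say).
            out.append("-" + "".join("%02x" % b for b in ch.encode("utf-8")))
    # Rule 3: the whole down-coded string ends with a single hyphen.
    out.append("-")
    return "".join(out)
```

Note that the output depends on which character set is passed as local_charset, so the same name yields different strings on different hosts.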
1) "The" local character set? Whose local character set? Such things
are these days usually personal preferences, or rather just personal
defaults swiftly overridden by "charset=", heuristics, or a plain temporary
change. Setting one non-UCS character encoding as "the" local one for
an entire organisation or the like is highly inappropriate.
2) Though I'm slightly, but only very slightly, more sympathetic to
"say the catalogue number (in hex)" type of fallbacks (than to CIDNUC-like
ones), it should really be the UCS "catalogue number" (as HTML, XML and
modern SGML use), and NOT be tied to any UTF or other encoding.
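To make the distinction concrete, here is a small sketch; the helper names are mine and purely illustrative. The UCS code point of a character is a single number regardless of encoding, while the hex octets depend on which UTF is chosen.

```python
def ucs_code_point_hex(ch: str) -> str:
    # The UCS "catalogue number": independent of any encoding form.
    return format(ord(ch), "x")


def octets_hex(ch: str, encoding: str) -> str:
    # Hex of the encoded octets: changes with the encoding chosen.
    return ch.encode(encoding).hex()


def html_numeric_reference(ch: str) -> str:
    # HTML/XML hexadecimal numeric character reference, which uses the code point.
    return "&#x%X;" % ord(ch)


ch = "\u0142"  # LATIN SMALL LETTER L WITH STROKE
print(ucs_code_point_hex(ch))       # 142
print(octets_hex(ch, "utf-8"))      # c582
print(octets_hex(ch, "utf-16-be"))  # 0142
print(html_numeric_reference(ch))   # &#x142;
```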
3) If any such fallback is to be used (occasionally), one needs to interpret
the "catalogue number" fallback before lowercasing+normalising, and thus
do "read-catalogue-number-fallback+lowercase+KC-normalise, then lookup".
Kind regards
/kent k