
[idn] Representing codepoints (RE: UTF-8 in Internet Drafts)




--On 31 October 2002 16:06 -0500 John C Klensin <klensin@jck.com> wrote:

> James and the rest of the JET team have been discussing (at my
> request) ways to present the characters in ASCII text that would
> make the document more comprehensible than the code points alone
> for those reading the document in ASCII text.

Advertising someone else's code, and only tangentially relevant to this list...

There's a Perl package on CPAN called "unidecode", maintained by Sean Burke. It can generate ASCII strings from any sequence of Unicode codepoints; in some cases, those ASCII strings can remind people what the characters were supposed to mean.

For Chinese, for instance, it spits out something resembling Pinyin (probably *is* Pinyin, but I don't know enough to guarantee that).

Health warning: I've only played with it....
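
For anyone who wants to experiment with the general idea without installing anything, here is a rough sketch in Python using only the standard library. To be clear, this is not unidecode and does not use its tables: it is a stand-in that prints each code point with its Unicode character name, plus a crude accent-stripping fallback. The function names describe_codepoints and strip_to_ascii are made up for this illustration.

```python
import unicodedata


def describe_codepoints(text):
    """Return one ASCII line per character: U+XXXX followed by its Unicode name."""
    lines = []
    for ch in text:
        # unicodedata.name() raises ValueError for unnamed characters unless
        # a default is supplied, so fall back to a placeholder string.
        name = unicodedata.name(ch, "<no name>")
        lines.append("U+%04X %s" % (ord(ch), name))
    return lines


def strip_to_ascii(text):
    """Crude ASCII approximation: decompose, then drop anything non-ASCII.

    This is NOT what unidecode does (assumption: a stand-in for illustration).
    It only helps for characters that decompose to an ASCII base letter
    plus combining marks; everything else is silently dropped.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    # Source kept ASCII-only by using escapes: e-acute and a CJK ideograph.
    for line in describe_codepoints("\u00e9\u4e2d"):
        print(line)
    print(strip_to_ascii("caf\u00e9"))
```

Run as a script, this prints "U+00E9 LATIN SMALL LETTER E WITH ACUTE", then "U+4E2D CJK UNIFIED IDEOGRAPH-4E2D", then "cafe". Note that the name of the CJK ideograph is just its code point again, and strip_to_ascii drops it entirely, which is exactly the gap a transliteration table of the kind unidecode provides is meant to fill.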

Harald