[idn] 7CE

To: idn@ops.ietf.org
Subject: [idn] 7CE
From: "D. J. Bernstein" <djb@cr.yp.to>
Date: 4 Oct 2001 15:26:30 -0000
Automatic-Legal-Notices: Copyright 2001, D. J. Bernstein. My transmission of this message to you does not constitute a copyright waiver or any other limitation of my rights, even if you have told me otherwise.
Mail-Followup-To: idn@ops.ietf.org

ASCII is a standard _encoding_ of characters as bytes: in other words, a
function from byte strings to character strings.

ASCII specifies that, for example, the byte string 100 122 45 45 102 111
111 represents the character string dz--foo.

UTF-8 is compatible with ASCII. The UTF-8 function, restricted to ASCII
byte strings, is exactly the ASCII function. 100 122 45 45 102 111 111
means dz--foo under UTF-8, just as it does under ASCII.

In contrast, if an encoding maps 100 122 45 45 102 111 111 to something
other than dz--foo, that encoding is _not_ compatible with ASCII.
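The compatibility test above can be checked mechanically. The following is only a sketch in Python: it decodes the same seven bytes under ASCII and under UTF-8, and then under EBCDIC code page 037. The EBCDIC choice is an assumption made here for contrast and is not an encoding mentioned in this message.

```python
# The byte string from the message, written as decimal byte values.
data = bytes([100, 122, 45, 45, 102, 111, 111])

# ASCII and UTF-8 are the same function on this input.
ascii_text = data.decode("ascii")
utf8_text = data.decode("utf-8")

# Assumption for illustration: EBCDIC (code page 037) as an encoding
# that assigns different characters to the same byte values.
ebcdic_text = data.decode("cp037")

print(ascii_text)                 # the ASCII reading of the bytes
print(utf8_text == ascii_text)    # UTF-8 agrees with ASCII here
print(ebcdic_text == ascii_text)  # EBCDIC does not, so it is not ASCII compatible
```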

``ASCII'' does not mean ``7-bit.'' It is simply not correct to refer to
Q-P-style Unicode encodings as ``ASCII compatible.'' ASCII is not merely
a set of numbers; it assigns _characters_ to those numbers.

The correct phrase is ``7-bit compatible.'' The encoded strings are
compatible with 7-bit channels. That's the point of these encodings.
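The difference between the two properties can be sketched in Python. This example uses MIME quoted-printable itself (the standard-library quopri module) as a stand-in for the Q-P-style encodings discussed on this list, and the sample string is hypothetical. It checks that every encoded byte fits in 7 bits, and that reading those bytes as ASCII does not give back the original characters.

```python
import quopri

# Hypothetical non-ASCII string chosen only for illustration.
original = "d\u017e"

# Encode the characters as UTF-8 bytes, then as quoted-printable.
encoded = quopri.encodestring(original.encode("utf-8"))

# 7-bit compatible: every byte of the encoded string is below 128.
seven_bit = all(b < 128 for b in encoded)

# Not ASCII compatible: the ASCII reading of the encoded bytes is a
# different character string from the one that was encoded.
ascii_reading = encoded.decode("ascii")

# Only the matching decoder recovers the original characters.
recovered = quopri.decodestring(encoded).decode("utf-8")

print(seven_bit)
print(ascii_reading == original)
print(recovered == original)
```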

---Dan
