
Re: Fwd: draft-yergeau-rfc2279bis-02.txt for STANDARD

To: Harald Tveit Alvestrand <harald@alvestrand.no>
Subject: Re: Fwd: draft-yergeau-rfc2279bis-02.txt for STANDARD
From: ned.freed@mrochek.com
Date: Fri, 10 Jan 2003 00:15:29 -0800 (PST)
Cc: Patrik Fältström <paf@cisco.com>, IESG <iesg@ietf.org>
In-reply-to: "Your message dated Fri, 10 Jan 2003 08:55:21 +0100"<167250000.1042185321@askvoll.hjemme.alvestrand.no>
References: <2A549448-2461-11D7-96C2-0003934B2128@cisco.com><167250000.1042185321@askvoll.hjemme.alvestrand.no>

--On Friday, January 10, 2003 07:02:57 +0100 Patrik Fältström
<paf@cisco.com> wrote:

> This is the issue. We talk about the number of bytes in the UTF-8 code.
>
> 4-byte sequences give a range up to 10FFFF.

That's what I thought -- I was having difficulty translating 4 bytes into
an FFFF limit.

Then I have no problems at all.

Neither do I.
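
The byte-count arithmetic in this exchange can be checked with a short Python sketch. It assumes the 1-6 octet bit layout of RFC 2279 (a lead octet followed by continuation octets that carry 6 bits each); the names LEAD_BITS and max_value are illustrative only and do not come from the draft.

```python
# Payload bits in the lead octet for each sequence length (RFC 2279 layout):
# 0xxxxxxx, 110xxxxx, 1110xxxx, 11110xxx, 111110xx, 1111110x.
LEAD_BITS = {1: 7, 2: 5, 3: 4, 4: 3, 5: 2, 6: 1}

UNICODE_MAX = 0x10FFFF


def max_value(n_octets: int) -> int:
    """Largest value the bit pattern of an n-octet sequence can hold."""
    # Every continuation octet (10xxxxxx) contributes 6 payload bits.
    bits = LEAD_BITS[n_octets] + 6 * (n_octets - 1)
    return (1 << bits) - 1


for n in range(1, 7):
    print(n, hex(max_value(n)))

# Three octets stop at FFFF; four octets are needed to reach 10FFFF.
print(len(chr(0xFFFF).encode("utf-8")))       # 3
print(len(chr(UNICODE_MAX).encode("utf-8")))  # 4
```

Under these assumptions, FFFF is the ceiling of the 3-octet form, while the 4-octet pattern could hold values up to 1FFFFF and is capped at 10FFFF by rule rather than by the bit layout.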

> After talking with Paul Hoffman and John Klensin I suggest the following:

> (a) Do _not_ reference Unicode instead of ISO-10646

I think ISO 10646 has adopted the 10FFFF limit too - probably published as
an amendment somewhere. The Unicode folks would know.

That's my recollection as well. I believe 10646 and Unicode are aligned
on this, due to the need to restrict UTF-8 to the same range as UTF-16.
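
As a rough illustration of why the UTF-16 range lands on 10FFFF, the following Python sketch combines the highest surrogate pair. The helper name from_surrogates is hypothetical; the surrogate ranges and the 0x10000 offset are the standard UTF-16 definitions.

```python
HIGH_START, HIGH_END = 0xD800, 0xDBFF  # 1024 high (lead) surrogates
LOW_START, LOW_END = 0xDC00, 0xDFFF    # 1024 low (trail) surrogates


def from_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 surrogate pair into a single code point."""
    return 0x10000 + ((high - HIGH_START) << 10) + (low - LOW_START)


# The highest pair yields the highest code point UTF-16 can express.
utf16_max = from_surrogates(HIGH_END, LOW_END)
print(hex(utf16_max))  # 0x10ffff
```

Each surrogate carries 10 bits, so a pair addresses 2^20 code points above FFFF, which is where the 10FFFF ceiling comes from.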

> (b) Say in some text that we move to Standard with just testing 1-4
> octets, changing the spec to cover just those values. That is, we should
> assume that the draft standard is for 1-4 octets, and therefore the full
> standard is too.

Saying that we eliminate the untested/unused feature of 5-6 octet encodings
is permitted under the rules for progression. We just missed eliminating
them at Draft Standard.

Yes, exactly.

> (c) Slip in a sentence somewhere (maybe as a security
> consideration) indicating that more than 4 bytes is possible in the future
> and that programs should not be designed on the assumption that they will
> never see more than four bytes. I.e., interoperability testing at <= 4
> is fine, but I'd hate to set someone up for a buffer overflow problem.

I think this is not likely to be needed; it should be OK to treat 5+ byte
encodings as a protocol error. But I could be wrong...

Actually, I think it is preferable to treat them as protocol errors, due to
the need for UTF-16 compatibility.
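
One way to treat 5+ byte encodings as a protocol error is to classify the lead octet and refuse anything that cannot start a 1-4 octet sequence. The Python sketch below is only an illustration of that idea, not text from the draft; the function name utf8_sequence_length is hypothetical, and a complete decoder would also have to reject overlong forms and values above 10FFFF, which this sketch does not attempt.

```python
def utf8_sequence_length(lead: int) -> int:
    """Return the sequence length announced by a lead octet.

    Raises ValueError for any octet that cannot start a 1-4 octet sequence.
    """
    if (lead & 0x80) == 0x00:  # 0xxxxxxx
        return 1
    if (lead & 0xE0) == 0xC0:  # 110xxxxx
        return 2
    if (lead & 0xF0) == 0xE0:  # 1110xxxx
        return 3
    if (lead & 0xF8) == 0xF0:  # 11110xxx
        return 4
    # Stray continuation octets (10xxxxxx), the old 5-6 octet lead octets
    # (111110xx, 1111110x), and 0xFE / 0xFF all end up here.
    raise ValueError(f"invalid UTF-8 lead octet 0x{lead:02X}")


# Python's built-in decoder takes the same stance on a 5-octet sequence.
try:
    bytes([0xF8, 0x88, 0x80, 0x80, 0x80]).decode("utf-8")
    rejected = False
except UnicodeDecodeError:
    rejected = True
print(rejected)  # True
```

Rejecting on the lead octet means the decoder never has to size a buffer for more than four octets per character, which also addresses the buffer overflow concern raised in (c).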

				Ned
