Re: draft-legg-ldap-gser-abnf-06.txt and friends


Yes, I hold Jeff's discuss. And, I had an email exchange with him about it today. I was planning to discuss it on the telechat tomorrow. The way forward is not obvious to me.


At 12:47 PM 5/28/2003 -0700, hardie@qualcomm.com wrote:
Hi Bill,
In reverse order: No, the questions will never end. Yes, you are
introducing a post-facto DISCUSS on a document (two, in fact!). Since
Yergeau has not been sent off, though, we can send questions on
the 5 & 6 byte issue to that author with no harm. Assuming the answer
there confirms that 5 & 6 byte UTF8 characters should not be subject
to the standard, we have a pretty clear path forward here--remove
those (and update his doc to note this change since the previous version).
I'll suggest that whoever is responsible for holding Jeff's discuss
(Russ, no?) let you hold it on this basis (unless there is another reason to hold
that discuss). We can then do a conditional approval until we get
an RFC editor note cleaning up this issue.
Does this make sense?

At 1:24 PM -0700 5/25/03, Bill Fenner wrote:
The "SafeUTF8Character" ABNF definition in these documents
includes the 5-byte and 6-byte encodings that draft-yergeau-rfc2279bis
seems to deprecate by not even mentioning them.  The ABNF in
draft-yergeau-rfc2279bis is also more specific, rejecting some
invalid UTF-8 sequences that this document would accept.

One possible solution is to change the definition of SafeUTF8Character from

      SafeUTF8Character = %x00-21 / %x23-7F /   ; ASCII minus dquote
                          dquote dquote /       ; escaped double quote
                          %xC0-DF %x80-BF /     ; 2 byte UTF8 character
                          %xE0-EF 2(%x80-BF) /  ; 3 byte UTF8 character
                          %xF0-F7 3(%x80-BF) /  ; 4 byte UTF8 character
                          %xF8-FB 4(%x80-BF) /  ; 5 byte UTF8 character
                          %xFC-FD 5(%x80-BF)    ; 6 byte UTF8 character

to


      SafeUTF8Character = %x00-21 / %x23-7F /   ; ASCII minus dquote
                          dquote dquote /       ; escaped double quote
                          UTF8-2 /              ; 2 byte UTF8 character
                          UTF8-3 /              ; 3 byte UTF8 character
                          UTF8-4                ; 4 byte UTF8 character

and reference <UTF8-2>, <UTF8-3> and <UTF8-4> from rfc2279bis.
Other solutions include copying the ABNF from rfc2279bis, or not
worrying about the overgenerosity of this ABNF and just deleting
the 5 and 6 byte versions.
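
To make the difference between the two definitions concrete, here is a rough Python sketch that transcribes each into a byte regex. It assumes the <UTF8-2>, <UTF8-3> and <UTF8-4> productions as they appear in RFC 3629 (the RFC that rfc2279bis was published as); the names LOOSE, STRICT, is_loose and is_strict are made up for illustration and come from neither document.

```python
import re

TAIL = rb'[\x80-\xbf]'                  # UTF8-tail
BASE = rb'[\x00-\x21\x23-\x7f]|""'      # ASCII minus dquote, or escaped dquote

# Original SafeUTF8Character: lead-byte ranges only, up to 6 bytes.
LOOSE = b'|'.join([
    BASE,
    rb'[\xc0-\xdf]' + TAIL,             # 2 byte
    rb'[\xe0-\xef]' + TAIL + b'{2}',    # 3 byte
    rb'[\xf0-\xf7]' + TAIL + b'{3}',    # 4 byte
    rb'[\xf8-\xfb]' + TAIL + b'{4}',    # 5 byte
    rb'[\xfc-\xfd]' + TAIL + b'{5}',    # 6 byte
])

# Proposed SafeUTF8Character, with UTF8-2/3/4 expanded as in RFC 3629
# (assumed here to match the rfc2279bis draft).
STRICT = b'|'.join([
    BASE,
    rb'[\xc2-\xdf]' + TAIL,                    # UTF8-2
    rb'\xe0[\xa0-\xbf]' + TAIL,                # UTF8-3
    rb'[\xe1-\xec]' + TAIL + b'{2}',
    rb'\xed[\x80-\x9f]' + TAIL,
    rb'[\xee-\xef]' + TAIL + b'{2}',
    rb'\xf0[\x90-\xbf]' + TAIL + b'{2}',       # UTF8-4
    rb'[\xf1-\xf3]' + TAIL + b'{3}',
    rb'\xf4[\x80-\x8f]' + TAIL + b'{2}',
])


def _matcher(alternatives):
    """Return a predicate that accepts zero or more matching characters."""
    pattern = re.compile(b'(?:' + alternatives + b')*')
    return lambda data: pattern.fullmatch(data) is not None


is_loose = _matcher(LOOSE)
is_strict = _matcher(STRICT)

if __name__ == '__main__':
    samples = {
        'overlong NUL (C0 80)': b'\xc0\x80',
        'surrogate (ED A0 80)': b'\xed\xa0\x80',
        'above U+10FFFF (F4 90 80 80)': b'\xf4\x90\x80\x80',
        '5 byte form': b'\xf8\x88\x80\x80\x80',
        'ordinary text': 'caf\u00e9 \u20ac'.encode('utf-8'),
    }
    for label, data in samples.items():
        print(label, is_loose(data), is_strict(data))
```

The first four samples are the kind of sequence the original definition accepts and the stricter ABNF rejects. Simply deleting the 5 and 6 byte alternatives would only exclude the last of those four.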

(Maybe we talked about this when 2279bis happened, but I must have
forgotten already -- why was it OK to delete the 5 and 6 byte
versions?  Will there never be UCS characters larger than U+10FFFF?
Should this change be mentioned in 2279bis in a "changes since 2279"
section?  Am I attempting to retroactively apply a DISCUSS on a
document?  Will the questions never end?)
