
Re: draft-legg-ldap-gser-abnf-06.txt and friends



The "SafeUTF8Character" ABNF definition in these documents
includes the 5-byte and 6-byte encodings, which draft-yergeau-rfc2279bis
seems to deprecate by not mentioning them at all.  The ABNF in
draft-yergeau-rfc2279bis is also more specific: it rejects some
invalid UTF-8 sequences that these documents would accept.
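
To make "invalid sequences that would be accepted" concrete, here is a
rough Python sketch.  It uses Python's built-in strict UTF-8 decoder as a
stand-in for the tighter rules; that substitution is my assumption and is
not taken from either draft.  Each sample fits one of the multi-byte
alternatives in the current SafeUTF8Character definition (a lead byte
followed by the right number of %x80-BF bytes), yet none is valid UTF-8.

```python
# Byte sequences that fit the loose "lead byte + N continuation bytes"
# shape but that a strict UTF-8 decoder rejects.
samples = {
    "overlong encoding of NUL": b"\xc0\x80",
    "surrogate U+D800": b"\xed\xa0\x80",
    "beyond U+10FFFF": b"\xf4\x90\x80\x80",
    "5-byte form": b"\xf8\x88\x80\x80\x80",
}


def is_valid_utf8(data: bytes) -> bool:
    """Return True if Python's strict decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Map each sample name to whether the strict decoder accepts it.
results = {name: is_valid_utf8(seq) for name, seq in samples.items()}
```

Every entry in `results` comes out False, while an ordinary character
such as "é" (bytes C3 A9) passes the same check.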

One possible solution is to change the definition of SafeUTF8Character from

      SafeUTF8Character = %x00-21 / %x23-7F /   ; ASCII minus dquote
                          dquote dquote /       ; escaped double quote
                          %xC0-DF %x80-BF /     ; 2 byte UTF8 character
                          %xE0-EF 2(%x80-BF) /  ; 3 byte UTF8 character
                          %xF0-F7 3(%x80-BF) /  ; 4 byte UTF8 character
                          %xF8-FB 4(%x80-BF) /  ; 5 byte UTF8 character
                          %xFC-FD 5(%x80-BF)    ; 6 byte UTF8 character

to

      SafeUTF8Character = %x00-21 / %x23-7F /   ; ASCII minus dquote
                          dquote dquote /       ; escaped double quote
                          UTF8-2 /              ; 2 byte UTF8 character
                          UTF8-3 /              ; 3 byte UTF8 character
                          UTF8-4                ; 4 byte UTF8 character

and reference <UTF8-2>, <UTF8-3> and <UTF8-4> from rfc2279bis.
Other solutions include copying the ABNF from rfc2279bis, or not
worrying about the overgenerosity of this ABNF and just deleting
the 5 and 6 byte versions.
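
For comparison, the two definitions can be sketched as Python byte
regexes.  This is illustrative only: the UTF8-2, UTF8-3 and UTF8-4 ranges
are my transcription of the rules as later published in RFC 3629, and I am
assuming they match the rfc2279bis draft.  The names LOOSE, STRICT and
safe_char are made up for this example.

```python
import re

TAIL = rb"[\x80-\xBF]"

# Multi-byte alternatives of the current SafeUTF8Character definition.
LOOSE_MULTI = (
    rb"[\xC0-\xDF]" + TAIL
    + rb"|[\xE0-\xEF]" + TAIL + rb"{2}"
    + rb"|[\xF0-\xF7]" + TAIL + rb"{3}"
    + rb"|[\xF8-\xFB]" + TAIL + rb"{4}"
    + rb"|[\xFC-\xFD]" + TAIL + rb"{5}"
)

# UTF8-2 / UTF8-3 / UTF8-4, transcribed from RFC 3629 (assumed to match
# the draft).
UTF8_2 = rb"[\xC2-\xDF]" + TAIL
UTF8_3 = (
    rb"\xE0[\xA0-\xBF]" + TAIL
    + rb"|[\xE1-\xEC]" + TAIL + rb"{2}"
    + rb"|\xED[\x80-\x9F]" + TAIL
    + rb"|[\xEE-\xEF]" + TAIL + rb"{2}"
)
UTF8_4 = (
    rb"\xF0[\x90-\xBF]" + TAIL + rb"{2}"
    + rb"|[\xF1-\xF3]" + TAIL + rb"{3}"
    + rb"|\xF4[\x80-\x8F]" + TAIL + rb"{2}"
)
STRICT_MULTI = UTF8_2 + rb"|" + UTF8_3 + rb"|" + UTF8_4


def safe_char(multibyte: bytes):
    """Build a SafeUTF8Character pattern around the given multi-byte part."""
    # ASCII minus dquote, or an escaped (doubled) dquote, or a multi-byte form.
    return re.compile(rb'(?:[\x00-\x21\x23-\x7F]|""|' + multibyte + rb")")


LOOSE = safe_char(LOOSE_MULTI)
STRICT = safe_char(STRICT_MULTI)


def accepts(pattern, data: bytes) -> bool:
    """True if the whole byte string is exactly one SafeUTF8Character."""
    return pattern.fullmatch(data) is not None
```

With these, accepts(LOOSE, b"\xc0\x80") is true while
accepts(STRICT, b"\xc0\x80") is false, and both accept b"\xc3\xa9".
Referencing the rfc2279bis rules rather than copying them would keep the
two documents from drifting apart if those ranges were ever revised.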

(Maybe we talked about this when 2279bis happened, but I must have
forgotten already -- why was it OK to delete the 5 and 6 byte
versions?  Will there never be UCS characters larger than U+10FFFF?
Should this change be mentioned in 2279bis in a "changes since 2279"
section?  Am I attempting to retroactively apply a DISCUSS on a
document?  Will the questions never end?)

  Bill