Re: draft-legg-ldap-gser-abnf-06.txt and friends
The "SafeUTF8Character" ABNF definition in these documents
includes the 5-byte and 6-byte encodings, which draft-yergeau-rfc2279bis
seems to deprecate by omitting them entirely. The ABNF in
draft-yergeau-rfc2279bis is also more specific: it rejects some
invalid UTF-8 sequences that the ABNF in these documents would accept.
One possible solution is to change the definition of SafeUTF8Character from
   SafeUTF8Character = %x00-21 / %x23-7F /   ; ASCII minus dquote
                       dquote dquote /       ; escaped double quote
                       %xC0-DF %x80-BF /     ; 2 byte UTF8 character
                       %xE0-EF 2(%x80-BF) /  ; 3 byte UTF8 character
                       %xF0-F7 3(%x80-BF) /  ; 4 byte UTF8 character
                       %xF8-FB 4(%x80-BF) /  ; 5 byte UTF8 character
                       %xFC-FD 5(%x80-BF)    ; 6 byte UTF8 character
to
   SafeUTF8Character = %x00-21 / %x23-7F /   ; ASCII minus dquote
                       dquote dquote /       ; escaped double quote
                       UTF8-2 /              ; 2 byte UTF8 character
                       UTF8-3 /              ; 3 byte UTF8 character
                       UTF8-4                ; 4 byte UTF8 character
and reference <UTF8-2>, <UTF8-3> and <UTF8-4> from rfc2279bis.
Other solutions include copying the ABNF from rfc2279bis, or not
worrying about the overgenerosity of this ABNF and just deleting
the 5 and 6 byte versions.
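
To make the difference between the two grammars concrete, here is a
rough Python sketch. It assumes the <UTF8-2>, <UTF8-3> and <UTF8-4>
productions as they were published in RFC 3629 (the RFC that
draft-yergeau-rfc2279bis became); the draft text may have differed in
detail. The names loose_accepts and strict_accepts are made up for this
illustration, and only the multibyte alternatives are modelled, not the
ASCII or dquote ones.

```python
import re

TAIL = rb"[\x80-\xBF]"  # UTF8-tail: any continuation byte

# Multibyte alternatives of the original SafeUTF8Character definition.
LOOSE = re.compile(
    rb"[\xC0-\xDF]" + TAIL                    # 2 byte
    + rb"|[\xE0-\xEF]" + TAIL + rb"{2}"       # 3 byte
    + rb"|[\xF0-\xF7]" + TAIL + rb"{3}"       # 4 byte
    + rb"|[\xF8-\xFB]" + TAIL + rb"{4}"       # 5 byte
    + rb"|[\xFC-\xFD]" + TAIL + rb"{5}"       # 6 byte
)

# UTF8-2 / UTF8-3 / UTF8-4, assumed to match RFC 3629 section 4.
STRICT = re.compile(
    rb"[\xC2-\xDF]" + TAIL                    # UTF8-2 (no C0/C1 overlongs)
    + rb"|\xE0[\xA0-\xBF]" + TAIL             # UTF8-3 (no overlongs)
    + rb"|[\xE1-\xEC]" + TAIL + rb"{2}"
    + rb"|\xED[\x80-\x9F]" + TAIL             # excludes surrogates D800-DFFF
    + rb"|[\xEE-\xEF]" + TAIL + rb"{2}"
    + rb"|\xF0[\x90-\xBF]" + TAIL + rb"{2}"   # UTF8-4 (no overlongs)
    + rb"|[\xF1-\xF3]" + TAIL + rb"{3}"
    + rb"|\xF4[\x80-\x8F]" + TAIL + rb"{2}"   # stops at U+10FFFF
)


def loose_accepts(seq: bytes) -> bool:
    """True if seq is one multibyte character under the original ABNF."""
    return LOOSE.fullmatch(seq) is not None


def strict_accepts(seq: bytes) -> bool:
    """True if seq is one multibyte character under UTF8-2/3/4."""
    return STRICT.fullmatch(seq) is not None


SAMPLES = {
    "U+00E9, 2 bytes": b"\xC3\xA9",
    "U+1F600, 4 bytes": b"\xF0\x9F\x98\x80",
    "overlong encoding of NUL": b"\xC0\x80",
    "surrogate U+D800": b"\xED\xA0\x80",
    "beyond U+10FFFF": b"\xF4\x90\x80\x80",
    "5 byte form": b"\xF8\x88\x80\x80\x80",
}

if __name__ == "__main__":
    for label, seq in SAMPLES.items():
        print(f"{label:26} loose={loose_accepts(seq)} strict={strict_accepts(seq)}")
```

The last four samples are the kind of sequence the original definition
lets through and the stricter productions reject, so simply deleting the
5 and 6 byte alternatives would only close the last of those gaps.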
(Maybe we talked about this when 2279bis happened, but I must have
forgotten already -- why was it OK to delete the 5 and 6 byte
versions? Will there never be UCS characters larger than U+10FFFF?
Should this change be mentioned in 2279bis in a "changes since 2279"
section? Am I attempting to retroactively apply a DISCUSS on a
document? Will the questions never end?)
Bill