
RE: [idn] Re: stringprep and unassigned code points



> At 9:15 PM -0800 10/29/01, Yves Arrouye wrote:
> >I think that at the end of the day, while we agree that adding more code
> >points to Unicode will break some applications that use a Nameprep released
> >before the code points were assigned,
> 
> We don't agree here. Following the rules in stringprep, no
> application will be broken by adding new code points. We (or, more
> accurately, Patrik Fältström and Mark Davis) specifically designed it
> that way. Applications taking queries should look at characters that
> they think are unassigned and pass them in the query.
>
> [...]
>
> I heartily request that people read this section of the stringprep
> draft and, if you find places where we can make it clearer, by all
> means let me know soon.

Well, I heartily request that people read my emails carefully, or reply with
something other than "you're wrong" (not you, Paul). I did read section 6 of
Stringprep, even before Patrik suggested I do so (would make sense,
right?). My understanding is still that the following scenario is possible
(rephrasing it, since my previous attempts at quoting Stringprep section 6
were apparently unproductive):

- Application A uses a profile of Stringprep where, say, U+20000 and above are
unassigned, because U+20000 was not in Unicode at the time that application
was written. It generates queries for server S, getting Unicode code points
from an operating-system-provided input method. Per the Stringprep spec, and
your explanation above, when A gets U+20000 it passes it untouched in the
query.

- Server S benefits from a more recent implementation whose profile of
Stringprep has U+20000 to U+200FF assigned, and it has some mapping for
U+20000. Let's say we're talking about DNS (what people here relate to the
most), and it is using a Nameprep version with a map that says:

	20000; 20021; Case map

(I could use deletion as well, to get around David's argument about bicameral
scripts.) Strings that have been Name/String-prepped for server S have
U+20000 mapped to U+20021. As a result, S and A cannot understand each other
when S receives a string that has U+20000 in it: A's query still carries
U+20000, while the prepped strings S holds carry U+20021. That is, unless you
suggest that S should reprocess A's query, but I do not think that has been
suggested.
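
To make the scenario concrete, here is a minimal sketch in Python. Everything
in it is hypothetical: the table contents come from the made-up map entry
above, not from any real Nameprep table, and the function names prep_old and
prep_new are mine. It only models the mapping step, assuming A passes
unassigned code points through untouched and S stores strings prepared with
its newer map.

```python
# Hypothetical mapping tables, built from the made-up example entry
# "20000; 20021; Case map". Real Nameprep tables are not reproduced here.
OLD_MAP = {}                  # A's older profile: U+20000 is unassigned, no mapping
NEW_MAP = {0x20000: 0x20021}  # S's newer profile: U+20000 maps to U+20021


def apply_map(text, table):
    """Apply a one-to-one code point mapping; unmapped code points pass through."""
    return "".join(chr(table.get(ord(ch), ord(ch))) for ch in text)


def prep_old(text):
    """Model of application A: unassigned code points are passed untouched."""
    return apply_map(text, OLD_MAP)


def prep_new(text):
    """Model of server S: strings are prepared with the newer map."""
    return apply_map(text, NEW_MAP)


name = "\U00020000"       # the string typed by the user of A
stored = prep_new(name)   # what S holds after preparing the name
query = prep_old(name)    # what A sends in its query

direct_match = (query == stored)                 # S compares A's query as-is
reprocessed_match = (prep_new(query) == stored)  # S re-prepares A's query first

print("direct match:", direct_match)
print("match after S reprocesses the query:", reprocessed_match)
```

In this model the direct comparison fails and the comparison succeeds only if
S runs its own mapping over A's query, which is the reprocessing step I
mention above.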

I would be glad to learn that I am wrong in saying that this scenario
shows that some queries from older applications won't have the expected
result against more recent servers (older/recent as far as Stringprep goes).
But can you please point out where my reasoning goes wrong, then?

Thanks,
YA