Re: [idn] Document Status?
Someone once told me "You know you are done when you can't find things to
take out".
You don't complete a document by adding more stuff. You complete it by taking
out the noise and amplifying the signal.
-James Seng
----- Original Message -----
From: "JFC (Jefsey) Morfin" <jefsey@jefsey.com>
To: "IETF idn working group" <idn@ops.ietf.org>
Sent: Tuesday, September 03, 2002 6:22 PM
Subject: Re: [idn] Document Status?
> Dear Adam,
> thank you for your response. I think you perfectly described the problem. I
> will then make a bore of myself and list my existing (increased) concerns.
>
> After reading this mail and explanations which are not in the RFC, I still do
> not know for sure what is intended to be written. You also refer to
> external knowledge such as "Punycode".
>
> Should we not have a clear terminology section defining every word or
> concept we are going to use? All I know is that, semantically, "international
> names" could only mean the names which are unaltered by the ACE process.
> Here I understand they mean "the Unicode scripting which can be ACEd", so
> should it not at least be "internationalizable" (no action has occurred yet)?
>
> Also I understand that the RFC deals with the ACE process. The terminology
> section should therefore:
>
> 1. define the ACE process
> 2. define the group of the applications requiring ACE
> 3. define the international names : left unchanged by the process
> 4. define the non ACEable names
> 5. define the ACE labels
> 6. define the class of all the labels (ACE and International) which can
> be processed by an application requiring ACE
>
>
> On 04:24 03/09/02, Adam M. Costello said:
>
> >"JFC (Jefsey) Morfin" <jefsey@jefsey.com> wrote:
> >
> > > I must say that with my limited French speaking IQ I tried to figure
> > > out the meaning of "ACE" in the proposed text: sorry, but I was
> > > totally unable to grasp it.
> >
> >It is defined in the terminology section:
> >
> > An "internationalized label" is a label composed of characters from
> > the Unicode character set; note, however, that not every string of
> > Unicode characters can be an internationalized label.
>
> To me this is a Unicode label/script? No action has occurred yet?
>
> >That much is clear, yes?
> >
> > To allow internationalized labels to be handled by existing
> > applications, IDNA uses an "ACE label" (ACE stands for ASCII
> > Compatible Encoding), which can be represented using only ASCII
> > characters but is equivalent to a label containing non-ASCII
> > characters.
>
> IMHO, let us stop at "ACE label" and define what an ACE label is, without
> "can", which implies there would be other ways.
>
> >In other words, internationalized labels can contain non-ASCII
> >characters, which can't be handled directly by existing applications
> >that expect domain labels to be ASCII. Therefore, we instead use an
> >"ACE label", which is an ASCII label that is equivalent to a non-ASCII
> >label.
>
> All this added explanation is external and only adds to the text. Let us
> try to compact it into one single initial crystal clear definition?
>
> > More rigorously, an ACE label is defined to be any label that the
> > ToUnicode operation would alter.
>
> So an ACE label is here defined negatively. My feeling is that it reads as
> "when ToUnicode will fail", whereas what we mean is that ToUnicode
> (successfully) transforms into an ACE label.
>
> >That one sentence is the full and exact rigorous definition of the term
> >"ACE label". The rest of the explanation is there only to provide
> >intuition.
> >
> > For every internationalized label that cannot be directly
> > represented in ASCII, there is an equivalent ACE label. An ACE
> > label always begins with the ACE prefix defined in section 5.
>
> My first reading was puzzling: "whenever the ACE process will not work,
> there will be a pre-existing equivalent ACE label". This obviously does
> not make any sense.
>
> >Those are clear, yes?
> >
> >By the way, the notion of "equivalent label" is also defined in the
> >terminology section:
> >
> > In IDNA, equivalence of labels is defined in terms of the ToASCII
> > operation, which constructs an ASCII form for a given label.
>
> this means ACE label.
>
> > Labels
> > are defined to be equivalent if and only if their ASCII forms
> > produced by ToASCII match using a case-insensitive ASCII comparison.
>
> Then international names, i.e. names not modified by the ACE process (now
> reduced to ToASCII only? Should it not also be reversible, with ToUnicode
> resulting in the original scripting?), cannot be compared (I know
> it is wrong, but this is what I read here. Again sorry for my Frenglish,
> but I think here it helps).
>
> > Traditional ASCII labels
>
> What is "traditional"? Has it been defined?
>
> > already have a notion of equivalence: upper
> > case and lower case are considered equivalent. The IDNA notion of
> > equivalence is an extension of the old notion.
>
> Old notion? Is that the correct wording? Is that not the current and
> stable DNS notion?
>
> > Equivalent labels in IDNA
>
> Unicode scriptings and ACE label + non modified scriptings.
>
> > are treated as alternate forms of the same label, just as "foo"
> > and "Foo" are treated as alternate forms of the same label.
> >
> >Is that clear enough?
>
> Sorry to be boring. My point is not that I do not understand, but that the
> text reads as confusing. I am only trying to help make it clearer, based on
> my own personal reading difficulties.
>
> >Getting back to "ACE", maybe some examples would help:
> >
> >The Japanese phrase <sono><supiido><de> (pretend I wrote it using kana,
> >which are non-ASCII characters) could be an internationalized label. It
> >is not an ACE label, because it cannot be represented in ASCII.
>
> Well, I thought that an ACE label resulted from the ACE process and was not
> one of the labels left identical by the ACE process (IMHO both ways:
> "iesg---name.com" is perfect ASCII, but if ToUNICODEd it will have no
> meaning).
>
> >If you feed it to ToUnicode, it will not be altered, because the check
> >for the ACE prefix will fail.
>
> That holds if the registration process has prevented the creation of
> "iesg--" names. This is probably forbidden, but it should be mentioned,
> because when someone is going to use IDNA for a database entry, that
> filtering must be implemented in a consistent way.
>
> >There exists a label equivalent to <sono><supiido><de> that can be
> >represented in ASCII, namely IESG--d9juau41awczczp (where IESG-- means
> >the ACE prefix, whatever is eventually chosen). This is an ACE label
> >because it can be represented in ASCII and it is equivalent to a label
> >containing non-ASCII characters. If you feed IESG--d9juau41awczczp to
> >ToUnicode, it will be altered (it will become <sono><supiido><de>).
> >
> >The label helloworld is not an ACE label, because it is not equivalent
> >to any non-ASCII label. If you feed it to ToUnicode, it will not be
> >altered, because the check for the ACE prefix will fail.
>
> See above about the reminder that a filter is necessary for
> iesg--helloworld (please remember VRSN's problems with ASCII
> preregistrations of IDNs).
>
> >Those are the three normal cases. There are also a few corner cases,
> >labels that begin with the ACE prefix but are not ACE labels:
> >
> >The label IESG--foo-bar-2 is not an ACE label, even though it begins
> >with the ACE prefix, because it is not equivalent to any non-ASCII label
> >(because the Punycode part is invalid).
>
> Punycode is not part of the document and should be introduced.
>
> >If you feed it to ToUnicode, it will not be altered, because the Punycode
> >decoding step will fail.
> >
> >The label IESG--3ba is not an ACE label, even though it begins with the
> >ACE prefix and the Punycode part is valid, because it is not equivalent
> >to any non-ASCII label (because it is not nameprepped;
>
> Is name preparation part of the process? If yes, it has to be included in
> the ACE or ToASCII conditions above. If not, this restriction does not apply
> as such; it can only be noted. There may be a lot of variations in the
> Unicode scripting that the users/developers may want to result in the same
> ACE_label.
>
> >it decodes to a
> >capital A with grave accent). If you feed it to ToUnicode, it will not
> >be altered, because the comparison in step 7 will fail.
>
> You understand that this holds only as long as DNS does not support à, but
> that other applications may. The definition given above talks about
> "existing applications": there are a lot of existing applications
> supporting it, and at any given time in the future there will be more.
>
> So it means that we also want to define the ACE character set.
> jfc
>
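
The definitions debated in the quoted message ("an ACE label is any label
that the ToUnicode operation would alter", and "labels are equivalent if and
only if their ASCII forms produced by ToASCII match") can be tried out with a
small sketch. It is only an illustration under assumptions that are not in the
thread: it uses Python's standard encodings.idna module, it writes "xn--"
(the prefix eventually chosen) where the message writes the placeholder
"IESG--", and the helper names to_unicode_lenient, is_ace_label and
equivalent are made up for this example. The stdlib ToUnicode raises an error
where the quoted text says the label is left unaltered, so the wrapper maps
any failure to "unchanged".

```python
from encodings import idna  # stdlib ToASCII / ToUnicode (assumption: RFC 3490 behaviour)


def to_unicode_lenient(label: str) -> str:
    """Hypothetical helper: ToUnicode that returns its input when a step fails.

    The stdlib function raises UnicodeError (bad Punycode, failed round-trip
    comparison, non-ASCII input); here every failure means "not altered".
    """
    try:
        return idna.ToUnicode(label)
    except UnicodeError:
        return label


def is_ace_label(label: str) -> bool:
    """An ACE label is any label that ToUnicode would alter."""
    return to_unicode_lenient(label) != label


def equivalent(a: str, b: str) -> bool:
    """Labels are equivalent iff their ToASCII forms match, ignoring ASCII case."""
    return idna.ToASCII(a).lower() == idna.ToASCII(b).lower()


if __name__ == "__main__":
    # The four cases from the quoted message, with "xn--" in place of "IESG--".
    for label in ("xn--d9juau41awczczp", "helloworld", "xn--foo-bar-2", "xn--3ba"):
        print(label, is_ace_label(label))
    # The traditional case-insensitive notion is a special case of equivalence.
    print(equivalent("foo", "Foo"))
```

Under these assumptions only the first label should be reported as an ACE
label. The other three are left unaltered for the reasons given in the quoted
message: no ACE prefix, an invalid Punycode part, and a decoded form that is
not nameprepped.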