Re: [idn] ietf london idn wg meeting minutes
One note:
The minutes are a combination of notes from David Lawrence and Donald E.
Eastlake. Thanks for being the wg scribes.
-James Seng
----- Original Message -----
From: "Marc Blanchet" <Marc.Blanchet@viagenie.qc.ca>
To: <idn@ops.ietf.org>
Sent: Wednesday, August 22, 2001 2:17 AM
Subject: [idn] ietf london idn wg meeting minutes
> Included is the first version of the idn wg meeting minutes.
> Please send editorial comments to me, substantive comments to the
> mailing list.
>
> Marc.
>
> ===========================================
>
> IETF IDN Working group session
> 9 August 2001
> London, England
>
> Agenda Bashing:
>
> Agreement on the floor to cut
> Reordering, nameprep update, Uname proposal, Hangulchar, tsconv
> from planned agenda.
>
> ================================================================
>
> WG UPDATE, Marc Blanchet
>
> Coordination with other groups/efforts:
> - IETF apps area
> - "requirements" for encoding: ACE or UTF8
> - directory efforts: directory@apps.ietf.org
> - Unicode/ISO
> - Any modifications to Unicode/ISO tables should be done by those
> parties, not IETF
> - IETF dnsext WG
> - Any modification to DNS protocol should be discussed in dnsext
> - ICANN/IANA
> - Policies
>
> - Pool W, pool of documents that identify core of interest by WG
> - Currently:
> - requirements
> co-chairs believe there is a wg rough consensus and intend to
> forward it to IESG for Informational.
> - idna
> - nameprep
> - dude
> - aceid
> - jpchar
> - ace-eval-jp
> - mace
> - uname
> - tsconv
> - udns
> - amc-ace-z
> - hangeulchar
> - lsb-ace
> - Today's focus is on standards track proposals
>
> ================================================================
>
> ACE EVALUATION WITH IDNs ALREADY REGISTERED, Yoshiro Yoneya
>
> - Done by CNNIC, KRNIC, TWNIC and JPNIC with data they have for
> registered domain names, focusing on ACEs in Pool W.
> - Most important evaluation criterion is to maximize the number of
> characters; raw speed is less important because nameprep is the
> slow stage.
> - Long IDNs (more than 15 Han characters) are already registered.
> - Evaluated ACEs: DUDE, AMC-ACE-Z, MACE and RACE
> - Focus on DUDE and AMC-ACE-Z with MACE and RACE as reference
>
> Graphs of efficiency of domain names from each of KRNIC and TWNIC,
> where AMC-ACE-Z shows the best compression
>
> Charts showing that the four NICs consider AMC-ACE-Z to be either good
> or very good, while others were "bad" or "very bad" for at least one
> NIC.
>
> MACE co-authors (including the presenter, Yoneya-san) support
> AMC-ACE-Z.
>
> Recommendation from the study is: AMC-ACE-Z
>
> ================
> WG Questions for sense of the group:
>
> Question: If there is a need for an ACE, choose one:
>   - DUDE                                 few hands
>   - AMC-ACE-Z                            most hands
>   - MACE                                 (removed at request of authors)
>   - don't care but want an ACE chosen    fair bunch of hands
>
> Erik Nordmark: question is, if you use an ACE, this is the one. Not
> saying you need to use an ACE anywhere.
>
> ?: What is re-ordering?
>
> James Seng: pre-processing to make more frequently used chars more
> compressed.
>
> Paul Hoffman: Not a binding vote. Should be confirmed on mailing list.
>
> Consensus: AMC-ACE-Z (with many not caring so long as one is chosen)
>
> Should we do reordering?
> - Yes some hands
> - No some hands
> No clear result of poll.
>
> Erik Nordmark: A lot of people participated in the former poll but
> not the latter. Why?
>
> Bill Manning: We read the draft but didn't understand it, and need to
> read it again.
>
> Paul Hoffman: Don't understand the re-ordering draft. Does not
> broaden to other scripts.
>
> Larry Masinter: Re-ordering adds complexity.
>
> Kilnam Chon: Re-ordering is critical for CJK but adds complexity.
>
> Paul Hoffman: This draft adds complexity, so perhaps people are
> waiting to decide how to judge whether the added complexity is worth
> it.
>
> Eric Chen: This is just intended to help CJK. Most of the interest
> is in CJK. Why not?
>
> James Seng: What I'm hearing is that the authors should do a
> cost/benefit analysis, but it is clear the draft is not ready to move
> forward.
>
> Erik Nordmark: Can someone do a pro/con analysis draft, or someone do
> pro and someone con, to help drive the discussion on the mailing list?
>
> Paul Hoffman: Let's make Adam [Costello] do it. [laughter]
>
> Kilnam Chon: This straw poll process isn't really valid because not
> enough representation from people for whom this is really important.
> There's always a trade-off.
>
> James Seng: Could someone who voted against the lsb draft just explain
> why you are against it?
>
> Paul: I'd rather someone else did, but I will ... the reordering draft
> is somewhat of a hack to optimize for certain scripts, but it is at
> the cost of other scripts, isn't really generalized, and there has
> been no analysis of how beneficial it is for DUDE and AMC-ACE-Z.
>
> Dongman Lee: The author was not trying to propose this as a
> generalized mechanism. It is not surprising that since CJK is driving
> internationalization, that proposals would be specific to that.
>
> Ted Hardie: As Paul pointed out, this has different effects on
> different scripts, but now that we are focused on one ACE we can ask
> more specifically for the authors to focus on just how it affects
> AMC-ACE-Z.
>
> Consensus: discuss the reordering on mailing list and request authors
> of ACE and reordering to come to a proposal with analysis.
>
> ================================================================
>
> MATCHING (NAMEPREP)
>
> - Need for a standardized pre-processing step regardless of what IDN
> protocol we choose?
> Yes lot of hands
> No one hand
>
> (Discussion clarified the question from the original.)
>
> Other comments:
>
> Patrik Faltstrom: Doesn't preclude other pre-processing before it,
> which some people have worried it would. But even so, IETF really
> needs to have one standard way of processing Unicode.
>
> James Seng: When you say one standard way, do you mean one with
> flexibility for locale, or essentially fixed?
>
> Patrik Faltstrom: Essentially fixed.
>
> Dave Crocker: I thank Patrik for his comments that helped clarify
> things for me. I used to be resistant to it, but am coming to accept
> it. It is quite a bit like the case-insensitive/sensitive thing we're
> so used to in ASCII. There are two processes here: case-mapping and
> determining the legal character set. Keep them cleanly separate.
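
To make the two-process split concrete, here is a rough sketch of keeping the mapping step and the legal-character check separate. It is not taken from the nameprep draft: the function names and the tiny prohibited set are invented for illustration, and case folding followed by NFKC is only an approximation of what a real profile specifies.

```python
import unicodedata

# Hypothetical prohibited set, for illustration only; a real profile
# would publish its own tables.
PROHIBITED = {" ", "/", "@"}

def map_label(label: str) -> str:
    """Stage 1 (mapping): fold case, then apply NFKC normalization."""
    return unicodedata.normalize("NFKC", label.casefold())

def is_permitted(label: str, prohibited=PROHIBITED) -> bool:
    """Stage 2 (legal character set): reject prohibited code points."""
    return not any(ch in prohibited for ch in label)

def prepare(label: str) -> str:
    """Run the two stages in order, keeping them cleanly separate."""
    mapped = map_label(label)
    if not is_permitted(mapped):
        raise ValueError(f"label contains a prohibited character: {label!r}")
    return mapped
```

Because the stages do not depend on each other's tables, the prohibited set could change without touching the mapping step.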
>
> ? Russell:
>
> Wenhui Zhang: Should have a standard that includes where local issues
> can be defined, which can include their standardized pre-processing.
>
> ?: Goal of working group is noble, but we are trying to kill all the
> birds with one stone, and so we need a really large stone. So many
> legacy systems are optimized for their local languages, and will have
> a lot of pain to switch to what is being planned. They don't have
> much of a voice here, those who are going to suffer most.
>
> ?: Look into what happened in the LDAP group, how they ended up with a
> bunch of language-specific things. It is difficult, but it can be
> done, and since it has already been solved, build on it.
>
> Erik Nordmark: Can we get back on the topic of this question? We seem
> to be wandering into the general requirements area.
>
> POLL: Many to 1 in favour of standard pre-processing step.
>
> Post poll:
>
> John Klensin: I can agree that a standard pre-processing step is
> needed, but I can't agree if that necessarily means having a single
> binary result even in ambiguous situations. Very concerned about
> that. This working group might be resulting in something that is
> totally irrelevant.
>
> Eric Brunner-Williams: The ambiguity need not exist in "uniprep" (the
> first of the stages observed by Dave Crocker), the problem arises in
> the other part.
>
> Paul: I think we should now work toward an architecture that includes
> pre-nameprep, nameprep and post-nameprep. The middle one can be
> generally standardized while the other stages need not be.
>
> Erik Nordmark: Addressing John's concern of irrelevance, I can see how
> this work would eventually be superseded by something better, but that
> doesn't mean we have to stop doing this very useful work now.
>
> Dave Crocker: Dealing with "language" is out of scope for this group;
> this working group should just be about expanding the set of strings
> that are usable as domain names. In that context canonicalization
> makes a lot of sense, but not when we start talking about natural
> language.
>
> Ted Hardie: I have to take exception to Paul describing a system that
> is not standardized end-to-end; it can't include processing that is not
> standardized. Also agree with Dave that we can't work with natural
> language, we don't have the expertise.
>
> ?: Rigorously avoid natural languages.
>
> Eric Chen: We need to consider natural language!
>
> Dave Crocker: The scope is very narrow and does not include languages.
>
> Harald: "Yes."
>
> Paul Hoffman: Please defer all questions of language, there will be a
> draft soon that addresses where it should be addressed.
>
> Next step will be for the authors to clarify the relation between
> the various proposals for processing into a cohesive architecture,
> namely nameprep, tsconv, jpchar, hangeulchar.
>
> ================================================================
>
> PROTOCOL PROPOSALS, Dave Crocker
>
> Dave's Disclaimers:
> - System oriented person
> - Not a Unicode expert, or even naif
> - Entirely biased -- wanted to be objective, but failed
>
> IDN Task:
> - Enhance range of domain names that are useful
> - Not human "name"
> - Not "language"
> - Has no sets
> - Requires: fairness, efficiency, reliability, transition, ...
>
> The Usual Suspects:
>      Encoding            Approach
>   1. ACE only            IDNA
>   2. UTF-8 only          IDNA-mod, uDNS
>   3. ACE then UTF-8      IDNA-mod, uDNS
>   4. ACE & UTF-8 both    uDNS, uNAME
>   5. Anything goes       uNAME
>
> Encoding efficiency:
> - ACE is an encoding scheme
> - UTF-8 is an encoding scheme
> - Both map many bits to a variable length string
> - All variable length strings are unfair to some people
> - Fair vs unfair unfairness:
> - longer mappings mean shorter names
> - shorter names restricted to information dense character sets
>
> Encoding comparison:
> 1. ACE is three minuses bad.
> 2. UTF-8 is two minuses bad.
>
> Charts showing that there are a lot of modules in both systems, and we
> have to worry about all the modules in both systems.
>
> ACE has an extraordinarily minimal amount of change necessary to make
> an IDN useful, just two applications. This is about as good a
> transition scheme as you can possibly get.
>
> UTF-8 is an extreme in the opposite direction, it requires that
> everything work end-to-end.
>
> 1. ACE only four pluses good
> 2. UTF-8 only five minuses bad
>
> Transition Interactions:
> ------------------------------------------------------------------------
>                  Client->  Server->
>                  Server    Client    ACE          UTF-8
> ------------------------------------------------------------------------
> 1. old client,   old dn    new dn    transparent  UTF-8 and ACE
>    new server                                     maybe break client
> ------------------------------------------------------------------------
> 2. new client,   new dn    old dn    transparent  break server?
>    old server
> ------------------------------------------------------------------------
>
> Specification comparison:
>
> ------------------------------------------------------------------------
>                Efficiency        Transition               Risk/Operational
>                                                           Expense
> ------------------------------------------------------------------------
> IDNA (ACE)     bad (data)        automatic                none
> ------------------------------------------------------------------------
> uDNS (UTF-8)   poor (data)       how to detect            high
>                                  when to use ACE?
>                                  (poorly defined
>                                  and not realistic)
> ------------------------------------------------------------------------
> uName (both)   bad (round trip)  unstated                 very, very
>                                  (and based on CNRP,      high
>                                  with no meaningful
>                                  deployment)
> ------------------------------------------------------------------------
>
> Olafur: Hard for me to say this to you Dave, given our history, but
> good job.
>
> Harald: Think you underestimate the cost of ACE a bit, in that leakage
> will confuse users. UTF-8 leakage will also confuse users, though,
> likely even a bit more! But the ranking is still good.
>
> Paul: uName doesn't actually have CNRP in it; it was put in the draft
> and then explicitly shot down in the draft. It uses a new RR, but the
> end result is pretty much the same as far as your conclusions go.
>
> Erik Nordmark: Can we vote on it without a UTF-8 draft in the pool?
> Would need a draft very fast.
>
> Poll:
> - idna?
> Yes Most
> No Some
> - udns?
> Yes Few
> No Most
> - uname?
> Yes Few
> No Most
>
> Interpretation by Harald and Marc was that: IDNA was the only strongly
> supported proposal in the room and the other two had
> strong opposition. Interpretation was agreed by the floor.
>
> Back to the nameprep discussion (some time remaining)
>
> Paul Hoffman: Good (from a marketing sense) user interfaces will do a
> lot of mucking with input. Really should have it defined how and where
> they
> can do that. If you change machine, different local translation
> tables can yield different names.
>
> James Seng: It can be very hard to determine what local conversion
> option to turn on. Not sure if this wg has capability to deal
> with codepoint matching. We need to reference code points outside
> the IETF, at Unicode Consortium.
>
> Paul Hoffman: Unicode has put mapping tables out of scope.
>
> Harald Alvestrand: This working group is internationalized access to
> domain names, not localized. This group is trying to specify what a
> client must do no matter where it is in the world. I would accept a
> statement of the relationship between the pre-processing drafts. It
> has to be made mandatory though or it should not be part of the output
> of this group.
>
> Wenhui Zheng: IDNA draft should be explicit where the local
> interface/mapping should be done.
>
> Eric Chen: We have built a house and opened some gates but not others.
> Some languages can come in and others can not. IDNA should open its
> gate to allow other languages to do their thing.
>
> ================================================================
> NEXT STEPS
>
> - AMC-ACE-Z as chosen ACE.
> - Reordering to be discussed on mailing list.
> - relation between nameprep/tsconv/hangeulchar/jpchar/stringprep
> to be consolidated into one architecture.
> - Go forward with IDNA.
>
>
>