
Re: [idn] hostname history hell



These are very sensible rules. We have 20,000+
CJK characters at hand to be resolved, and cannot
afford to dive into more variations at this stage.
However, I agree with Tim that there is a way to represent
some drawing characters in IDN, but not in DNS.
Many Latin-language users will benefit from a proper
solution to the 20,000+ CJK problem. If you can think
of the whois database as a broader DNS database,
instead of an IDN database (of which there is yet to be a solid
image, or which is like "a left-wizzlepop" plus a "left-popplewiz"),
then what you are worrying about, "market research",
may be relaxed.

Liana

> > John C Klensin wrote:
> > 
> >> It was also noted at the time that hyphen was a common
> >> argument ("qualifier") introducer in some command languages
> >> and a common introducer for negative numbers in input strings
> >> in others, and hence better avoided.
> > 
> > The argument-introducer aspect is probably still applicable.
> > Should we keep the no-leading-hyphen rule for i18n host
> > identifiers? We're not explicitly tagging -GW or -TAC
> > postfixes so that is not required (but may be desirable for
> > compatibility).
> 
> My own personal position/bias is that we should be as
> conservative about what is permitted in an IDN as possible
> consistent with meeting the requirement for identifiers drawn
> from any of the world's languages or combinations of them.  Put
> differently, I think the general model of the old "hostname"
> rules has served us well.  I would like to think of the IDN work
> as expanding that model to include additional alphabetic and
> ideographic characters, rather than discarding the model and
> seeing how much "stuff" we can put in.
> 
> If a too-restrictive model turns out to be a mistake, it is
> possible to expand it later (just as "leading digit" was
> unblocked); if we adopt a model that turns out to be too broad,
> there is probably no way back.
> 
> On that basis, my inclination would be to:
> 
> 	* continue to prohibit leading (or trailing) hyphens
> 	
> 	* continue to prohibit all spacing characters
> 	
> 	* continue to prohibit all punctuation characters except
> 	for that hyphen and the label-separating period (full
> 	stop, ".")
> 	
> 	* prohibit, in the spirit of the hostname rules, all
> 	symbol and drawing characters
> 
> We don't _need_ them for identifiers.  Some of them will, sooner
> or later, run up against a legitimate command language or cause
> "interesting" lexical parsing problems (even if they don't cause
> problems in today's URI syntax definition).  High risks,
> marginal benefit.
> 
> Just my opinion, of course, but I have these scars...
> 
>     john
> 
>
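
The four rules John lists above could be sketched roughly as follows.
This is only an illustration, not anything proposed on the list: the
function names are hypothetical, and mapping "spacing", "punctuation",
and "symbol and drawing characters" onto the Unicode general categories
Z*, P*, and S* is an assumption of this sketch.

```python
import unicodedata


def label_problems(label):
    """Return the reasons a single label breaks the four quoted rules.

    Assumption: Unicode general categories stand in for the rule classes
    (Z* = spacing, P* = punctuation, S* = symbol/drawing). Other checks a
    real IDN profile would need (length, normalization, control
    characters other than whitespace) are deliberately left out.
    """
    if not label:
        return ["empty label"]
    problems = []
    # Rule 1: no leading or trailing hyphen.
    if label.startswith("-") or label.endswith("-"):
        problems.append("leading or trailing hyphen")
    for ch in label:
        if ch == "-":
            continue  # the hyphen is the one punctuation mark allowed inside a label
        cat = unicodedata.category(ch)
        if cat.startswith("Z") or ch.isspace():
            # Rule 2: no spacing characters (isspace also catches tab, newline).
            problems.append(f"spacing character U+{ord(ch):04X}")
        elif cat.startswith("P"):
            # Rule 3: no punctuation other than the hyphen.
            problems.append(f"punctuation character U+{ord(ch):04X}")
        elif cat.startswith("S"):
            # Rule 4: no symbol or drawing characters.
            problems.append(f"symbol character U+{ord(ch):04X}")
    return problems


def name_is_acceptable(name):
    """Split on the label-separating period and check every label."""
    return all(not label_problems(label) for label in name.split("."))
```

Under these assumptions, name_is_acceptable("-gw.example") and
name_is_acceptable("a_b.example") are both rejected (leading hyphen;
underscore is punctuation), while a name made only of letters, digits,
and interior hyphens passes.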