
Re: Combining characters (was: Re: [idn] hostname historyhell)

To: ehall@ehsco.com
Subject: Re: Combining characters (was: Re: [idn] hostname historyhell)
From: liana Ye <liana.ydisg@juno.com>
Date: Sun, 25 Nov 2001 14:55:37 -0600
Cc: idn@ops.ietf.org

Sorry, I forgot to check the Cc field.  Thank you for catching it.
I have also cleaned up this message a little for easier reading.

> > > liana Ye wrote:
> > >
> > > > I'd like to propose a more specific layering of IDN symbols:
> > > >
> > > >  From the top where the user input buffer offers:
> > > >
> > > >  Layer 3:  label separators and label order normalizing
> > > >
> > > >  Layer 2: Bidi label normalizing (or vertical label normalizing)
> > >
> > > What is the current display order for unstructured and structured
> > > data in right-to-left display systems? Does unstructured data ("the
> > > files are on server1") typically flow RTL, while URLs and other
> > > structured data display as LTR?
> > >

As far as I know, unstructured data is mostly LTR internally and is
displayed as RTL or swapped to a top-down display. But I don't
know how people handle Mongolian and Man text specifically.
Those are also the cases I had in mind when I mentioned Layer 3
and Layer 2.

> > > It seems that these questions are for the structured data groups to
> > > handle when they decide on an output presentation mechanism. But if
> > > URLs and other structured types will display RTL then that may affect
> > > us as well (your Layer 3 label ordering in particular).

Yes.  These issues have been raised before by Ben.  For
example, Chinese addresses run from general to specific:
 China, province, city, district, street, building#, apartment#

English runs the reverse way, but only partially:
  Building#, street, apartment#, city, state, country

From a processing viewpoint, the Chinese way produces less
confusion, while our TLD has been following the English way.
For IDN, if we are speaking of backward compatibility with
the current DNS, we do need to allow different local label
ordering. Which group is handling this part?

As far as this group is concerned, I think we need to come up with
1) a list of equivalent symbols to serve as label separators;
2) a list of special character processing protocols, most of
which have been in use since the first days of the ASCII
standard, for example:   $/%/?/*, which may indicate
how to handle special characters within a label.
I think this is the base for us to work on bidi and vertical
issues in Layer 2 normalization.
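
To make item 1 concrete, here is a minimal sketch of how a list of
equivalent label separators might be applied before labels are split.
The three non-ASCII code points and the names EQUIVALENT_SEPARATORS,
normalize_separators and split_labels are assumptions chosen for
illustration only; they are not a list or an interface this group has
agreed on.

```python
# Hypothetical equivalence list: code points treated the same as the
# ASCII full stop (U+002E) when separating labels. Illustrative only.
EQUIVALENT_SEPARATORS = {
    "\u3002",  # IDEOGRAPHIC FULL STOP
    "\uff0e",  # FULLWIDTH FULL STOP
    "\uff61",  # HALFWIDTH IDEOGRAPHIC FULL STOP
}


def normalize_separators(name):
    """Map every equivalent separator in `name` to the ASCII full stop."""
    return "".join("." if ch in EQUIVALENT_SEPARATORS else ch for ch in name)


def split_labels(name):
    """Split a name into labels after separator normalization."""
    return normalize_separators(name).split(".")


if __name__ == "__main__":
    # A name typed with mixed separators yields the same labels
    # as the all-ASCII form.
    print(split_labels("www\u3002example\uff0ecom"))
```

Only the separators are touched in this sketch; the characters inside
each label are left for the lower layers to handle.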

Once we get the two lists out of our way, we can have
a table of prohibitions and a sensible equivalence list to
work with before we broadly exclude all of them in
[nameprep].

 Liana

> > > >  Layer 1.5: diacritic marks and combining symbol normalizing
> > > >
> > > >  Layer 1:  IDN identifier matching or whatever comes out
> > > >                of [nameprep].
> > > >
> > > > The reason for Layer 1.5 is that these symbols can be
> > > > treated in a similar way with Han characters depending on
> > > > what architecture we end up with, and what ACE will be
> > > > our focus.
> > >
> > > --
> > > Eric A. Hall
> > > http://www.ehsco.com/
> > > Internet Core Protocols
> > > http://www.oreilly.com/catalog/coreprot/
