Re: [idn] Figuring out what can be displayed
Gentlemen,
I have discussed the IDNA issue here. I have corresponded with Adam Costello
on the list and off list, and I thank him for that. I have heard the
different remarks of many. I will now tell you how I will document and
implement the "IDNA" support of my axcess engine system.
I offer that to permit an analysis on a real project. I do not expect to be
criticized on religious/theoretical grounds but to be helped towards a
better end-user and OPES support, in a way creating the minimum discrepancy
or no discrepancy with the standard you propose.
I do this because I fight for my own business survival in a real
international world, not as a corporate manager or a distinguished
academic. I am only paid, if at all, by real customers with real users.
My premises:
1. I have lived for 25 years (since 1977) in the international data
communications world. I always made my money there for my family, except
since I came into the IP world. I still accept it as a long investment
period.
2. Since the very beginning I have been involved with the international
namespace (INS). It interconnected with the DNS in 1984. (There are some
differences in perspective here, quoted by John Klensin, who was not involved
at the time but has a command of Internet history. They probably come from
the passage of time, and from the fact that when you build a gateway there
are by nature a male and a female perception.) This is of low interest. What
is of interest is the experience: we must take advantage of the pros of both
sides and avoid the cons of both sides (I understand that is what he means).
This gives me a unique experience of the way the real world behaves; and some
more intuitive knowledge of the needs and constraints than those who related
on the matter with Govs, operators, market, corporates, law makers, end
users, local tech support, end-application developers. Sorry to mention that:
it is only to explain why I suppose this mail might be of use.
3. For 20 years the only thing which really worked and made the "Internet"
in most people's minds is the DNS part of the namespace. The reason why it
works is that it is stable, simple and, most of all, consistent with an
intuitive feeling for addressing.
I co-created the namespace in a form which developed all over the world,
but with semantics which would NOT have had the same impact as the DNS.
We started with the root names, then from LEFT to RIGHT (as in sorts,
disk directories, etc.) we permitted operators to add a hostname as an
extension, then sub-hosts. We created the hierarchy, but it was a mess,
because we had no zones (as in IP addresses). What the DNS brought was to
respond intuitively to the need of the users by sorting out that mess
clearly. The reversing from right to left at the gateway and the use of the
"." as a reversing indicator did it all. In RFC 920 the "prefix" of the time
was the suffix ".arpa".
(When you start with a unique thing you start with two: it and not it. Two
is by nature a family and a family grows).
It did it all, because the resulting addressing logic was brainware
consistent with postal addressing and EDI rules. Country last. And you can
keep adding details first. As on an envelope, the name (e-mail name) comes
first (and, as in the "vieille France" etiquette, the "@" is used at the
proper "à Monsieur," place, as it was created for this in the Middle Ages:
@ is Latin "ad"). This means a brainware consistent system with a habitus of
more than 8 centuries.
We kept BOTH systems working. We interfaced a few people to ARPA in the
right to left hierarchy and supported in left to right the numeric names X121
hierarchy. From 1982 to 1986 I carried out the same task as IDNA, but the
other way: instead of expanding from 38 characters to billions, I reduced
from 36 to 10. For the same public.
From experience, the reason why it worked (IMHO) is that we adapted to the
least able technology. We were using names and had to support digit-only
addresses. We used "numeric names". But we had all the capacities of the
names, so when Transpac was only using an IP-address-like scheme, we had
the flexibility of a complete, real-time and powerful database. So
we proposed value added services but not a new system.
I think the only way IDNA can succeed is the same way. Unicode is the
powerful system; DNS is the less powerful one. I am a seaman: the speed of
a convoy is the speed of the slowest vessel less the zig-zags. IDNA must be
considered as a DNS service, subject to the constraints of a dual system.
IDNA can only be a DNS value-added service, not an alternative system, even
an embedded one. The only alternative system would be to study, specify,
develop and experiment with a new DNS. IMHO it cannot be done in two minutes
(IMO it will take years and, due to the political implications, it is a big,
big task ahead which will involve the universities, Telcos, Govs and
communities of 190 countries). I also think it cannot be done in one shot.
This is why I suggest looking into DNS.2 (improving and stabilizing the
system, its operational architecture, its political insertion, etc., in a way
compatible with the current DNS; I do not even know if we need to increase
the character set). This is why I also work, on top of it, on extended DNS
services such as IDNA, authentication, access engines, etc. (DNS+). This is
the rationale of the Dot-Root proposition ( http://dot-root.com ), still in
its infancy and only partly documented in English, but gaining interest.
So I only consider serious and conforming use of the DNS: the rule and the
bible. Errors/tricks in using it are of no concern to me. Errors and tricks
do not make the basis for a rule (this is a basis of the Roman Law and of
social life).
There is a hierarchy of the addressing information. That hierarchy is
necessary to get the next piece of information. There is no use in knowing,
in ASCII or in Unicode, the forename of someone if you do not know the name,
the city and the country. The way you write that information does not make
it different.
Question: are there existing additional hierarchies in postal (brainware)
addressing? i.e. something the mailman needs to choose between different mail
destinations and/or mail paths?
- not the title (Mr, Mrs, Dr, M. Mme, Snr, M.M., Her, Esq. etc...)
- not the type of information: a nickname is accepted as long as the layer
below accepts it. I mean that "Jack" is accepted for "John" in an Irish
family and "Jefsey Morfin" will be accepted in the "Morfin" mailbox.
- partly the service: is it a hierarchy/an added piece of information in the
naming hierarchy? Postal services all over the world use "Port Payé" or
"Port dû", the same way all the DNS managers use "MX". The routing: "AirMail"
or "Par Avion". (Note: these hierarchies are not in the naming. They are
applied before, or in parallel to, the routing.)
- the scripting is an intuitive (real) hierarchy: is the scripting local to
the sender, to the sendee, or international? Usually only international and
sendee scriptings are accepted. The place where the discrimination is carried
out is the first point where a scripting is no longer understood. In some
cases it can go beyond through an alternative path, by adding "c/o some
go-between" who will translate.
4. Access engine. To understand the way I am going to use IDNA+, and why,
one has to understand the access engine concept. What I call "access
engines", and develop, are DNS resolvers which use an extended resolution
strategy. For example, my domain name is http://utel.net . There is no access
engine implemented there, but http://jefsey.morfin.utel.net could accept the
call as the default for utel.net and resolve the 3LD and 4LD as
jefsey.morfin.utel.net, morfin.jefsey, jefsey, morfin, vacations etc...
This kind of OPES provides a directory service but also an easy ULD (upper
level domain) management system, using an LDAP-like system or more advanced
directory solutions, etc...
This means that I have no problem introducing Punycode support transparently
in my existing names. http://U+xxxU+yyyyU+zzzz.utel.net are OK.
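As a rough illustration of what Punycode support for a single label involves,
here is a minimal sketch using the "punycode" codec that ships with Python's
standard library. The helper names to_ascii_label and from_ascii_label are
hypothetical, and no prefix is added, since the prefix is a separate matter.

```python
def to_ascii_label(label: str) -> str:
    """Encode one Unicode label into an ASCII-only Punycode string."""
    # The stdlib "punycode" codec implements the bare encoding only:
    # it adds no prefix and does no case folding or normalization.
    return label.encode("punycode").decode("ascii")


def from_ascii_label(encoded: str) -> str:
    """Decode a bare Punycode string back into the Unicode label."""
    return encoded.encode("ascii").decode("punycode")


if __name__ == "__main__":
    ascii_label = to_ascii_label("élève")
    # Round trip: the ASCII form decodes back to the original label.
    print(ascii_label, "->", from_ascii_label(ascii_label))
```

The point of the sketch is only that the transformation is reversible and
yields plain ASCII, so an existing name server can store the result unchanged.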
The access engine resolves in two ways: either as a reroute or as a normal
DNS server. This depends on the economics and on the user system
architecture. Both are transparent. http://jefsey.utel.net can resolve one
day as a reroute to http://jefsey.com and the next day as a CNAME for
http://jefsey.com. I note that the test for a French application supports
1,500,000 names with a target of 150,000,000, with some added value
variations (abbreviations, accents, etc.), which corresponds to billions of
names. The DNS could not support all this, but provides a stable, simple,
existing support for the ULDs.
These access names are either free or cheap and will be legally enforced
one day (we will support the national IDs, immotic telemates, etc). We
cannot plan to spend much energy to support billions of names for millions
of people.
Obviously these names must be transparent to the different accepted writings:
upper case, lower case, accented. "eleve" must be bijective with "élève".
That engine can to some extent accept the most common typos and
sound-alike wordings. But it will not easily accept that a French "O" is
replaced by a Greek "o".
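One way to picture this matching rule is a folding function that removes case
and accents before comparing names. The sketch below is an assumption about
how such a comparison could be done, not a description of the engine; fold
and same_name are hypothetical names. Strictly speaking the fold is
many-to-one, so it treats "eleve" and "élève" as the same access name rather
than building a true bijection.

```python
import unicodedata


def fold(name: str) -> str:
    """Return a comparison key with case and accents removed."""
    # NFD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", name)
    # Drop the combining marks, keep the base letters.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.casefold()


def same_name(a: str, b: str) -> bool:
    """True when two writings fold to the same key."""
    return fold(a) == fold(b)


if __name__ == "__main__":
    print(same_name("ELEVE", "élève"))      # accent and case variants match
    print(same_name("o", "\u03bf"))         # Latin o vs Greek omicron do not
```

Because the fold works on the letters' own code points, a Greek omicron is
never folded into a Latin "o": the two stay distinct.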
5. Legal issues. Because the IETF currently lacks documentation
distinguishing the software domain name, as an alphanumeric pointer to an IP
address, from the brainware mnemonic canonical/alias of a domain name, there
are out of context laws/jurisprudence such as ACPA, Whois, etc. that we must
live with.
I do not want to run into endless UDRPs and "à la Joe Sims" contractual
issues because a Chinese way to write something will print on all the
French non-IDNA displays and printers as "iesg-ibm.fr" or as a racist text.
I have not a minute nor a single penny to spend helping others understand
whether what will legally count is one or the other formula.
Now the "Jefsey's solution".
6. From all this (and other economy, marketing, social etc... points) my
implementation and explanation will be as follows.
a) there is a unique DNS naming sequence. That sequence uses the
International Domain Names System Character set, named "IDNScode". Today
IDNScode includes 0-9, A-Z, dash and dot. Nothing prevents it from being
extended by the DNS designers.
b) the extended-Unicode set (e-Unicode) includes the current Unicode
version plus the current IDNScode version.
c) the support of natural Internet names is organized through specialized
sub-domains the charter of which defines the e-Unicode supported sub-set,
and the bijective e-Unicode/IDNScode reading/writing function. Their ULD
(upper level domain) will be of the form ".prefix--suffix.tld", where
- prefix indicates the conversion system
- suffix indicates the e-Unicode restriction used.
The prefix will be the "iesg--" prefix for the IDNA system. The null suffix
will default to "IDNScode".
d) registrations will use current NIC management systems with a
transcription using a Punycode routine after a sub-namespace character filter.
e) abbreviated name presentation will be supported to possibly hide/insert
the script sub-domain. This will be ensured by "prefix--suffix" loaded or
embedded plug-ins (on the system) or as OPESes.
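Points c) and d) can be sketched as follows: build the "prefix--suffix"
label, filter the requested name against the charter's character subset, then
transcribe it with Punycode. Everything named here is hypothetical
(script_label, register, and especially FRENCH_SUBSET, whose contents are
only a guess at what a French charter might allow); Python's standard
"punycode" codec stands in for the Punycode routine.

```python
# Hypothetical charter subset for a French script sub-domain (an assumption).
FRENCH_SUBSET = frozenset(
    "abcdefghijklmnopqrstuvwxyz0123456789-àâæçéèêëîïôœùûüÿ"
)


def script_label(prefix: str, suffix: str = "") -> str:
    """Build the 'prefix--suffix' label; an empty suffix gives 'prefix--'."""
    return f"{prefix}--{suffix}"


def register(name: str, prefix: str, suffix: str, tld: str,
             allowed: frozenset = FRENCH_SUBSET) -> str:
    """Filter a name against the charter subset, then transcribe it."""
    lowered = name.lower()
    rejected = sorted(set(lowered) - allowed)
    if rejected:
        raise ValueError(f"characters outside the charter subset: {rejected}")
    if lowered.isascii():
        encoded = lowered  # nothing to transcribe
    else:
        encoded = lowered.encode("punycode").decode("ascii")
    return f"{encoded}.{script_label(prefix, suffix)}.{tld}"


if __name__ == "__main__":
    print(register("élève", "iesg", "fr", "fr"))
```

A name containing a character outside the subset, for instance a Greek
omicron, is rejected by the filter before any transcription takes place.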
Discussion
7. I will use the French name "élève" (pupil) for this discussion, since it
is supported by my keyboard and should probably be by all of yours.
a) support of international registration by AFNIC : http://eleve.fr
b) support of the French registration by AFNIC:
- uppercase scripting : http://ELEVE.FR - existing as http://eleve.fr
- accented lower case : http://élève.iesg--fr.fr
entered also as :
- http://eleve.fr (decision to be taken by AFNIC to comply with
French law)
- http://eleve.iesg--fr.fr (until the IDNScode == Unicode)
c) update of the e-Unicode character set
- one-shot parallel registration of the new scripting equivalent
- management of the possible conflicts
- deregistration of the obsolete e-Unicode scripts once every user
supports the new version.
d) work to be done to support this service
- creation of the "iesg--fr" domain name
- addition of a French character set list in the Punycode C code.
- time to market: 24 hours after the release of the "iesg--"
characters.
e) user implementation
- this is transparent to the current situation
- e-Unicode support can be provided by whoever wants to, and different
natural character sets can be implemented depending on the user keyboard.
f) legal implications
- none
- the e-Unicode scripting is by nature 3+LD names outside of the WIPO
area of influence.
- jurisprudence will probably develop on the brainware mnemonic second
level, but it should be related to the documented good, rather than be a good
by itself. If the good is the domain name, the standard UDRP applies to the
registered DNS entry: nothing has changed. If it applies to any other good,
domain names are no longer even considered as such.
g) extensibility
- there is no addition to the DNS, which can be as freely extended as before
- there is an unlimited extension of the natural character set through
the suffix. There is an unlimited capacity of extension of the processes
through the prefix. There is a quasi unlimited capacity of extension through
other semantics.
h) load
- the load forecast on the DNS imposed by the current IETF proposition may
be tremendous: a TLD may become a new worldwide ".com" with hundreds of
millions of entries, on the whim of a fashion. The sub-domain approach
obviously permits the transparent load of billions of Internet names.
- however the load shifts towards the sub-domain information management,
within the framework of the new loads imposed on the root server system
and on the DNS. This load is to be considered together with the load
imposed by Microsoft's "dynamical" DNS management, portables, DNS lookup
leaks, racing multiple root calls for speed by some resolvers, ENUM
support, security concerns, etc.
Summary.
The whole Recommendation is that the Internet Community agrees that a
domain name including a double dash "--" means a domain name which can be
hidden/created by applications, provided the corresponding set of recursive
processes it describes (defined by its charter) has been applied.
Some examples:
- iesg-- : this domain specializes in IDNScode scripts
- iesg--fr : this domain specializes in "French names".
- fm-- : this domain specializes in telephone follow-me services
- iesg--fm-fr : this domain specializes in telephone follow-me services in
French accented names.
- enum--fr: enum names tel://139500510.enum--.fr
- tel--fr: enum through French names: tel://eleve.tel--fr.fr entered
on my mobile as "élève.fr", which may be transcoded into tel://eleve.fr,
tel://élève.tel--.fr.fr http://eleve.iesg-tel--fr etc. depending on the
services organized by the operator.
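The hide/insert behaviour that the double dash signals to applications could
look like the sketch below. It is only an illustration under simplifying
assumptions: any label containing "--" is treated as a script sub-domain, the
label is always inserted just before the TLD, and the function names are
hypothetical.

```python
def is_script_label(label: str) -> bool:
    """A label carrying a double dash marks a script sub-domain."""
    return "--" in label


def hide_script_labels(domain: str) -> str:
    """Abbreviated presentation: drop every double-dash label."""
    labels = domain.split(".")
    return ".".join(lab for lab in labels if not is_script_label(lab))


def insert_script_label(domain: str, label: str) -> str:
    """Full form: put the script label just before the TLD."""
    head, _, tld = domain.rpartition(".")
    return f"{head}.{label}.{tld}" if head else f"{label}.{tld}"


if __name__ == "__main__":
    full = insert_script_label("eleve.fr", "iesg--fr")
    print(full)                      # eleve.iesg--fr.fr
    print(hide_script_labels(full))  # eleve.fr
```

A plug-in or OPES would apply the first function when showing a name to the
user and the second when turning what the user typed into the name that is
actually looked up.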
I certainly accept that it is rather different from the current wording and
concerns. But I think it is compatible. I hope it may help. I will keep
posting on the implementation ... as I find the funding.
jfc