RE: [idn] homograph attacks
At 02:35 17/02/2005, Michel Suignard wrote:
Martin,
I don't think we are so far off. My concern is that many people are
abusing the term 'language' for these tables.
Dear Michel,
you are absolutely right. The problem is the content of the tag and the term.
- IANA TLD Tables are described by language/script/ccTLD from ISO 3166 alpha-2
- languages are defined from ISO 639
- RFC 3066 appointed the ietf-languages@alvestrand.no advisory mailing list
Its members have introduced a "langtag" which also concatenates the same
content: language (ISO 639) - script (ISO 15924) - country (ISO 3166,
therefore missing some ccTLDs). The idea is to document the language
actually being used with some more accuracy.
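To make the concatenation concrete, here is a minimal sketch that splits
such a tag into its three parts. It only checks the shape of each part
(2-3 letters, 4 letters, 2 letters) and does not look anything up in the
ISO registries; the names LangTag and parse_langtag are made up for this
illustration and come from no draft or library.

```python
import re
from typing import NamedTuple, Optional


class LangTag(NamedTuple):
    """Hypothetical container for a language-script-country tag."""
    language: str            # shaped like an ISO 639 code, e.g. "sr"
    script: Optional[str]    # shaped like an ISO 15924 code, e.g. "Latn"
    country: Optional[str]   # shaped like an ISO 3166 alpha-2 code, e.g. "RS"


# Shape check only: language, then optional script, then optional country.
_TAG = re.compile(r"^([A-Za-z]{2,3})(?:-([A-Za-z]{4}))?(?:-([A-Za-z]{2}))?$")


def parse_langtag(tag: str) -> LangTag:
    """Split a tag such as 'sr-Latn-RS' and normalize the case of each part."""
    match = _TAG.match(tag)
    if match is None:
        raise ValueError(f"not a language-script-country tag: {tag!r}")
    language, script, country = match.groups()
    return LangTag(
        language.lower(),
        script.title() if script else None,
        country.upper() if country else None,
    )
```

For instance parse_langtag("sr-latn-rs") yields language "sr", script
"Latn" and country "RS", while parse_langtag("de") leaves script and
country empty.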
The term that ISO 639 uses corresponds to "langues" in French, the proposed
langtag to "langage" in French, and what they really need to permit
automated interintelligibility would be "parler" in French. This might lead
to complex and unnecessary debates. This is why we have adopted the
following wording:
- langtag for a tag including lingually oriented information. We do not
speak of language but of tag, whatever the use.
- lang3tag is internationalization with language/script/localization
(localization should be far more complete than just ISO 3166)
- lang4tag adds the authoritative reference for the used form of that
language (multilingualization)
- lang5tag adds the style to document the vernacular environment of usage.
I am just saying that creating an exclusive subset of Latin characters in
a European context is not necessarily a bad idea but will result in future
problems because they will always discover that a few characters are
missing from the subset.
It is reasonably easy for .de to establish a table as they did and again
it is ok. It is much more challenging for a worldwide TLD such as .com to
establish registration rules. Typically script is a much better selector
than language to establish those tables and associated rules.
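As a rough illustration of using script as the selector, the sketch below
lists the scripts found in a label and so flags mixed-script labels, the
usual shape of a homograph. It is an approximation under stated
assumptions: Python's unicodedata module does not expose the Unicode
Script property, so the first word of each character name ("LATIN",
"CYRILLIC", ...) stands in for it, and digits and the hyphen are treated
as common to all scripts. The function name is hypothetical.

```python
import unicodedata


def scripts_in_label(label: str) -> set:
    """Return the approximate set of scripts used in a domain label.

    Assumption: the first word of the Unicode character name is used as a
    stand-in for the script, which is good enough for a demonstration but
    is not the real Unicode Script property.
    """
    scripts = set()
    for ch in label:
        if ch.isdigit() or ch == "-":
            continue  # treated as common to all scripts
        name = unicodedata.name(ch, "UNKNOWN")
        scripts.add(name.split()[0])
    return scripts
```

A pure Latin label such as "paypal" gives a single script, whereas the
same label with U+0430 (Cyrillic a) in second position gives both LATIN
and CYRILLIC, which a script-based table would reject.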
This has been extensively discussed by the authors of the proposed "RFC
3066 bis" Draft. I have disagreed with them, if only on small but key
details, enough to be credible, I think, when I say that I fully agree
with their need for ISO 639 to define the language (7260 in the current
revision), but also for the script (you are right that this is necessary,
but it is not enough, for many reasons you described in part) and for
localization, which may change some accepted usages. They identified
political borders as the leading descriptor here, due to legal
obligations, common education, etc.
Scripts defined by ISO are also disputed. It is better to have an
authoritative decision of the TLD Manager as a part of his namespace
rules. After all, what we want is "normality", so there are cultural,
legal and IPR related matters to consider. I understand that M$ has
decided to wait and see the way the market reacts. This time I approve
of M$.
jfc