alpha v0.2
- To: idn@ops.ietf.org
- Subject: alpha v0.2
- From: James Seng <jseng@pobox.org.sg>
- Date: Sat, 22 Jan 2000 23:57:09 +0800
- Delivery-date: Sat, 22 Jan 2000 08:04:05 -0800
- Envelope-to: idn-data@psg.com
Take two! I added new comments, took out some, edited others and added a few
other clauses. Comments please.
A few points to discuss that I realised are missing:
1. Localization needs. For example, some queries were raised about
writing <COUNTRY>.<2LD>.<DOMAIN>.<HOST> in the reverse order, which makes
more sense for some languages and country conventions.
(Oh boy, this is going to get flamed!)
2. Localization needs again. Does a double-width period count as a domain
delimiter like a single-width period? On a similar note, this is
related to double-width alphanumerics vs single-width alphanumerics.
3. Localization needs once more. How do we handle right-to-left writing
order, as in Arabic? One option is to treat this as a non-issue because,
for example, MS Windows CP1256, which defines Arabic, actually encodes
the domain name in the usual left-to-right byte order and leaves the
reversal to the renderer. On the other hand, this may not apply on
some other systems such as Mac or Unix.
-James Seng
Requirements of Internationalized Domain Names
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
To view the entire list of Internet-Draft Shadow Directories, see
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on DD MMMM YYYY.
Copyright Notice
Copyright (C) The Internet Society (2000). All Rights Reserved.
Abstract
This informational document describes the requirements for encoding
international characters into DNS names and records. This document
should be considered guidance for developing solutions for
internationalised domain names.
This document is being discussed on the "idn" mailing list. To join
the list, send a message to <majordomo@ops.ietf.org> with the words
"subscribe idn" in the body of the message. Archives of the mailing
list can also be found at ftp://ops.ietf.org/pub/lists/idn*.
Conventions
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
"I18C" is often used in this document to refer to internationalized
characters, that is, characters outside US-ASCII.
Characters mentioned in this document are identified by their
position in the Unicode character set. The notation U+12AB, for
example, indicates the character at position 12AB (hexadecimal)
in the Unicode character set. However, this is not an indication
of any requirement to use Unicode.
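As an illustration only, the U+ notation can be produced and parsed
mechanically. The sketch below uses Python's built-in ord() and chr();
the helper names u_plus and from_u_plus are hypothetical and are not
defined by this document.

```python
import unicodedata


def u_plus(ch: str) -> str:
    """Format one character in the U+XXXX notation (at least four hex digits)."""
    return f"U+{ord(ch):04X}"


def from_u_plus(notation: str) -> str:
    """Parse a string such as 'U+00E5' back into the character it identifies."""
    if not notation.upper().startswith("U+"):
        raise ValueError(f"not in U+ notation: {notation!r}")
    return chr(int(notation[2:], 16))


if __name__ == "__main__":
    print(u_plus("A"))                               # U+0041
    print(unicodedata.name(from_u_plus("U+00E5")))   # LATIN SMALL LETTER A WITH RING ABOVE
```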
"IDN" is used in this document as an abbreviation for
"internationalized domain name". This is defined as a domain name
that contains one or more characters that are outside the set of
characters specified as legal characters for domain names in
[RFC1034] Section 3.5.
"RR" is used in this document as an abbreviation for "Resource
Record" as defined in [RFC1035].
A master server for a zone holds the main copy of that zone. This
copy is sometimes stored in a zone file. A slave server for a zone
holds a complete copy of the records within that zone. A caching
server holds temporary copies of DNS records; it can use these
records to answer identical queries, but only within the time-to-live
(TTL) of the records. Further explanation of master/slave servers
can be found in [RFC1034] and [RFC1996].
Table of Contents
1. Introduction
...
...
1. Introduction
This informational document describes the requirements for encoding
international characters into DNS names and records. This document
should be considered guidance for developing solutions for
internationalised domain names (IDN).
...
Examples quoted in this document are intended only to illustrate the
meanings and principles adopted by the document. Solutions are not
required to satisfy the examples.
2. General Requirements
2.1 Compatibility and Interoperability
The DNS is essential to the entire Internet. Therefore, IDN must not
damage present DNS interoperability. It must make the minimum number of
changes to existing protocols on all layers of the stack. It must
continue to allow any system anywhere to resolve any domain name.
Implementations of IDN must preserve the basic concepts and facilities
of domain names as described in [RFC1034]. They must maintain a single,
global, universal and consistent hierarchical namespace.
The same name resolution request should generate the same response,
regardless of the location (or localisation settings) of the resolver,
the master server and any slave or caching servers involved.
IDN should also allow a caching server to be built which does not
understand the charset in which a request (or response) is encoded, and
which works as well for IDNs as in the ASCII-only case. The caching
server is considered to perform correctly if it gives essentially the
same answer as the master server would have given if presented with the
same request (of course without the authoritative bit).
If the IDN implementation specifies a canonicalisation algorithm then
a caching server should perform correctly regardless of how much (or
how little) of that algorithm it has implemented.
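One possible reading of the caching requirements above is that a cache
can treat names as opaque octet strings. The sketch below is not a
design from this document: the class name OpaqueCache, the injected
clock and the (qname, qtype) key are assumptions made for illustration.
It folds only the ASCII octets A-Z, as existing DNS matching does, and
never interprets any other octet.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class OpaqueCache:
    """Cache keyed on raw name octets; it never decodes the charset (illustrative)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[Tuple[bytes, int], Tuple[float, bytes]] = {}

    @staticmethod
    def _key(qname: bytes, qtype: int) -> Tuple[bytes, int]:
        # Fold only octets 0x41-0x5A (A-Z); every other octet is left untouched.
        folded = bytes(b + 0x20 if 0x41 <= b <= 0x5A else b for b in qname)
        return (folded, qtype)

    def put(self, qname: bytes, qtype: int, rdata: bytes, ttl: float) -> None:
        """Store an answer together with its expiry time."""
        self._store[self._key(qname, qtype)] = (self._clock() + ttl, rdata)

    def get(self, qname: bytes, qtype: int) -> Optional[bytes]:
        """Return the cached answer, or None if absent or past its TTL."""
        key = self._key(qname, qtype)
        entry = self._store.get(key)
        if entry is None:
            return None
        expires, rdata = entry
        if self._clock() >= expires:
            del self._store[key]
            return None
        return rdata
```

Because the key is built without decoding, such a cache behaves the
same whether or not it implements any of a canonicalisation algorithm;
whether that is sufficient depends on where canonicalisation is
required to happen.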
Implementors may propose modifications to the DNS protocol [RFC1035] and
other related work undertaken by the [DNSEXT] WG. However, these changes
should be as minimal as possible and must be approved by the DNSEXT
WG.
The best solution is one that maintains complete compatibility with
current DNS standards as long as it meets the other requirements in
this document.
[JS: ?? i am not sure if there can be such solution!! :P]
"There can be no "flag days" nor a split DNS."
[JS: I do not understand what this means. I think we need to elaborate
this further]
2.2 Internationalization (I18N)
Internationalized characters (I18C) must be allowed to be represented
and used in DNS names and records. Implementations must specify which
character set is used and how these characters are encoded in
domain names and DNS records.
This document does not recommend any character set for I18N. However,
non-standard character sets must not be used, so as to avoid duplicating
work on general I18N. If multiple character sets are used, then the
implementation must specify all the character sets being used and for
what purpose.
IDN should not make any assumptions about where in a domain name
internationalized characters might appear. In other words, it should not
differentiate between the parts of a domain name, as doing so may impose
restrictions on future I18N efforts.
IDN should also not impose any cultural restrictions in the protocol.
For example, an IDN implementation which only allows a domain name to
use a single script would immediately restrict multinational
organisations.
IDN must be able to handle the localized requirements of different
languages. For example, IDN must be able to handle the right-to-left
writing order of Arabic.
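As a hedged illustration of the right-to-left case: Unicode assigns each
character a bidirectional class, so a label can be kept in logical
(typing) order while the direction is derived for display. The sketch
below assumes Unicode is the character set, which this document does not
require, and the helper name label_direction is hypothetical. It uses
Python's unicodedata.bidirectional().

```python
import unicodedata

# Strong right-to-left classes in the Unicode bidirectional algorithm.
RTL_CLASSES = {"R", "AL"}


def label_direction(label: str) -> str:
    """Return 'rtl' or 'ltr' according to the first strongly directional character."""
    for ch in label:
        bidi = unicodedata.bidirectional(ch)
        if bidi in RTL_CLASSES:
            return "rtl"
        if bidi == "L":
            return "ltr"
    return "ltr"  # no strong character found; default to left-to-right


if __name__ == "__main__":
    arabic = "\u0645\u0635\u0631"  # stored in logical order: MEEM, SAD, REH
    print([f"U+{ord(c):04X}" for c in arabic])
    print(label_direction(arabic), label_direction("example"))
```

The stored sequence is the same on every system; only the rendering
step reverses it, which is one way of treating writing order as a
display concern rather than a protocol one.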
In addition, IDN must
1. provide a record which can contain internationalised text
(similar to TXT RR). [1 request to remove this as it is not IDN]
--- need comments ---
Must allow I18C in DNS queries.
Must allow I18C in DNS RR response.
Must allow I18C in DNS TXT records.
Must allow I18C in DNS CNAME records.
Must allow I18C in DNS PTR records.
---
2.3 Canonicalization
Matching rules are the most complicated part of the I18N of domain
names. Canonicalization of characters must follow precise and
predictable rules to ensure consistency.
In order to retain backward compatibility, the implementation must
(should?) retain the case-insensitive comparison for US-ASCII
according to [RFC1035] Section 2.3.3.
For example, Latin capital letter A (U+0041) must match Latin small
letter A (U+0061).
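A minimal sketch of the US-ASCII rule, assuming labels are compared as
octet strings: only the octets for A-Z are folded and all other octets
are compared exactly. The function names ascii_lower and labels_match
are hypothetical.

```python
def ascii_lower(label: bytes) -> bytes:
    """Lower-case only the octets for A-Z (0x41-0x5A); leave all others untouched."""
    return bytes(b + 0x20 if 0x41 <= b <= 0x5A else b for b in label)


def labels_match(a: bytes, b: bytes) -> bool:
    """Compare two labels case-insensitively for US-ASCII letters only."""
    return ascii_lower(a) == ascii_lower(b)


if __name__ == "__main__":
    print(labels_match(b"A", b"a"))              # True
    print(labels_match(b"Example", b"eXAMPLE"))  # True
    print(labels_match(b"\xc5", b"\xe5"))        # False: non-ASCII octets are not folded
```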
If other canonicalization is done, then it
1. must be done before the domain name is resolved.
2. must be easily upgradable as new languages and writing systems
are added.
[CHARREQ] is recommended as a guide on canonicalisation.
Any conversion (case, ligature folding, punctuation folding, ...) from
what the user enters into a client to what the client asks for
resolution MUST be done identically on all requests.
"Thus, it must be specified in the protocol, not in the requirements
document. The requirements document might list the kinds of conversions
we might expect, but should not mandate where the conversions happen."
[JS: In this case, can I remove the rest of the section? Personally,
I think this is one of the most troublesome parts of IDN and if this
is not cleared up now in the requirements, we might get into a lot of
arguments in the future.]
Case folding should also be used.
For example, Latin capital letter A with ring above (U+00C5) should match
Latin small letter A with ring above (U+00E5).
[JS: Does this open a can of worms for context-sensitive folding? What
about CJK? How is the folding to be done?]
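To make the folding question concrete, here is a sketch that uses
Unicode default case folding as exposed by Python's str.casefold(). This
is only one candidate rule, chosen for illustration; this document does
not select a folding algorithm, and the name folded_match is
hypothetical.

```python
def folded_match(a: str, b: str) -> bool:
    """Compare two names after Unicode default (locale-independent) case folding."""
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    # U+00C5 and U+00E5 fold to the same character.
    print(folded_match("\u00C5", "\u00E5"))          # True
    # Folding can change length: U+00DF (sharp s) folds to "ss".
    print(folded_match("stra\u00DFe", "STRASSE"))    # True
    # CJK ideographs have no case, so folding leaves them unchanged.
    print("\u4E2D".casefold() == "\u4E2D")           # True
```

Default folding is context-free and ignores locale, so cases such as the
Turkish dotted and dotless i are not handled the way a Turkish user
might expect; that is the kind of issue the comment above raises.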
On the other hand, similar glyphs assigned different code points in a
character set should be treated as different characters.
For example, Cyrillic capital letter A (U+0410) should not match Latin
capital letter A (U+0041).
For example, Greek capital letter omicron (U+039F) should not match
Latin capital letter O (U+004F).
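The look-alike examples can be checked mechanically. The sketch below
assumes Unicode code points and compares characters after case folding
only, with no glyph-similarity mapping; the helper names describe and
same_character are hypothetical.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def same_character(a: str, b: str) -> bool:
    """Compare by code point after case folding; look-alike glyphs are not merged."""
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    for a, b in [("\u0410", "\u0041"), ("\u039F", "\u004F")]:
        print(describe(a), "|", describe(b), "| match:", same_character(a, b))
```

Both pairs report no match, since each character has its own code point
even though the glyphs are usually drawn alike.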
2.4 Operational Issues
Zone files should remain easily editable.
The character set of a signed zone file should be capable of being the
same as the character set of the unsigned zone file.
An IDN-capable resolver or server should not generate any more traffic
than a non-IDN-capable resolver or server.
IDN should add no new centralized administration for the DNS. A domain
administrator should be able to create internationalized names as
easily as adding current domain names.
IDN must allow offline DNSSEC signing. It should also be possible to look
at the signed file and see that it is the same as the unsigned one.
2.5 Others
The DNS protocol should remain deterministic. No DNS element (resolver,
server or zone file) should be required to do guesswork.
[JS: One request to remove this.]
3. Specific Requirements
3.1 Client Requirements
3.2 Server Requirements
3.3 Zone file Requirements
4. Technical Analysis
There are many standard protocols and RFCs which depend on the
domain name and have made various assumptions about its character set.
Therefore, any implementation must contain a summary of the
compatibility issues and security considerations with, but not limited
to, the following protocols:
<...list the sets of RFCs which we would like to have a summary...>
In addition, the implementation document must contain a summary of the
technical opinion of the working group.
5. Security Considerations
Any solution that meets the requirements in this document must not
be less secure than the current DNS. Specifically, the mapping of
internationalized host names to and from IP addresses must have the
same characteristics as the mapping of today's host names.
Specifying requirements for internationalized domain names does not
itself raise any new security issues. However, any change to the DNS
may affect the security of any protocol that relies on the DNS or on
DNS names. A thorough evaluation of those protocols for security
concerns will be needed when they are developed.
References
[RFC2119] "Key words for use in RFCs to Indicate Requirement
Levels", rfc2119.txt, March 1997, S. Bradner.
[RFC1034] "Domain Names - Concepts and Facilities", rfc1034.txt,
November 1987, P. Mockapetris.
[RFC1035] "Domain Names - Implementation and Specification",
rfc1035.txt, November 1987, P. Mockapetris.
[RFC1996] "A Mechanism for Prompt Notification of Zone Changes
(DNS NOTIFY)", rfc1996.txt, August 1996, P. Vixie.
[CHARREQ] "Requirements for String Identity Matching and String
Indexing", http://www.w3.org/TR/WD-charreq, July 1998,
World Wide Web Consortium.
[DNSEXT] "IETF DNS Extensions Working Group",
namedroppers@internic.net, Olafur Gudmundsson, Randy Bush.
Author's Address
Appendix A. Acknowledgements
The editor gratefully acknowledges the contributions of:
Harald Tveit Alvestrand <Harald@Alvestrand.no>
Martin Duerst <duerst@w3.org>
Patrik Faltstrom <paf@swip.net>
Andrew Draper <ADRAPER@altera.com>
Bill Manning <bmanning@ISI.EDU>
Paul Hoffman <phoffman@imc.org>
James Seng <jseng@pobox.org.sg>
Randy Bush <randy@psg.com>
Alan Barrett <apb@cequrux.com>
as authors of corresponding sections and the contributions of:
for their useful comments.
Full Copyright Statement
Copyright (C) The Internet Society (2000). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph
are included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Acknowledgement
Funding for the RFC editor function is currently provided by the
Internet Society.