
[idn] Requirements of Chinese Domain Name



Dear IETF & IDN WG:
 
    Attached is our draft, "Requirements of Chinese Domain Name" (version 00), which defines the requirements for Chinese domain names.
    We hope it will help you understand the problems.
    Thank you for any suggestions and comments!
 
Li Ming Tseng,Ho Jan-Ming,Xiang Deng, Kenny Huang,Erin Chen, 
Xiao-Dong Lee, GuoNian Sun
CNNIC&TWNIC&CDNC
 
Internet Draft                                  Authors: Li Ming Tseng
<draft-ietf-idn-CDNReq-00.txt>                           Ho Jan-Ming
Jan 1, 2002                                              Xiang Deng
Expires in six months                                    Kenny Huang
                                                         Erin Chen
                                                         Xiao-Dong Lee
                                                         GuoNian Sun
 
                Requirements of Chinese Domain Name

Status of this Memo

This document is an Internet-Draft and is in full conformance with all 
provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task 
Force (IETF), its areas, and its working groups. Note that other groups 
may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and 
may be updated, replaced, or obsoleted by other documents at any time. It 
is inappropriate to use Internet-Drafts as reference material or to cite 
them other than as "work in progress."

    The list of current Internet-Drafts can be accessed at
    http://www.ietf.org/ietf/1id-abstracts.txt

    The list of Internet-Draft Shadow Directories can be accessed at
    http://www.ietf.org/shadow.html

1. Premise to be emphasized

All requirements in this memo concern the equivalence matching of 
Traditional and Simplified Chinese Domain Names and delimiter folding. 
What matters in this memo is therefore not the definition of Chinese 
Domain Name but that Internationalized Domain Names SHOULD satisfy these 
requirements. Any Internationalized Domain Name that includes any 
character defined by [1] and Appendix A SHOULD satisfy these 
requirements, no matter what other characters its labels contain. That 
is, any IDN-aware application with IDNA support that is also CDN-aware 
should check whether the input domain name contains characters defined by 
[1] and Appendix A and, if so, SHOULD satisfy the requirements defined in 
this memo.

2. Terminology

The key words "MUST", "SHALL", "REQUIRED", "SHOULD", "RECOMMENDED", and 
"MAY" in this document are to be interpreted as described in RFC 2119[6].

"TC" is an abbreviation for Traditional Chinese.

"SC" is an abbreviation for Simplified Chinese.

"CDN" is an acronym for Chinese Domain Name: an internationalized domain 
name that contains at least one Chinese character. For the scope of 
Chinese characters, refer to ISO/IEC 10646-1:2000(E) [second edition 
2000-09-15] [7]: if a character is marked "C and G-Hanzi-T", it MUST be 
considered a Chinese character. This definition does not imply that the 
character is not also a character of other countries that use Han 
ideographs.

"Equivalent CDN" refers to CDNs that contain at least one character from 
the SC-TC tables [1].

"TC-only CDN" is a CDN in which all characters of all labels are TC 
characters.

"SC-only CDN" is a CDN in which all characters of all labels are SC 
characters.

"Mixed-use TC and SC CDN" is a CDN whose labels, taken together, contain 
at least one Traditional and one Simplified Chinese character.
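
The three categories above can be illustrated with a minimal sketch. The 
table SC_TO_TC and the function classify_cdn are hypothetical names, and 
the three-entry table is only a stand-in for the SC-TC tables of [1]. The 
sketch also assumes that characters outside the table (those with no 
separate TC and SC forms) do not affect the classification, which is a 
simplification of the definitions.

```python
# Hypothetical excerpt of an SC-TC table; a real one would be built from [1].
SC_TO_TC = {"国": "國", "门": "門", "东": "東"}
SC_CHARS = set(SC_TO_TC)            # characters that exist only in SC form
TC_CHARS = set(SC_TO_TC.values())   # characters that exist only in TC form


def classify_cdn(name):
    """Classify a name by the SC-specific and TC-specific characters in it.

    Assumption: characters absent from the table are shared by TC and SC
    and are ignored here.
    """
    has_sc = any(ch in SC_CHARS for ch in name)
    has_tc = any(ch in TC_CHARS for ch in name)
    if has_sc and has_tc:
        return "mixed-use"
    if has_tc:
        return "TC-only"
    if has_sc:
        return "SC-only"
    return "no SC/TC distinction"


print(classify_cdn("中国"))  # SC-only
print(classify_cdn("中國"))  # TC-only
print(classify_cdn("东國"))  # mixed-use
```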

3. Problems
The coexistence of Traditional Chinese and Simplified Chinese is not 
itself a problem; it is a fact of life. If IDN does not deal with this 
fact, it is not a complete solution.

Four main problems are associated with CDN:
[1] TC and SC CDN equivalence matching
SC is derived from TC, and Chinese people use both SC and TC. Chinese 
people therefore consider a TC CDN to be equivalent to its corresponding 
SC form.

[2] Mixed-use TC and SC CDNs cause an exponential problem
To ensure that a CDN resolves correctly in both TC and SC forms, we could 
register every combination of equivalent TC and SC characters. However, 
the number of combinations grows exponentially with the length of a 
label. An ordinary Chinese domain name may have dozens, hundreds, or even 
thousands of TC/SC records. That is unreasonable for users to register 
and difficult for administrators to manage.

[3] Registration and delegation of multiple equivalent CDNs
Without a proper delegation and resolution architecture, a user who 
registers a Chinese domain name may have to obtain many forms of it and 
operate many domains. Lower-level delegated domain name servers may adopt 
an administrative policy that differs from the one adopted at the upper 
level, so the consistency of TC/SC domain names cannot be ensured.

[4] Multiple possible periods (e.g. U+3002, U+2022, U+FF0E)
In Mainland China, a period character other than the ASCII dot is in 
common use. When a user typing a Chinese domain name enters the 
delimiter, he or she will get that period (such as U+3002). With a 
Chinese IME in Taiwan, the user might type, or copy and paste, U+2022 or 
U+FF0E as the delimiter.
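
The exponential growth described in problem [2] can be made concrete with 
a short sketch. The table SC_TO_TC and the function tc_sc_variants are 
hypothetical, and the three-entry table is only a stand-in for the SC-TC 
tables of [1]. The sketch assumes each listed SC character has exactly 
one TC counterpart, so a label with n such characters has 2^n spellings.

```python
from itertools import product

# Hypothetical excerpt of an SC-TC table; a real one would be built from [1].
SC_TO_TC = {"国": "國", "门": "門", "东": "東"}


def tc_sc_variants(label):
    """Return every mixed TC/SC spelling of a label written in SC."""
    # Each character contributes either two choices (SC and TC form)
    # or one choice (no separate TC form in the table).
    choices = [(ch, SC_TO_TC[ch]) if ch in SC_TO_TC else (ch,) for ch in label]
    return ["".join(combo) for combo in product(*choices)]


variants = tc_sc_variants("东国门")  # three characters, each with a TC form
print(len(variants))  # 8, i.e. 2 ** 3
```

Under the same assumption, a ten-character label in which every character 
has a TC counterpart would already need 1024 registrations.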

4. Requirements
4.1 Requirements of Traditional and Simplified Chinese Domain Name
[1] Traditional/Simplified CDN solution MUST be consistent for all CDN 
users, including but not limited to end users and administrators.

[2] The need to do multiple registrations and delegation for an 
equivalent CDN MUST be minimized. There MUST be only one registration for 
equivalent CDN. The delegation(s) for an equivalent CDN MUST be 
consistent.

[3] Equivalent CDN SHOULD be treated as equivalent in IDN comparison.

[4] Applications that support CDN MAY display the equivalent CDN to users 
in the following priority order: user preference first, then the default 
original form, and lastly ACE fallback.

[5] Implementation of IDN that supports CDN MUST preserve the original 
form of CDN.

[6] IDN requirements MUST accommodate CDN user requirements.
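
One way to picture requirements [2], [3], and [5] together is to compare 
names through a folded key while storing the form the registrant 
supplied. This is only an illustrative sketch, not a proposed mechanism: 
the table, the registry dictionary, and the functions 
fold_for_comparison, equivalent_cdn, and register are all hypothetical, 
and the table is a three-entry stand-in for the SC-TC tables of [1].

```python
# Hypothetical excerpt of an SC-TC table; a real one would be built from [1].
SC_TO_TC = {"国": "國", "门": "門", "东": "東"}
TC_TO_SC = {tc: sc for sc, tc in SC_TO_TC.items()}

registry = {}  # folded key -> original form as first registered


def fold_for_comparison(name):
    """Map each TC character to its SC counterpart; others pass through."""
    return "".join(TC_TO_SC.get(ch, ch) for ch in name)


def equivalent_cdn(a, b):
    """Treat two names as equivalent if their folded forms are identical."""
    return fold_for_comparison(a) == fold_for_comparison(b)


def register(name):
    """Keep one registration per equivalence class, preserving the original."""
    return registry.setdefault(fold_for_comparison(name), name)


print(equivalent_cdn("東國門", "东国门"))  # True
print(register("東國門"))                  # 東國門
print(register("东国门"))                  # 東國門 (already registered)
```

Folding is used only as a comparison key here; the stored value remains 
the original form, so it can still be displayed as requirement [4] allows.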

4.2 Requirement of Delimiter Folding
[1] U+3002, U+2022, and U+FF0E MUST be treated as domain name delimiters.
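
Delimiter folding can be sketched as a simple character mapping applied 
before a name is split into labels. The function names fold_delimiters 
and split_labels are hypothetical; the delimiter set is the one listed in 
Appendix A, and folding to the ASCII dot (U+002E) is an assumption of 
this sketch.

```python
# Delimiter set from Appendix A.
DELIMITERS = ("\u3002", "\u2022", "\uff0e")

# Translation table mapping each alternative delimiter to the ASCII dot.
_FOLD_TABLE = {ord(d): "." for d in DELIMITERS}


def fold_delimiters(name):
    """Replace every Appendix A delimiter with the ASCII dot."""
    return name.translate(_FOLD_TABLE)


def split_labels(name):
    """Split a domain name into labels after delimiter folding."""
    return fold_delimiters(name).split(".")


print(split_labels("a\u3002b\u2022c\uff0ed.e"))  # ['a', 'b', 'c', 'd', 'e']
```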

5. Wish List
[1] We wish that every implementation would support CDN if and when there 
is an IDN standard.

[2] We wish to see a quick conclusion to the CDN/IDN standardization 
process.

[3] We wish software to have the capability to support both Traditional 
and Simplified CDN.

6. Authors
Li Ming Tseng, tsenglm@cc.ncu.edu.tw, TWNIC
Ho Jan-Ming, hoho@iis.sinica.edu.tw, TWNIC
Xiang Deng, deng@cnnic.net.cn, CNNIC
Kenny Huang, huangk@sinica.edu.tw, TWNIC
Erin Chen, erin@twnic.net.tw, TWNIC
Xiao-Dong Lee, lee@cnnic.net.cn, CNNIC
GuoNian Sun, sun@cnnic.net.cn, CNNIC

7. Acknowledgement
The original lists of problems, requirements, and wishes are derived from 
the consensus of the 7th JET meeting, held on Nov 19th, 2001 in Beijing. 
Thanks to all participants of the meeting. In addition, the contributions 
of the following people are highly appreciated:

Shian-Shyong Tseng
Wen-Sung Chen
Wenhui Zhang
Wei Mao
Hualin Qian

8. References

[1] A Complete Set of Simplified Chinese Characters, published in 1986 by 
the Committee of National Language and Chinese Character of China.
 
[2] Dictionary of Chinese Character Variants, compiled by Mandarin 
Promotion Council of Taiwan. Version 2 was published in Aug 2001 on Web 
site.
http://140.111.1.40/
 
[3] Paul Hoffman, Marc Blanchet, "Stringprep Profile for 
Internationalized Host Names" September 27, 2001, 
draft-hoffman-stringprep-00.txt
 
[4] Patrik Faltstrom, Paul Hoffman, "Internationalizing Host Names In 
Applications (IDNA)", July 20, 2001, draft-ietf-idn-idna-06.txt
 
[5] The Unicode Consortium, "The Unicode Standard", 
http://www.unicode.org/unicode/standard/standard.html.
 
[6] Scott Bradner, "Key words for use in RFCs to Indicate Requirement 
Levels", March 1997, RFC 2119.
 
[7] ISO/IEC 10646-1:2000(E). International Standard - Information 
technology -- Universal Multiple-Octet Coded Character Set (UCS)

Appendix A. Delimiter Set
U+3002
U+2022
U+FF0E