James Seng wrote:
I already mentioned this in my first posting. There is already a restriction on the number of code points or octets in UTF-8 (or other) encoding. The restriction is currently defined as octet(ToASCII(X)) < 63.
-James Seng
That is true only in protocols predating the IDNA draft. (SEE HERE)
IDN labels can be typed in, displayed, copied and pasted, or exchanged in
UTF-8 (or other) encodings in current and future application or protocol
slots, as described in the IDNA draft itself.
See the enclosed excerpts from the IDNA draft (marked "SEE HERE").
I think some length restriction in code points is needed, rather than in
octets.
IDNA is the right place to put such things.
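To make the difference between the measures concrete, here is a minimal
Python sketch. It assumes that Python's built-in "idna" codec (Nameprep
followed by ToASCII, IDNA 2003 style) is an acceptable stand-in for the
ToASCII operation of the draft. label_lengths is a hypothetical helper
name, not something taken from the draft.

```python
def label_lengths(label: str) -> dict:
    """Measure one label in code points, UTF-8 octets and ACE octets."""
    # Assumption: the stdlib "idna" codec stands in for ToASCII.
    # It raises UnicodeError when the result is empty or longer than 63 octets.
    ace = label.encode("idna")
    return {
        "code_points": len(label),                  # what the user types and sees
        "utf8_octets": len(label.encode("utf-8")),  # what a UTF-8 protocol slot carries
        "ace_octets": len(ace),                     # what the DNS limit is applied to
    }


if __name__ == "__main__":
    # "b\u00fccher" is the label "bücher" written with an escape for U+00FC.
    print(label_lengths("b\u00fccher"))
```

For this label, 6 code points become 7 UTF-8 octets and 13 ACE octets, so
a limit stated in one unit does not translate directly into the others.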
Soobok Lee
6.3 DNS servers
Domain names stored in zones follow the rules for "stored strings" from
[STRINGPREP].
For internationalized labels that cannot be represented directly in
ASCII, DNS servers MUST use the ACE form produced by the ToASCII
operation. All IDNs served by DNS servers MUST contain only ASCII
characters.
If a signaling system which makes negotiation possible between old and
new DNS clients and servers is standardized in the future, the encoding
of the query in the DNS protocol itself can be changed from ACE to
something else, such as UTF-8. The question whether or not this should
be used is, however, a separate problem and is not discussed in this
memo. (SEE HERE)
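The "MUST contain only ASCII characters" rule in 6.3 can be checked
mechanically. The following is a rough Python sketch under stated
assumptions: is_servable_name is a hypothetical helper, the 63-octet cap
per label is the usual DNS limit, and no other zone syntax rules are
checked.

```python
MAX_LABEL_OCTETS = 63  # usual DNS limit for a single label


def is_servable_name(name: str) -> bool:
    """True if every label is ASCII-only and 1..63 octets long."""
    # Ignore one trailing dot (the root label) if present.
    if name.endswith("."):
        name = name[:-1]
    labels = name.split(".")
    return all(
        label.isascii() and 0 < len(label) <= MAX_LABEL_OCTETS
        for label in labels
    )


if __name__ == "__main__":
    print(is_servable_name("xn--bcher-kva.example"))  # ACE form
    print(is_servable_name("b\u00fccher.example"))    # raw non-ASCII label
```

The check works only on the ACE form, so it says nothing about how long
the label is in code points or in UTF-8.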
6.1 Entry and display in applications
(snip)
In protocols and document formats that define how to handle
specification or negotiation of charsets, labels can be encoded in any
charset allowed by the protocol or document format. If a protocol or
document format only allows one charset, the labels MUST be given in
that charset.
In any place where a protocol or document format allows transmission of
the characters in internationalized labels, internationalized labels
SHOULD be transmitted using whatever character encoding and escape
mechanism that the protocol or document format uses at that place. (SEE HERE)
All protocols that use domain name slots already have the capacity for
handling domain names in the ASCII charset. Thus, ACE labels
(internationalized labels that have been processed with the ToASCII
operation) can inherently be handled by those protocols.
6. Implications for typical applications using DNS
In IDNA, applications perform the processing needed to input
internationalized domain names from users, display internationalized
domain names to users, and process the inputs and outputs from DNS and
other protocols that carry domain names.
The components and interfaces between them can be represented
pictorially as:
                 +------+
                 | User |
                 +------+
                    ^
                    | Input and display: local interface methods
                    | (pen, keyboard, glowing phosphorus, ...)
+-------------------|-------------------------------+
|                   v                               |
|          +-----------------------------+          |
|          |         Application         |          |
|          |   (ToASCII and ToUnicode    |          |
|          |      operations may be      |          |
|          |        called here)         |          |
|          +-----------------------------+          |
|                   ^        ^                      | End system
|                   |        |                      |
| Call to resolver: |        | Application-specific |
|               ACE |        | protocol:            |
|                   v        | ACE unless the       |
|           +----------+     | protocol is updated  |
|           | Resolver |     | to handle other      |
|           +----------+     | encodings            | (SEE HERE)
|                 ^          |                      |
+-----------------|----------|----------------------+
    DNS protocol: |          |
              ACE |          |
                  v          v
      +-------------+    +---------------------+
      | DNS servers |    | Application servers |
      +-------------+    +---------------------+
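
The application box in the figure can be sketched in Python as follows.
This is only an illustration: it again assumes the built-in "idna" codec
as a stand-in for ToASCII and ToUnicode, and to_resolver and to_display
are hypothetical names for the two directions shown in the figure.

```python
def to_resolver(user_input: str) -> str:
    """User input -> ACE form handed to the resolver (ToASCII per label)."""
    # Assumption: the stdlib "idna" codec stands in for ToASCII.
    return user_input.encode("idna").decode("ascii")


def to_display(ace_name: str) -> str:
    """ACE form from the DNS or another protocol -> form shown to the user."""
    # Assumption: the stdlib "idna" codec stands in for ToUnicode.
    return ace_name.encode("ascii").decode("idna")


if __name__ == "__main__":
    ace = to_resolver("b\u00fccher.example")
    print(ace)
    print(to_display(ace))
```

An application-specific protocol that carries UTF-8 would skip
to_resolver for its own slot, and that is where an octet limit on the ACE
form is no longer applied automatically.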