IETF July 2000 Proceedings


2.2.7 Internationalized Domain Name (idn)

NOTE: This charter is a snapshot of the 48th IETF Meeting in Pittsburgh, Pennsylvania. It may now be out-of-date. Last Modified: 17-Jul-00

Chair(s):

James Seng <jseng@pobox.org.sg>
Marc Blanchet <Marc.Blanchet@viagenie.qc.ca>

Internet Area Director(s):

Thomas Narten <narten@raleigh.ibm.com>
Erik Nordmark <nordmark@eng.sun.com>

Internet Area Advisor:

Erik Nordmark <nordmark@eng.sun.com>

Technical Advisor(s):

John Klensin <klensin@jck.com>
Harald Alvestrand <Harald.Alvestrand@maxware.no>

Mailing Lists:

General Discussion: idn@ops.ietf.org
To Subscribe: idn-request@ops.ietf.org
Archive: ftp://ops.ietf.org/pub/lists/idn*

Description of Working Group:

The goal of the group is to specify the requirements for internationalized access to domain names and to specify a standards track protocol based on the requirements.

The scope of the group is to investigate the possible means of doing this and what methods are feasible given the technical impact they will have on the use of such names by humans as well as application programs, as well as the impact on other users and administrators of the domain name system.

A fundamental requirement in this work is to not disturb the current use and operation of the domain name system, and for the DNS to continue to allow any system anywhere to resolve any domain name.

The group will not address the question of what, if any, body should administer or control usage of names that use this functionality.

The group must identify consequences to the current deployed DNS infrastructure, the protocols and the applications as well as transition scenarios, where applicable.

The WG will actively ensure good communication with interested groups who are studying the problem of internationalized access to domain names.

The Action Item(s) for the Working Group are

1. An Informational RFC specifying the requirements for providing Internationalized access to domain names. The document should provide guidance for developing solutions to this problem, taking localized (e.g. writing order) and related operational issues into consideration.

2. An Informational RFC or RFCs documenting the various proposals and implementations of Internationalization (i18n) of Domain Names. The document(s) should also provide a technical evaluation of the proposals by the Working Group.

3. A standards track specification on access to internationalized domain names including specifying any transition issues.

Goals and Milestones:

Feb 00    First draft of the requirements document
Mar 00    Presentation and discussion at IETF-Adelaide
May 00    Second version of the requirement document
Jul 00    Final discussion on the requirement document
Aug 00    Req document wg last call
Sep 00    First draft of comparison document
Dec 00    Final discussion of comparison document
Dec 00    Protocol RFC first draft
Jan 01    Comparison document wg last call
Mar 01    Protocol RFC second draft
Mar 01    Transition RFC first draft
Jun 01    Protocol RFC wg last call
Jun 01    Transition RFC second draft
Sep 01    Transition RFC wg last call

Internet-Drafts:

· Requirements of Internationalized Domain Names

· Comparison of Internationalized Domain Name Proposals

· RACE: Row-based ASCII Compatible Encoding for IDN

· Internationalized domain names using EDNS (IDNE)

· Preparation of Internationalized Host Names

· Using the Universal Character Set in the Domain Name System (UDNS)

· Architecture of Internationalized Domain Name System

No Request For Comments

Current Meeting Report

Internationalized domain names (idn)
wg meeting notes
IETF Pittsburgh, Aug 2000

Notes taken by David Conrad. Thanks, David.

Agenda bashing -- no changes.

1. Marc Blanchet: WG update
1.1 new rev of charter since last meeting. major changes:

- specifying standards track protocol based on requirements
- fundamental requirement to not disturb existing dns
- WG must identify consequences of resulting protocol
- WG needs to ensure good communication with interested groups
- goals and milestones have been modified
- finishing early would be nice

1.2 New working group web site: http://www.i-d-n.net

- complement of main IETF web site
- official IDN-WG site
- managed by Marc Blanchet

1.3 RFC 2026 reiteration

- per POISED, contribution means: presentation, email, internet-draft, comment, etc.

2. Requirements Draft (James Seng for Zita Wenzel)

James, as WG co-chair, is no longer editing the requirements draft.

No presentation, will go through the ID and highlight the important bits

Version 3 removes many of the requirements of version 2 which was felt to have too many (35). Likely no proposal could meet all requirements in v2. We spent 3 months going through the requirements to see what could be removed, what would be nice, etc.

We have come to a consensus that we should use Unicode as the base character set. Any proposal which uses localized encoding will not meet IDN requirements.

New section to clarify difference between hostnames and domain names.

Graphic representation of DNS architecture/infrastructure from Harald included. Focus our energy on the big box in the diagram (forwarding, caching, parent-zone, and root server). Will consider the other boxes, but not the focus.

KM: most important parts aren't in the picture. If you concentrate on wire protocol and don't consider users then the effort will fail. Must consider wider picture. Thorny issues lie in non-protocol interactions

JK: Computers don't care. WG is important due to the interaction of people with computers.

JS: We won't ignore the other aspects, but must remain focused on what must be done, not on what is outside of WG scope. IAB has an RFC on internationalization that addresses things the WG should consider. If we can't solve the basics, then we can't go on to the next steps.

DC: There is a standard that we make that isn't over the wire. In constrained circumstances -- business card model -- we must deal with non-protocol stuff. What can go on business cards will affect what we're doing.

Requirements:

- IDN must not break existing DNS. Minimize changes.
- must preserve basic concepts and facilities, must maintain single, global, universal, and consistent hierarchical namespace.
- new addition: no restriction on Unicode codepoint in wire format, but restrictions can be imposed elsewhere (registration, etc.).
- domain names must resolve correctly.
- document recommends Unicode only. If multiple character sets are allowed, each charset must be tagged and conform to RFC 2277. We don't want to try and invent a new Unicode system.
- canonicalization must be done for internationalized domain names. What normalization form should be adopted? (C, D, KC, KD, new form KR?). Where should normalization be done (server or client)? Canonicalization/normalization rules should be locale independent.
- Zone files should remain easily editable.
- Protocol must work with DNSSEC.
- Protocol must work with v4 and v6 and all features of the DNS.

AB(?): Which Unicode version? (3.0), bidirectionality? (yes)

KM: Fundamental assumption appears to be to change the DNS -- I don't see that as appropriate for a requirements document. The interactions you care about are app to app, app to user, and user to app -- none affect the DNS. Stuff that happens at higher layers is much more important than what happens on the wire.

JS: Does the reqs doc give the impression that the DNS is to be changed?

KM: There are implications, yes. The focus is on the DNS protocol but the problem is higher up.

MB: Will your concerns be sent to the mailing list?

KM: Yes

JS: Didn't mean to give the impression that the DNS was going to change.

HA: when thinking of the DNS as a set of services, if we are to keep sane, then we should think of internationalized equivalents as new services that are to be made available, not as changes to existing services. Mapping of name to address should have two services -- map as we know it and map as the future may require. We shouldn't expect to convert applications by switching lower layers. The new services might not work exactly the same way as the existing services.

MB: will you write a draft about this idea?

JK: you can assume a draft will appear

KM: I agree with Harald. I believe there is a whole set of missing requirements for incremental deployment. You have to have the least possible disruption. Changes must be independent of each other.

JS: see requirement 10.

JI: this is a problem we should not be solving. What problem are we trying to solve?

JS: this should go to the mailing list.

3. RACE (Paul Hoffman)

draft-ietf-idn-race

How to do an ascii compatible representation of internationalized characters.
This proposal does not specify how it is to be used.

Fully compatible with today's DNS.

3 step process:

- compress input text
- convert compressed string to base32
- mark with a prefix (currently 'ra--')

Prefix will change.
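
The three steps above can be sketched in Python. This is a simplified illustration under stated assumptions, not the algorithm from draft-ietf-idn-race: it handles only the single-row case plus an uncompressed fallback, and the marker octet 0xD8, the lowercase unpadded base32 output, and the function name race_like_encode are all choices made for this example.

```python
import base64

PREFIX = "ra--"  # prefix quoted in the notes; the notes say it will change


def race_like_encode(label: str) -> str:
    """Simplified sketch: compress, base32-encode, and prefix one label."""
    utf16 = label.encode("utf-16-be")
    highs = set(utf16[0::2])   # "row" = high octet of each UTF-16 code unit
    lows = utf16[1::2]         # low octet of each code unit
    if len(highs) == 1:
        # Single row: one row octet followed by the low octets.
        compressed = bytes(highs) + lows
    else:
        # Fallback (assumed marker 0xD8): no compression, full UTF-16.
        compressed = b"\xd8" + utf16
    b32 = base64.b32encode(compressed).decode("ascii").rstrip("=").lower()
    encoded = PREFIX + b32
    if len(encoded) > 63:
        raise ValueError("encoded label exceeds 63 octets")
    return encoded
```

A three-letter Greek label falls in one row, so it compresses to four octets before base32 encoding, while a label mixing Latin and Greek takes the uncompressed path.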

Each name part must be at most 63 octets to conform with the existing DNS. RACE favors names that are all in one row.

Can get up to 35 characters if single row.

Can get up to 17 characters if two or more rows and one of the rows is non-zero.

Can get 17 to 33 characters if using two or more but also using row 0.
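
The 35 and 17 character limits can be reproduced with a little arithmetic. The sketch below assumes a 4-character prefix, 5 bits per base32 character, one row octet in the single-row case, and a one-octet marker followed by two octets per character in the uncompressed case. These are assumptions about how the figures were derived; the 17 to 33 range for mixes with row 0 depends on details of the draft that are not modeled here.

```python
LABEL_MAX = 63      # octets allowed in one DNS label
PREFIX_LEN = 4      # len("ra--")
BITS_PER_B32 = 5    # each base32 character carries 5 bits

# Whole octets that fit in the base32 part of the label.
payload_octets = (LABEL_MAX - PREFIX_LEN) * BITS_PER_B32 // 8

# Single row: one row octet, then one octet per character.
single_row_chars = payload_octets - 1

# Uncompressed: one marker octet, then two octets per character.
multi_row_chars = (payload_octets - 1) // 2
```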

RACE is an ace format in ace-1 in the comparison document. Includes an identifying mechanism for ace-2 namely ace-2.1.1

HA: have you considered using UTR-6? Yes, you don't get a lot of advantage and UTR-6 does a lot of bit shifting which will be hard to implement.

KM: do you define a canonicalization form? No.

KM: are there multiple outputs? No. Another reason not to use UTR-6.

AG: strange to propose ways of compressing into 63 ascii since the wire format doesn't care -- the 63 limitation is at the resolver. Yes.

AG: restrictions are likely not per label.

JK: applications are likely to make bogus assumptions.

BS: not using ACE on the wire? Yes.

BS: what is ace expecting to receive? it is expecting unicode code points. input to the compression is utf-16.

4. UDNS (Paul Hoffman for Dan Oscarsson)

draft-idn-udns-00.txt

Attempts to be a full protocol specification. How do you flag idn awareness in dns queries so idns can be handed back? If not flagged, you must not give back internationalized names.

How to flag: use the IN bit in the DNS query. Last unused bit in the second word. Arguably safe.
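
As a rough illustration of flagging a query this way, the sketch below sets one bit in the 16-bit flags word of a DNS header. The mask 0x0040 is an assumption (the reserved bit that sits between RA and AD in the flags word); the notes do not give the exact position, and the names IN_BIT, build_header, and is_idn_aware are hypothetical.

```python
import struct

IN_BIT = 0x0040  # assumed position of the "IN" bit; not taken from the draft
RD_BIT = 0x0100  # recursion desired


def build_header(query_id: int, idn_aware: bool) -> bytes:
    """Build a 12-octet DNS query header, optionally flagged as IDN aware."""
    flags = RD_BIT
    if idn_aware:
        flags |= IN_BIT
    # ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    return struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)


def is_idn_aware(header: bytes) -> bool:
    """Server side: check the assumed IN bit before returning IDN data."""
    _, flags = struct.unpack("!HH", header[:4])
    return bool(flags & IN_BIT)
```

A server following the rule above would return internationalized names only when is_idn_aware is true for the incoming header.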

Proposes UCS normalization form C encoded in utf-8 with an ACE for backwards compatibility.

DC: how does the length limit issue affect idn?

PH: UTF-8 restricts length of non-English idn's.

OG: deployment problems due to forwarding or recursive servers -- some servers blindly copy those bits.

PH: Right.

MA: Broken servers are broken servers. Don't try to work with them.

JS: On length issues, Thai names are very very long.

DC: some length limit is a fact of life.

PH: yup.
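
The length concern can be made concrete with a small sketch. It assumes the existing 63-octet label limit applies to the UTF-8 form; the helper names are hypothetical.

```python
LABEL_MAX = 63  # octets per label in the existing DNS


def utf8_label_octets(label: str) -> int:
    """Number of octets a label occupies once encoded as UTF-8."""
    return len(label.encode("utf-8"))


def max_chars(octets_per_char: int) -> int:
    """How many characters of a given UTF-8 width fit in one label."""
    return LABEL_MAX // octets_per_char
```

Thai characters take three octets each in UTF-8, so under this assumption a label holds at most max_chars(3) Thai characters, against 63 ASCII characters.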

5. ICU (Hyewon Shin)

draft-ietf-idn-icu

Uses IN bit to identify queries
Use UTF-8 as wire format
Case folding/canonicalization before transmission

IN bit indicates whether the query is from an IDNS resolver/server or not and reduces the overhead of canonicalization

unicode as CCS
utf-8 as CES
all domain name queries should be encoded into Unicode before being used in resolvers.
resolvers convert the queries into UTF-8

Case folding is locale independent and is done before transmission, as indicated by the IN bit

Valid query formats are identified with the IN bit

JS: change the title of your internet draft -- calling it the architecture of internationalized domain names is misleading.

PH: you talk about case folding, but you don't talk about canonicalization of the more complex stuff (a+umlaut vs. a-umlaut). Will non-canonicalized names get passed to the resolver? canonicalization is not addressed. it would be done at the same place as case folding.

BS: the resolver does the UTF-8 encoding -- what is the application sending to the resolver? we assumed unicode.

DC: proposing the creation of a parallel DNS service? yes.

DC: do you discuss interworking with existing DNS? not yet.

6. Microsoft's approach (Stuart Kwan)

draft-skwan-utf8-dns-04.txt

Microsoft had a requirement to move people off WINS. WINS allowed the use of Unicode names.

KM: who imposed the requirement

SK: WINS didn't scale.

In Win2K client can initiate query with unicode name, resolver converts names to UTF-8 (does not downcase).

On the server side, database load downcases. On query, downcase and do a byte-for-byte comparison.

Very few changes since -00 draft. Win2K implementation hasn't changed.

Biggest flaw: there is no normalization. Not sure what is the best.
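
The behavior described above, and the flaw, can be sketched as follows. This is only an illustration of the idea; Python's str.lower() stands in for whatever downcasing the Win2K server actually applies, and the function names are hypothetical.

```python
def to_wire(name: str) -> bytes:
    """Client side: convert the Unicode name to UTF-8 without downcasing."""
    return name.encode("utf-8")


def server_key(wire: bytes) -> bytes:
    """Server side: downcase, then keep the bytes as the comparison key."""
    return wire.decode("utf-8").lower().encode("utf-8")


def same_name(a: str, b: str) -> bool:
    """Byte-for-byte comparison of the downcased UTF-8 forms."""
    return server_key(to_wire(a)) == server_key(to_wire(b))
```

Names that differ only in case compare equal, but a precomposed u-umlaut (U+00FC) and the sequence "u" plus combining diaeresis (U+0308) produce different bytes and do not match, which is the missing-normalization problem.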

Would like to be published as informational.

Michael Patton: make editorial changes to update the draft with what you've learned.

SK: there is a big emphasis to only use these names when absolutely necessary. But we'll update the draft as requested.

PH: What is WSALookupServiceBegin/Next? That doesn't exist in the draft.

SK: Application gives us unicode and we turn it to utf-8.

PH: so this sends utf-8 over the net.

SK: yes. this tends to be self-correcting.

PH: needs to be discussed in the document.

SK: OK.

BS: any experience with existing applications.

SK: userbase is too large to poll, but nobody has complained.

SK: Microsoft will implement the idn standard when ready.

7. IDNE (Marc Blanchet)

Until a month ago, no proposal using EDNS.

Rationale:

- use 8 bit
- use only one character set and encoding
- transformation on client side
- versioning control to adapt new chars, languages, etc.
- use standard dns extension mechanism

Description

- chars in labels are UTF8
- strings in labels are pre-processed by nameprep
- idn labels use EDNS extension

using edns,

- elt 0b000010 is used
- size of idn label
- idn label encoded in utf-8
- idne labels can be mixed with std13
- regular compression scheme is supported
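
A minimal sketch of how such a label might be laid out on the wire, assuming the extended label type occupies the low six bits of the first octet behind the "01" extended-label marker, followed by a one-octet size and the UTF-8 data. The exact field layout is an assumption for illustration, nameprep is not applied here, and encode_idne_label is a hypothetical name.

```python
ELT_IDN = 0b000010         # extended label type value quoted in the notes
EDNS_LABEL_MARKER = 0b01   # top two bits that mark an extended label


def encode_idne_label(label: str) -> bytes:
    """Encode one already-prepared label as marker+type, size, UTF-8 data."""
    data = label.encode("utf-8")
    if len(data) > 255:
        raise ValueError("IDNE label exceeds 255 octets")
    first = (EDNS_LABEL_MARKER << 6) | ELT_IDN   # 0b01000010
    return bytes([first, len(data)]) + data
```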

current maximums:

- label = 63
- dn = 255

idne maximum are:

- label = 255
- dn = 1023

rationale:

- utf8 encodes 1 i18n char up to 4 octets so multiply by 4
- idne udp packet size must support 1220 octets equiv to ipv6 minimal MTU
- sender must announce via OPT

- idn protocol version in opt pseudo-rr rdata field with option code
- this doc with nameprep defines v1 of idn
- permits idn revisions

idn api

- getipnodebyname and getipnodebyaddr specified in RFC 2553
- idn flag to be added
- no additional return codes seem to be needed

transition and deployment

- idne depends on edns
- need for an ace for short term?
- depends on speed of edns deployment
- v6 and dnssec require edns
- 2 protocols make things more complex. one can be chosen forever
- names defined in the ACE must be represented in IDNE

Enhancements?

Yergeau proposed major and minor revision numbers

- minor for incremental table changes that do not require new algorithms so no code change, just load a new table.
- major for major revisions that need code change

Language tagging?

Compression needed?

MA: extending total overall length of a name is problematic.

MB: yes but application must be IDN aware.

JS: language tagging using plane 14?

MB: yes, since using edns give more space.

Are you using stateful encoding?

MB: No.

OG: Very good first start. Since you use EDNS, you only use modern servers and you can determine if downstream servers can work with EDNS.

DC: Your statement that near term may replace the long term is very insightful. A lot of pressure now to deploy.

MB: Yes.

8. Name Preparation (Paul Hoffman)

draft-ietf-idn-nameprep

Requirements:

- output of a single unambiguous string given an input
- lets the user enter anything that might look right to them
- typical user should be able to follow logic of preparation

current order:

- check for prohibited input (many)
- fold case
- canonicalize with normalization form KC

Possible alternative

- check for prohibited input (a few, just for case)
- fold case
- canonicalize with normalization form KC
- check for prohibited output
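
Both orderings can be sketched with Python's standard library. This is an illustration only: the PROHIBITED set is a placeholder and not the draft's table, and str.casefold() and NFKC from unicodedata stand in for whatever folding and normalization tables the draft specifies.

```python
import unicodedata

# Placeholder set for illustration; the draft's prohibited list is not shown here.
PROHIBITED = frozenset({" ", "\u00a0", "\u200b"})


def check(text: str) -> None:
    """Reject text containing any character from the placeholder set."""
    for ch in text:
        if ch in PROHIBITED:
            raise ValueError(f"prohibited character U+{ord(ch):04X}")


def prepare(label: str, check_output: bool = False) -> str:
    """Check input, fold case, normalize (KC), optionally check the output."""
    check(label)
    folded = label.casefold()
    prepared = unicodedata.normalize("NFKC", folded)
    if check_output:
        check(prepared)
    return prepared
```

The difference between the two orderings shows up with a character such as EM SPACE (U+2003): it is not in the placeholder input list, but form KC turns it into an ordinary space, which only the output check catches.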

open issues

- prohibit on input or output?
- allow characters that would be ignored, e.g. hebrew vowels are optional?
- include folding that is specific to the language of the name (not just script used)? How would language information be known?

KM: locale specific feedback mechanism may imply the DNS is simply unsuitable to do internationalization.

PH: Yes. Currently no documents xxx

Where do we do name preparation?

3 places possible:

- application
- resolver
- dns service

There are reasons to do it in each and really good reasons to not do it in each. document is neutral.

TH: seems hard to get the error conditions out of the 4 step model. Is there any other group that can solve this, since we don't have the expertise to do this?

PH: No one has stood up to this task.

JK: There are a few organizations who have looked at this and run away.

PH: individuals at those organizations have indicated they'd help

DC: dns service will guarantee failure -- there will be enough infrastructure change that it'll take years to deploy. What is the goal? DNS has no semantics on the strings. Probably in the resolver.

JS: reverse logic -- forbid characters by default, permit specific characters.

PH: they are equivalent.

JS: easier to check what you want than what you don't want.

=====================================================================

1. Using DNS for Canonicalization data

- IDN working group has no consensus on how to apply local canonicalization rules.
- Unrealistic for all systems to have all local rules
- Dynamically learning rules is desired.

Items that should be defined as local canonicalization rules:

- list of characters that can be used in internationalized domain name labels, with case folding or mappings
- common normalization/canonicalization rules to be adopted
- order of normalization/canonicalization rules
- How to get this information: use the DNS
- Provide the mappings via the DNS

Advantages:

- Rules can be administered by domain authorities
- version of rules can be controlled via serial number
- caching effect works

Disadvantages:

- increases dns queries
- hard to adapt to intermittently connected sites
- CJK has lots of data
- simple rules only

How to provide:
define usable characters as txt rrs
define meta information as txt rr

- normalization rules
- version of normalization rules

use idn.arpa domain for table definition

norm-form - early normalization method name
norm-form-version
norm-form-url

"." - a character the same as the label
".^" - a character the same as label but not allowed as the first (e.g., '-')
".$" - same as above but can't be used as the last
"a" is the character itself

how it works:

- query for tld.idn.arpa
- if fails, no rule adopted
- look for common normalization method and its version
- look up each character
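
The per-character lookup can be sketched with an in-memory table standing in for the TXT answers. Everything here is hypothetical: the RULES contents, the reading of ".^" as "not allowed first" and ".$" as "not allowed last", and the function names. A real client would issue DNS queries under tld.idn.arpa and would also fetch the normalization method and version, which this sketch omits.

```python
from typing import Optional

# Hypothetical stand-in for TXT data published under one TLD's idn.arpa zone.
RULES = {
    "a": ".",    # allowed as itself
    "b": ".",    # allowed as itself
    "A": "a",    # maps to "a"
    "-": ".^",   # allowed, but not as the first character
}


def lookup(ch: str) -> Optional[str]:
    """Stand-in for one DNS query per character."""
    return RULES.get(ch)


def canonicalize(label: str) -> str:
    """Apply the per-character rules to one label."""
    out = []
    last = len(label) - 1
    for i, ch in enumerate(label):
        rule = lookup(ch)
        if rule is None:
            raise ValueError(f"{ch!r} is not allowed in this zone")
        if rule == ".^" and i == 0:
            raise ValueError(f"{ch!r} may not be the first character")
        if rule == ".$" and i == last:
            raise ValueError(f"{ch!r} may not be the last character")
        out.append(ch if rule.startswith(".") else rule)
    return "".join(out)
```

The loop makes the cost visible: one lookup per character, which is the query-volume issue raised in the notes.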

Issues:

- what are internationalized TLDs
- need special methods to canonicalize TLDs themselves if they were not ASCII-only
- reducing the number of queries
- each character generates a query
- examination of meta labels
- escape syntax to use symbols in ASCII as labels

Using DNAME to reduce queries

- by folding characters into a sequence, queries can be reduced

add DNAME and CNAME for each canonicalized character
use DNAME instead of TXT

Issues for this method:

- both servers and clients must be able to use DNAME
- servers for IDN.ARPA must be recursive
- does not resolve overhead
- restriction on the number of aliases
- easily exceeds packet size limits
- need EDNS0
- clients must analyze response
- moves overhead

OG: both DNAME and CNAME are terminal nodes. Also CNAME can't be used with anything else but security records. You assume servers don't do case folding but they do. DNAME can't point up in the hierarchy.

LJL: suppose I want to register a name in .JP. Rules are connected to the TLD and not to the language -- the rules should be connected to the language not the nation. Rules are fixed per name.

Using TXT: won't the resolver get confused? IDN.ARPA being recursive won't work. How does this work with DNSSEC?

YY: TLD defines the rules for registration.

JS: can we move it to the mailing list? The TLD defines what characters are valid.

HA: what is the advantage of this per-character approach? Why not put posix local def into a domain?

YY: DNS is not the best method of doing this, but this is used for the DNS, so only the DNS is being used.

HA: revise the proposal without storing character data in the DNS.
Also think about how this works with clients that do not have the code and what the size of the client code will be and how often you expect the client to upgrade. I think this approach has some good things, needs more discussion.

2. Han ideograph for IDN (James Seng)

why I did this draft:

- lack of information on han ideographs (HI) in IETF
- HI is very complex
- over 103,000 characters, each having their own pronunciation, etc.
- draft also talks about CJK
- encourage discussion and encourage others to write drafts on their scripts

HI are CJK composed of radicals which are made of simple strokes

HI originated from China

HI commonly used in China Japan Korea Taiwan Hong Kong Singapore Malaysia

Case folding:

- conversion between ideographs can be done in various ways.
- character based == word to word
- lexicon and context based == translation related issues

- Unicode does CJK unification
- 27,786 HI in Unicode
- Unicode normalization & canonicalization defined in UTR15 and 21, but handling is limited

zvariants are HI which share the same etymology but the glyph varies in some minor way -- should be considered equivalent

Chinese:

- Originated from pictographs, but not all ideographs are pictographs
- because of origin, each HI has a meaning
- Chinese was simplified in 1950 (simple), original known as traditional

there are 2244 SC in last official count and Unicode has 2145

there are multiple TC for one SC. SC-to-TC is almost impossible (need context information)

TC-to-SC may be workable -- may not be perfect, but it can work.
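
The asymmetry can be shown with a tiny mapping table. The three entries below are well-known traditional/simplified pairs chosen for illustration; a real table would be far larger, and the function names are hypothetical.

```python
# Traditional -> simplified; two traditional characters share one simplified form.
TC_TO_SC = {
    "\u9aee": "\u53d1",  # 髮 (hair)       -> 发
    "\u767c": "\u53d1",  # 發 (send, emit) -> 发
    "\u570b": "\u56fd",  # 國 (country)    -> 国
}


def fold_tc_to_sc(label: str) -> str:
    """Code point to code point folding; characters not in the table pass through."""
    return "".join(TC_TO_SC.get(ch, ch) for ch in label)


def sc_to_tc_candidates(ch: str) -> list:
    """All traditional characters that fold to the given simplified one."""
    return sorted(tc for tc, sc in TC_TO_SC.items() if sc == ch)
```

Folding TC to SC is a plain table lookup, while going the other way yields two candidates for 发, and choosing between them needs context.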

TH: we should do code point to code point because mapping will make things far too hard due to the need for contextual information.

JS: please read the draft -- just discussing the issues.

TC and SC aren't usually mixed.

TH: not true.

SC and TC are seldom used in the same phrase. You can solve mapping using CNAME and DNAME.

Korean:

Hangul is more commonly used now instead of Chinese derived characters. Hangul doesn't have meaning like Chinese ideographs.

Have their own ideographs with simplified forms.

Japanese:

Kanji, Hiragana and Katakana. Kanji is based on Chinese. Hiragana is a syllabary.

Japanese in written form is a vocal script which maps how it is pronounced fairly accurately. Most verbs and nouns are written in Kanji.

Depending on context, pronunciation may be different. Conversion between hiragana and kanji is not practical.

Has their own ideographs (kokuji) with simplified forms.

Ideograph Description:

The same characters can be constructed in multiple ways.

Mechanism:

HI may or may not be folded for the comparison of domain names.
Folding may occur at

- DNS clients or by user agents
- DNS servers
- registration time

In particular, folding during registration time is critical for operational reasons even if we do not adopt any Han folding.

HA: to summarize, one of the real problems is that when someone presents a domain name, they feel they have the right to all the variants, e.g., a chinese name can be represented with different characters in Korean chinese chars and Japanese chinese chars.

PH: saying people will expect certain things. We shouldn't listen to what will be legal or not -- just focus on languages.

3. DNSII-MDNP (Edmon Chung, David Leung)

written an internet-draft, but missed deadline.

been working on this for more than a year

goal: put all the characters into the internet

- must pass the business card problem
- must not break anything
- must be flexible
- less impact on the client -- pain should be in the resolver

dnsii protocol has two parts

- identifier
- inserted before a label
- first two bits, use of the bit sequence "10"

Prefer 10 over edns (01) since the 3rd and 4th bits are left for future expansion. EDNS reduces possibilities for future expansion.

- packet label
- a 12 bit number used to determine the encoding scheme
- valid encodings specified in a list.
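
One way to read the description above is a 16-bit field: the two-bit sequence "10", two bits reserved for future expansion, and a 12-bit encoding number. That layout is an assumption made for this sketch (the notes do not spell out the exact bit positions), and the function names are hypothetical.

```python
import struct

MARKER = 0b10  # first two bits, as described in the notes


def pack_identifier(scheme: int) -> bytes:
    """Pack marker (2 bits), reserved (2 bits, zero), scheme number (12 bits)."""
    if not 0 <= scheme < (1 << 12):
        raise ValueError("scheme number must fit in 12 bits")
    value = (MARKER << 14) | scheme   # reserved bits 3 and 4 stay zero
    return struct.pack("!H", value)


def unpack_identifier(data: bytes) -> int:
    """Return the 12-bit scheme number, checking the two-bit marker."""
    (value,) = struct.unpack("!H", data[:2])
    if value >> 14 != MARKER:
        raise ValueError("not a DNSII identifier")
    return value & 0x0FFF
```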

there should be no ambiguity

from RFC 2277: all protocols must identify for all character data which charset is in use.

compression and edns will still work as expected

should not require any adjustment to dnssec or ipv6

charset encoding:

- uses iso10646
- flexible to encompass other encoding schemes
- all legal symbols

canonicalization:

- applications should
- servers must
- use form C recommended

Han folding is similar to treating color and colour identically.

we have working code.

this approach is patented

HA: nameservers everywhere must be able to convert between all 400 character sets, right?

An implementation decision.
HA: What do you do when it encounters a character set it doesn't understand?

The fall back should be back into UTF-7, if still can't be found, return an error.

HA: So the client must know how to convert?
No. Take it to the mailing list.

OG: two observations. I strongly encourage everyone to use EDNS label types, which allow clients to discover capabilities on the server. Don't worry about saving a few bits. Think more about how ideas should be expressed.

DC: One of the considerations of EDNS we worry that the second byte is a second count.

OG: send a note to namedroppers on protocol issues.

PF: two questions. First, to re-emphasize what Harald says: from the email world, it is best to use as limited a set of characters as possible. Do client side conversion into simple character sets. Second: what about future character sets -- you will have to fall back all the time.

DC: We encourage the use of Unicode.

PF: For this to not become a local solution it should be done as close as possible to the client.

4. Evaluation of proposed Encodings for IDN (Yasuhiro Morishita)

mDNKit -- multilingual domain name evaluation kit
objectives:

- evaluation of technology
- promotion of standardization
- technical contribution

Developed by JPNIC, released Jul 13

components:

- dnsproxy server
- codeset converter
- common library for handling multilingual domain names
- patches for 8-bit clean bind and tools to override unix shared libraries

evaluated drafts:

- race
- skwan-utf8
- jseng-utf5

evaluation points
limitations of usage

- name length
- interoperability with dns
- ease of operation
- usage of multilingual domain names

race and utf5 are ascii compatible encoding

- available hostname length shorter than utf-8
- compression but needs all characters in a single row

utf8 has incompatibilities with present DNS

- hostname not 8bit clean
- applications use 'checknames'

ace strings need identifier to distinguish from normal ascii strings.
-- uses ra--

utf-5 uses zld

charset encoding/conversion tools are essential

race currently best method suitable for transition
utf8 incompatible with current dns

todo:

- interop testing
- evaluating more drafts

LM: Did you evaluate cut and paste? In most of these systems cut and paste doesn't work very well. Did any of these systems have usable cut and paste?

YM: race is the best current method for cut and paste since it uses only ascii.

LM: if you have a display method that show race in ASCII and you cut/paste it how do you have interoperability between application and display?

JS: properly implemented system will use MIME on the paste.

5. NuBIND Implementation (Bill Semich)

original goal was to internationalize BIND
maximum support for internet standards

rfc2277, 2279, iso-10646, Unicode UTR15 and 21

3 components:

- rdns
- lute (transitional)
- uvce

JK: intellectual property?

BS: not submitting this as an IETF submission. "nubind" name is trademarked.

Current implementation status:

- nubind operational for 8 months
- 5 in second-level domain 'eu.nu'.
- 3 in a slave server for .NU
- many current and potential external implementation problems that will need to be dealt with

mail servers, others unexpectedly fail
legacy dns servers

security considerations

- ssl and x.509 certs must be modified to work with 8 bit dn
- inverse lookups

application problems

- browsers no standard support

client environment problems
dhcp server configuration
host client with an idn resolver setting
all current unix resolvers support ascii only

http server problems

postpone implementation of idn in the DNS until minimal impact standards or alternatives are accepted
minimize impact on DNS

Why use the internet infrastructure to achieve application goals?

JS: will you submit an ID?

BS: Can submit

TN: need to remove copyright notice in presentation.

BS: take it offline.

6. Comparison of IDN proposals (Paul Hoffman)

draft-ietf-idn-compare

wrapup of technical presentations. talking about comparison doc.

Basic idea of doc is to describe significant features that we need to think about. Includes pros and cons and what features are really needed.

sections of docs

- architecture
- names in binary
- names in an ACE
- prohibited characters
- canonicalization
- transitions
- root server considerations
- security considerations

arch:

- just send binary
- send binary or ace
- just send ace

will be updated to add details about what is sent between app<->resolver, resolver<->server, server<->server

names in binary

- utf-8 or labeled charsets
- distinguishing binary from current format

will be updated to add where the different markings will be used

names in ascii

- format
- how to distinguish ace

prohibited characters

- identical or near-identical characters
- separators
- non-displaying and non-spacing characters
- private use characters
- punctuation
- symbols

BM: had a machine called <ctrl-s>. There is a distinction that needs to be made between hostname and domain names.

PH: Yes.

MA: reference RFC is 952, not 1035

JS: lot of confusion on this issue.

HA: this is in the requirements doc

canonicalization

- type of canonicalization (normalization form C or Form KC)
- other canonicalization (case folding (ASCII/non-ASCII), han folding) if you want a good description, see the current Unicode standard.
- where is canonicalization done
- location of canonicalization may determine how quickly idn can be deployed.

transitions

- always do current plus new architecture
- transition period

draft will be updated to add specific details for transition and what needs to be transitioned?

EN: are there drafts that talk about transitions?

PH: No. Drafts on transitions do not need to be associated with a proposal.

PF: need clear distinction between Unicode consortium work and this WG.

PH: part of transition will include how to get groups outside the US to transition with us.

ISO has groups which determine code points in the repertoire. Unicode consortium does not add code points to ISO standards, ISO does. As such, we don't need to liaison with ISO -- just need to be aware of what ISO does in this space.

Root server considerations

- don't want RSes to blow up
- how quickly can we have real IDNs in the TLDs

JS: RS ops worried about operational implications, e.g., how the RS op will verify data is correct.

Security considerations:

- don't want to reduce general security of the DNS
- biggest issues include IDN names in digital certs and name spoofing

Expected changes:

- add ideas from new drafts
- add comments about the drafts from the list
- give more detail on canonicalization
- additional details about effects on apps, resolvers, and servers

please specify categories

- will help readers see which parts of IDN the draft covers

please talk about this on the mailing list

draft will be updated within a few weeks

BS: might be important to list patent and IP issues.

PH: maybe. will defer to the AD.

EN: we already have a process to do this.

PH: might be worthwhile to list IPR IETF has been notified of.

MB: I can put it on the website

EN: use a generic notice, not a listing of IPR

7. Working Group Next Steps (Marc Blanchet)

Requirements doc

- contentious issues? incomplete?
- ready for WG last call?
- RFC informational?

Need minor revisions. Pretty ready to move to last call.

EN: there are some items that need to be resolved.

HA: need to look at Keith's comments. The chairs/author should declare that the comments on the draft should be 'identify problem, old text, new text'. No comments will be accepted that aren't in this form. Have a hard deadline (2 weeks).

MB: document editors agree?

JS: yes.

MB: OK.

Comparison document

- keep it going by enhancing it
- RFC informational?

JS: should consider transition period.

HA: could discuss the transition properties without a proposal, but mechanisms will depend on proposals.

MB: wg agreement to keep it going.

3 types of solutions

- do not change the DNS, application layer solution
- do change the DNS
- directory based solution (with or without changes to the DNS)

Want one protocol at the end.

Convergence process:

- have all current authors work together on a converged solution?
- use comparison document as the seed doc for discussion

EN: list 3 solutions, but no proposals to converge. should focus on the dns proposal

OG: good way to proceed. might be better to 'cherry pick' from all the proposals. maybe use authors as the design team.

JK (pretending to be Zita): she wants to reiterate taking discussions to the list. Also agrees with Harald's proposal.

HA: in the solution space, trying to converge on the best dns based solution. if we can't come up with a solution that meets the requirements or can't be deployed, then we look at other approach.

BS: a long term solution may be more appropriate to look at than short term.

JS: other solutions not using the DNS may exist, e.g. directory based solution.

Everyone agree to the process? Need to set up the design team -- please send mail to MB.

AB: Aaron Brunner
AG: Andreas Gustafsson
BM: Bill Manning
BS: Bill Semich
DC: Dave Crocker
EN: Erik Nordmark
HA: Harald Alvestrand
KM: Keith Moore
JK: John Klensin
JI: John Ioannidis
JS: James Seng
LJL: Lars-Johan Liman
LM: Larry Masinter
MA: Mark Andrews
MB: Marc Blanchet
OG: Olafur Gudmundsson
PF: Patrick Falstrom
PH: Paul Hoffman
TH: Ted Hardie
TN: Thomas Narten

Slides

Comparison of Internationalized Domain Name Proposals
DNSII-MDNP
Evaluation of Proposed Encodings for IDN by mDNkit
Han Ideograph for IDN
Architecture of Internationalized Domain Name System
IDN Using EDNS (IDNE)
Name Preparation in IDN
WG Next Steps
RACE: Row-based ASCII Compatible Encoding for IDN
(draft-skwan-utf8-dns-04.txt)
Using DNS for Canonicalization Data
Using Universal Character Set Data in the DNS
IDN WG Update