Site Multihoming in IPv6 (multi6) Charter

2.4.10 Site Multihoming in IPv6 (multi6)

NOTE: This charter is a snapshot of the 61st IETF Meeting in Washington, DC USA. It may now be out-of-date.

In addition to this official charter maintained by the IETF Secretariat, there is additional information about this working group on the Web at:

Additional MULTI6 Web Page

Last Modified: 2004-09-22

Chair(s):

Brian Carpenter <brc@zurich.ibm.com>
Kurt Lindqvist <kurtis@kurtis.pp.se>

Operations and Management Area Director(s):

Bert Wijnen <bwijnen@lucent.com>
David Kessens <david.kessens@nokia.com>

Operations and Management Area Advisor:

David Kessens <david.kessens@nokia.com>

Mailing Lists:

General Discussion: multi6@ops.ietf.org
To Subscribe: majordomo@ops.ietf.org
In Body: subscribe multi6
Archive: http://ops.ietf.org/lists/multi6

Description of Working Group:

A multihomed site is a site that has more than one connection to the
public internet with those connections through either the same or
different ISPs. Sites choose to multihome for several reasons,
especially to improve fault tolerance, perform load balancing, etc.

Multihoming today is done largely by having a site obtain a dedicated
block of address space and then advertising a route for its prefix
through each of its ISP connections. The address block may be from the
so-called provider independent space, or may be a sub-allocation from
one of its ISPs. A site's ISPs in turn advertise the prefix to some or
all of their upstream connections and the route for the prefix may
propagate to all of the routers connected to the default-free zone
(DFZ). As the number of sites multihoming in this manner increases, the
number of routes propagated throughout the DFZ increases and overall
routing stability decreases because of the burden on convergence
time. This WG will seek alternative approaches with better scaling
properties. Specifically, the WG will prefer multihoming solutions
that tend to minimise adverse impacts on the end-to-end routing system
and limit the number of prefixes that need to be advertised in the
Default-Free Zone (DFZ). Just as sites have multiple reasons to choose
multihoming, they may have multiple reasons to want to provide these
benefits more directly to hosts within their sites, for instance,
because some of those hosts may have network stacks capable of
detecting and surviving a provider/prefix change. Phasing in hosts with
capabilities of multihoming might be part of the Multi6 solution space.
In the course of this work, the WG may also study the deeper underlying
questions of identity and location of services, hosts and sites as they
directly affect multihoming. However, the working group is not chartered
to make significant changes to the nature of IP addresses or to
inter-domain routing.

This WG will consider the problem of how to multihome sites in
IPv6. The multihoming approaches currently used in IPv4 can of course
be used in IPv6, but IPv6 represents an opportunity for more scalable
approaches. IPv6 differs from IPv4 in ways that may allow for
different approaches to multihoming that are not immediately
applicable to IPv4. For example, IPv6 has larger addresses, hosts
support multiple addresses per interface, and relatively few IPv6
address blocks have been given out (i.e., there are no issues with
legacy allocations as in IPv4). Also, IPv6 deployment is at an early
stage, so modest enhancements to IPv6 could still be proposed.

The WG has already produced a document, RFC 3582, on goals for IPv6
site multihoming architectures. It is recognised that this set of goals
is ambitious and that some goals may conflict with others. The
solution or solutions adopted may only be able to satisfy some of the
goals presented there.

The WG will take on the following tasks:
========================================

Produce a document describing how multihoming is done today in IPv4,
including an explanation of both the advantages and limitations of the
approaches.

Produce a document outlining practical questions to be considered
when evaluating proposals meeting the RFC 3582 goals, including
questions concerning upper layer protocols.

Produce a document describing the security threats to be addressed
by multihoming solutions.

Solicit and evaluate specific proposals for multihoming in IPv6
(both existing and new), extract and analyse common architectural
features, and select one or a small number of proposals for further
development. The architectural analysis will include applications layer
considerations and transport layer considerations, as well as lower
layer issues.

Development of specific solutions will require chartering of work
in the appropriate Area or Areas.

Goals and Milestones:

Done		Goals for a multihoming solution as RFC
Done		Final solicitation of proposals
Done		Begin architectural evaluation of proposals
Done		First draft of architectural evaluation
Oct 04		Submit informational I-D to IESG on how multihoming is done today
Oct 04		Submit informational I-D to IESG on security threats
Nov 04		Submit informational I-D to IESG on architectural evaluation
Dec 04		Identify proposal(s) for further development, recharter
Jan 05		Submit informational I-D to IESG on practical questions

Internet-Drafts:

draft-ietf-multi6-v4-multihoming-02.txt

draft-ietf-multi6-things-to-think-about-00.txt

draft-ietf-multi6-multihoming-threats-01.txt

draft-ietf-multi6-architecture-02.txt

Request For Comments:

RFC	Status	Title
RFC3582	I	Goals for IPv6 Site-Multihoming Architectures

Current Meeting Report

Multi6 notes, IETF 61, Nov. 9 2004 - Session 1
----------------------------------------------
Notes taken by Tom Henderson
Jabber log by George Michaelson

Meeting opened 9:02am.
Chaired by Kurtis Lindqvist and Brian Carpenter

Brian Carpenter (BC): looking for a jabber scribe?
BC: Tom Henderson will take the minutes
BC: Who will be our jabber scribe today?
George Michaelson (GGM): I can do jabber.
BC: please, Roy Brabson will also help
Kurtis Lindqvist (KL): Agenda review. Meeting continues tomorrow-- may have to cut some time. Close tomorrow on the next steps for the working group.
KL: Reminder to note the terms and conditions of IETF meetings and contributions, and signing blue sheets.
KL: Starts the agenda. Notes that we may have to cut the line, save discussions for tomorrow.

** draft-ietf-multi6-multihoming-threats-01.txt
i) threats documents-- did have some comments.
BC: IESG made mistake and looked at previous version of the draft, but most comments relevant anyway, and new version addresses them. Will run through group as mini-repeat last call, but nothing substantial?
Erik Nordmark (EN): No substantial changes.
KL: so short last call!

** draft-ietf-multi6-v4-multihoming-02.txt
KL: It was observed that it won't pass id-nits check due to lack of index.
BC: One substantial comment received: is it worth publishing? Any comments on this question?
Elwyn Davies (ED): Needs serious rework, if it is to go out at all. Can't deal with upstream failures, not on the local link.
John Loughney (JL): Would be a good document if editorial could be cleaned up.
ED: Offers to help to edit language and get remaining issues fixed.
BC: Will send out for new last call after that.

** draft-ietf-multi6-things-to-think-about-00.txt
KL: As far as I know there is a new version coming, comments being factored in.
Eliot Lear (EL): Pekka Savola and Tim-- only comments received-- specifically clarification on what problem are you trying to solve, piggybacking issues, IPv6 renumbering, and Pekka wants more detail on DNS. Wants people to spend more time and see whether and how the document is useful to them.

** draft-ietf-multi6-architecture-02.txt
Geoff Huston (GH): I got one comment from Pekka Savola, and one from John. They were the same and are close to being solved. I will resubmit an -03.
BC: Will do mini-last call on all four documents. One week repeat last call.

2. General report from design team (Erik Nordmark)
===================================================
EN: We met twice in San Diego, to get common understanding. We have then had email discussions, strawmen. We also met for 2 full days in Manchester/RIPE.
EN: The forced deadline and writeup pressure kept us away from contentious issues.
EN: We have tried to accomplish:
 - Minimal/no dependency on DNS (working for hosts without FQDN).
 - Tried to allow application referrals to work. Maybe not optimally.
 - 'Good enough' security.
 - Time-shifting attack avoidance.
 - We wanted to think about privacy.
 - IDs in v6 addresses.
EN: We acknowledge we can't solve mobility, but the approach is extensible to it.
EN: We also wanted to (perhaps controversial?) avoid hard-coded /64 subnet boundary.
EN: Things that we assumed:
 - Ingress filtering can make it hard to go out to ISPs. We want to cope with that with interactions between routing, filtering.
 - We assumed DNS has its own idea of 'redundancy' with multiple DNS recs. No known circular dependencies.
EN: Things still needed are:
 - Congestion control. Test for multiple locator at both ends, cartesian product is the worst-case test. Can we parallelize?
EN: Things we didn't explore in depth:
 - No packet formats in draft. Not even how many types.
 - Nothing on state management. How to remove state? When? Coordinated e2e?
EN: We also discussed non-reachable locators used as ULIDs, e.g. ULAs.
I believe nothing in the approach developed prevents this, but we don't have all the pieces to do it.
EN: We can use registered form of ULAs. People that use them can decide to populate reverse DNS, add capabilities, given ULA find all locators via DNS.
EN: One optional thing. Can we handle non-/64 boundary stuff? Shorter or longer prefixes? Believe it can be done. But it is not clear if the community cares about or supports it.
EN: The next steps for the design team are:
 - Drafts in-hand,
 - Team should cease to exist.
 - Maybe an editing team.
 - Do stuff on WG ML.
EN: Next steps are not clear, chairs discussion needed on what to do with loose ends, things like that and specific questions.
Iljitsch van Beijnum (IvB): What's this dnsop unreachable issue?
George Michaelson (GGM): I have a draft in DNSOP about not publishing unreachables. This needs liaison.

3. Erik next presented the first design team draft: Multihoming L3 Shim Approach draft (draft-nordmark-multi6dt-shim-00.txt)
========================================================================
EN: This is an overview. It does not introduce a new namespace. Where does the shim go? Deferring context, assumptions about DNS, receive side demux, and open issues.
EN: Define upper-layer identifier, ULID, as a 128 bit quantity. It is not well defined, but it is useful to have the term.
EN: Locators are not different to what we do today. ULID seen by TCP, UDP as well as applications.
EN: Placement of L3 shim?
 - Above IP endpoint layer, below frag, IPSEC.
EN: Has to be above first hop router, i/f logic.
EN: This is not written down, but is clean split between these functions.
EN: Benefits are, things above can operate based on ULID. Do re-assembly on them. IPSEC security association bound on them. When we rehome, security assoc still there.
EN: The disadvantages are that the shim needs to insert bytes in packet. This implies needs to make that visible to upper layer proto for fragmentation issues.
EN: Deferred context establishment
EN: 3 events occurring at different times.
 i) Initial contact.
 ii) Deciding to set up multi6 context state. Based on local policy-- port numbers, #packets sent, etc.
 iii) Rehoming the connection after a failure.
Also need to handle failures during the initial contact, which is the base case-- punt to the application layer to try different ULIDs; possible to optimize by having shim do something.
EN: We need to handle failures during initial contact. Multiple handles for local peer, first attempted connect doesn't work. The shim draft talks about easiest case: punt to application.
EN: Assumptions about the DNS
EN: none. (!)
EN: FQDN may be for service or host. DNS returns set of potential ULIDs. Try one until we find a working locator. This is not necessarily a locator used for failover (could be 3 different hosts). When we set up Multihoming context, find out 'other locators, not ones found in DNS'.
EN: But we want to optimize initial contact failure issues. This means we need to be aware of multiple host issues.
EN: Receive side demux issue
EN: Some people call this approach NAT.
EN: Packets going from one upper-layer protocol to other, sending side, on a failure IP addresses get changed. There is a need to reset receiving side before passing to upper layer.
EN: To do this, need to avoid being confused! Example was discussed. ULID A starts to communicate with two hosts ULIDs B and C, then discovers they are the same. Then locator B fails. Now, send packets from locator A to locator C. Some need rewrite to ULID B, some don't.
EN: Things upper layer explicitly sent to B and C have to be separated. How to make sure enough info to correctly re-write things.
Pekka Savola (PS): Two clarifications (one from previous slide)
 i) I think you have some assumptions about how you deal with DNS for referrals, right?
EN: Maybe confusing things. There is a potential to do things differently with registered ULAs.
We could use DNS for some things or have applications using DNS for referrals without that. Applications which operate on domain names are part of reason to put stuff up here, to contrast with other things, like NOID.
PS: Yes. Basic service doesn't have assumptions on DNS. Great. But there might be underlying hidden assumptions about stuff.
BC: There is a draft on this nontrivial application layer issue. We can discuss this tomorrow.
PS: I think one thing that needs explicit consideration is whether receiving ICMP errors from network is sufficiently satisfied as special case of referral.
EN: I have slide on that. This was also brought up on the mailing list.
EN: RSD: prevent receive side confusion
EN: Only use locator with single ULID.
EN: This means that a host with 3 prefixes would have 3 prefixes and 9 locators.
EN: During initial contact we have fallback opportunities, during dialogue, we need six additional locators, these are locators ONLY, and never used as ULID. They uniquely identify the ULID.
EN: Just looking at locators, addresses sufficient to map.
EN: RSD carry additional info
EN: e.g. call it context tag. Something to help identify. Where does it go in the packet? Believe can use flow-id. Or as a new extension header?
EN: After rehoming, failures, add ext header, minimum after size/alignment rules is 8 bytes. Please send comments to the mailing list.
EN: Open issues are:
 - Carrying flow label across the shim.
 - Hard to understand constraints due to overloading.
 - We need to discuss checking against things to think about.
EN: We also need to discuss Pekka's issues with ICMP error demux. Erik thinks less severe problem than Pekka.
EN: We need more clarity on ULAs. Our intent is to study centrally assigned ULAs. These may make it easier for applications. Erik's personal opinion is that we cannot support only these. We will need non-central as well.
EN: Handling unreachable ULIDs. Two issues. Registered (in reverse DNS), are they permanently unreachable?
During initial context establishment, using permanently unreachable from most of internet (most unique locals) then should know, use other ones.
End of presentation.
BC: Encourages people at the microphone to stick to clarification issues, and keep discussion for tomorrow.
Christian Huitema (CH): Very nice piece of work. One question: you propose using locator as ID for flow-- we actually have some experience at Microsoft and ran into problem-- what happens if locator is only on loan? You move to a place, they give you an address, you move on, then address is given to someone else.
EN: These addresses are not permanent-- for how long are these addresses assigned to you?
BC: I am not sure. This is on the boundary of mobility/multihoming.
EN: These addresses are not permanent-- for how long are these addresses assigned to you?
CH: Using temporary locator to maintain long duration connection creates an issue-- should be listed as such.
EN: Normally, address would become unreachable, but here we are making it explicitly bound in the TCP state.
EL: This is crux of Tim's issue that he wanted to add to things-to-think-about. Can't get into situation where we always sweep it under the mobility rug.
BC: After next talk, ask question again.

4. Marcelo presents next draft: Hash Based Addresses (draft-bagnulo-multi6dt-hba-00.txt)
========================================================
Marcelo Bagnulo (MB): Presents security issues, and threats in Erik's draft. Hijack/flooding issues. But focus on hijacking for now.
MB: Have approaches to deal with hijacking. This is based on using cookies (time shifting attacks issue) or PK crypto (seems overkill).
MB: Characteristics of HBAs
MB: Resulting set of HBAs is static. Can't add new ones.
MB: Main idea is to include info about multiple prefixes in addresses themselves. HBAs are a hash of interface plus a random number.
MB: Discusses example of two host dialogue. Then he loses one path, can check validity of new dialogue via HBA.
HostA in multihomed site: P1&P2
Generate HBAs for HostA
 - iid = Hash(P1|P2|rand)
 - Addresses for HostA: P1:iid and P2:iid
 - Host communicates set of prefixes and rand to peer (can be in clear text)
 - If failure occurs, communication rehomed to P2, because peer can verify that P2 belonged to the hash-- this is the main idea
MB: Shows flow/event sequence
MB: Compatibility with CGAs
MB: Use same interface id bit. SeND uses CGAs. Interesting not to impose requirement to use ours or theirs.
MB: CGA based multi6 can do dynamic cases. Useful for renumbering. If they are compatible, we can support both.
MB: We can define HBA as CGA extension
MB: Resulting address types
 i) CGA-only addresses
 ii) CGA/HBA addresses
 iii) HBA addresses
MB: Discussing hash/generation sequence, some consideration of privacy outcomes.
MB: HBA set verification, be sure the prefix included is the proper one. This is verified by running CGA process.
MB: Security considerations
MB: How to attack. Generate CGA structure with own prefix. Then find a modifier such that it matches H1 under hash test. O(2^59) or O(2^(59+16*Sec)).
MB: For this a brute force attack is required.
MB: Privacy considerations.
MB: There is no fixed identifier on low 64 bits.
MB: The first implementation of HBAs is by Francis Dupont (ENST), feedback on ML.
End of presentation.
CH: Marcelo, one point of extension that I want to understand. If address is CGA address, do I still need to hash all prefixes, know in advance to be secure?
MB: If just CGA you don't need that. You can build only using CGA.
CH: In signalling done negotiating locators, can use either CGA or HBA?
MB: No. It is an idea, and could be an option, but idea here is to generate HBA addresses. If move between ones made in HBA, we can use without PK crypto. If we want to add new ones, have to move to PK crypto, use CGA.
CH: Idea is to stay within HBA, we can use simple hash method. If move outside that site, constrained to use CGA.
MB: We can have one address, but stay within prefixes, can use properties of HBA. But new prefixes have to use CGA. Don't have to change address but verification method is different in one case to other.
CH: I did not understand from the presentation. Secondary hash for null bits. I don't understand applying that? How to do that without any reference to PK.
MB: Security parameter increases amount of work. It is used exactly the same way as in CGA.
CH: Intellectual property on SEND also probably applies here.
BC: (co-chair hat is off): I sent a mail to the mailing list that I got no answer to. The remote host needs a copy of the CGA parameter structure, right? I don't quite understand how man in the middle can't capture the parameters and add one more bogus address to the set by running the loop in the algorithm one more time.
MB: I didn't get the mail. IID will be different-- will be hijacking. If you add something to the prefix set, the IID will change for all of them.
BC: The remote host needs complete copy of CGA param data structure. This includes, in HBA case, pseudo random modifier which replaces PK. I don't understand why man in the middle can't run the algorithm one more step, add one address at end.
MB/BC: Discuss implication of MITM able to add extra address, now believed by receiver.
BC: We can take this discussion off-line.
EN: Following on that, there is a class of MITM, e.g. TCP relay, NAT box. What is being described, those attacks exist in v6, and will exist in this one. Property of these attacks is that it is only effective while MITM is on the link. As soon as attacker is off link, he cannot intercept pkts.
BC: I am not sure, not thinking about time-shifter attack.
EN: To Christian's point about CGA vs CGA/HBA hybrids. There are a few things we haven't written up about hybrid HBA/CGA addresses. If you use HBA property, can just use the hash. If you fall back on CGA properties, verification becomes more expensive. This could be a useful property.
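
The HBA generation and verification steps from Marcelo's slides (iid = Hash(P1|P2|rand), one address per prefix, peer checks a new locator against the communicated prefix set) can be sketched in Python as follows. This is only an illustration of the idea, not the algorithm in draft-bagnulo-multi6dt-hba-00: the use of SHA-256, the plain 64-bit truncation, the function names and the 2001:db8:: documentation prefixes are all assumptions, and the CGA parameter structure and the Sec parameter are left out.

```python
import hashlib
import ipaddress
import os

HALF = 8  # bytes: assumed /64 prefix followed by a 64-bit interface identifier

def hba_iid(prefixes, rand):
    """iid = Hash(P1|P2|...|rand), truncated to 64 bits (SHA-256 is an assumption)."""
    return hashlib.sha256(b"".join(prefixes) + rand).digest()[:HALF]

def hba_addresses(prefixes, rand):
    """One address per prefix, all sharing the same hashed iid."""
    iid = hba_iid(prefixes, rand)
    return [prefix + iid for prefix in prefixes]

def verify_locator(addr, prefixes, rand):
    """Peer-side check: the prefix is in the communicated set and the iid matches."""
    prefix, iid = addr[:HALF], addr[HALF:]
    return prefix in prefixes and iid == hba_iid(prefixes, rand)

# Hypothetical site with two provider prefixes, each given as 8 raw bytes.
P1 = bytes.fromhex("20010db800010000")
P2 = bytes.fromhex("20010db800020000")
P3 = bytes.fromhex("20010db800030000")  # a prefix that is NOT part of the set
rand = os.urandom(16)

addrs = hba_addresses([P1, P2], rand)
for a in addrs:
    print(ipaddress.IPv6Address(a))

# Rehoming to the P2 address verifies without any public-key operation.
ok = verify_locator(addrs[1], [P1, P2], rand)
# Extending the prefix set changes the iid for every address in the set.
iid_changed = hba_iid([P1, P2, P3], rand) != hba_iid([P1, P2], rand)
# An address built on an unlisted prefix fails verification.
bogus_ok = verify_locator(P3 + hba_iid([P1, P2], rand), [P1, P2], rand)
```

The last lines mirror MB's answer to BC in this simplified model: adding a prefix to the set changes the iid for all addresses, so an address on an unlisted prefix does not verify against the original set.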
BC: We need to wrap up soon on this one.
??: Time shifting attack-- can change interface ID-- but then moves to where his other prefixes are, and has shifted the traffic.
EN: I am not sure. If we have created TCP connection in peer for new interface id, prefix <x> different state. Fundamental MITM attack is same as a TCP proxy.
CH: If there is no link to actual identity, then you risk a third party capture of one of the locators of the host, and then I can move a TCP connection-- that is the attack if you do not have a secret.
(EN or MB): If not using SECURE neighbour discovery, can be caught. This doesn't change under Multi6.
MB: We don't need CGA for secure neighbour discovery.
??: What do you publish in DNS as IP addresses?
MB: The set. Use any one.
BC: How many people actually read this draft? It is complicated. Is this too complicated to implement except for ambitious implementors?
CH: Would be more comfortable if you could run a pure CGA solution. Tradeoff between performance and complexity of software. Doing CGA should be enough if that is all I have implemented.
Hannes Tschofenig (HT): Concern is IPR issue. It is great that these addresses have self-certifying properties, but this is a fundamental piece of the Internet, and it strikes me as strange that it has intellectual property attached to it.
Matt Ford (MT): Complexity question. HBA is actually simpler to implement than CGA.
CH: If already done CGA, it is more work to also do HBA.
BC: Don't confuse technical discussion with IPR discussion right now.
EL: My concern with Christian's proposal that only implementing CGA should be sufficient is that you need at least one way to interoperate.

6. Marcelo presents next draft: Functional decomposition of M6 (draft-bagnulo-multi6dt-functional-dec-00.txt)
===================================================
MB: On initial contact we need to do capabilities detection stuff.
MB: Failure during startup. How to deal with legacy hosts.
Either retry using different address or try same ULID, change locator, needs v6 support so implies capability detection.
MB: M6 capabilities detection. There are several ways to do it. The non-scalable one is manual config or DNS config. Let's say target is M6 capable or make part of protocol.
MB: This is preferred, simple, more flexible.
MB: M6 host-pair context establishment. What's in the context? ULIDs, at least one locator, can add more, means sending more info, but gives fault-tolerance.
MB: It is probably a good idea to leave open, leave to host, balance fault tolerance/cost issues. We may need context tag for multiplexing. Security info.
MB: Continuing M6 host establishment. State can imply memory exhaustion. What we need is:
 - ULIDs
 - at least one locator per host
 - additional locators?
 - context tag? -> demux
 - security info
 - cookie/key/hash chain anchor
 - additional info to prevent future time-shifting attacks
MB: Goes through slide of time-sequence exchange diagram. Then goes through the sequences of adding additional locators, security exchanges, cookies.
MB: For locator set management we have to manage (add/remove) locators. Need to communicate e2e. We may need to remove for local reasons, e.g. router deprecates, want to let other end know. This could be state differencing with ack, or atomic send of set with ack. There are issues with synchronizing incremental. Atomic doesn't require that, but imposes other additional overhead: re-sending stable state.
MB: Locator set mgt security. We need to provide strong security to avoid time-shifting attacks.
EN: It might seem odd that it doesn't talk about HBA at all, but it's intentional. This presentation doesn't assume any particular security scheme. HBA might/would satisfy requirements, but not assumed.
BC: But it does say 'we need solution in this space'.
MB: Removing locators, simpler.
MB: Rehoming. Sequenced move to a new locator. Failure detection, reachability testing. This is not trivial with unidirectional paths.
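
The two locator set update styles MB contrasts, state differencing with ack versus atomic send of the whole set with ack, can be compared in a small Python sketch. The message layout (a dict with seq, add, remove or locators fields) and the function names are hypothetical; nothing here comes from the draft, and the security needed against time-shifting attacks is not modelled.

```python
def apply_diff(locators, last_seq, msg):
    """Incremental update: usable only if it directly follows the last applied one."""
    if msg["seq"] != last_seq + 1:
        return locators, last_seq, False  # gap detected: no ack, peer must resync
    updated = (locators | set(msg["add"])) - set(msg["remove"])
    return updated, msg["seq"], True

def apply_atomic(locators, last_seq, msg):
    """Atomic update: the full set replaces local state; only stale messages are dropped."""
    if msg["seq"] <= last_seq:
        return locators, last_seq, False
    return set(msg["locators"]), msg["seq"], True

state = {"2001:db8:1::1"}

# Diff style: update 1 is lost in transit, so update 2 cannot be applied.
diff2 = {"seq": 2, "add": ["2001:db8:3::1"], "remove": []}
diff_state, diff_seq, diff_acked = apply_diff(state, 0, diff2)

# Atomic style: the same loss is harmless, at the cost of resending the whole set.
atomic2 = {"seq": 2,
           "locators": ["2001:db8:1::1", "2001:db8:2::1", "2001:db8:3::1"]}
atomic_state, atomic_seq, atomic_acked = apply_atomic(state, 0, atomic2)
```

In this sketch the differencing receiver has to detect the gap and wait for a resynchronization, while the atomic receiver converges from any later message, which is the trade-off between synchronization issues and the overhead of re-sending stable state.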
Jari presents on complex cases.
MB: Removal of M6 session state. May be unilateral, or coordinated. Could be DoS attack to do unilateral. Coordinated means ACK flows (e.g. NOID).
EL: In the unilateral case, how do you envision handling long-standing state when one side reboots? I am concerned about error msg, DDoS attack, but don't get answer, is the idea to timeout and re-initialize state 0?
MB: One side reboots, other end sends new packets. Error msg should be answered?
EL: But there are DDoS concerns there, right?
MB: Yes. I am not sure how to deal with it. But hoping we could do better.
EN: Not the main concern here. If you establish the state, presumably you can define things such that MITM that shows up late is not as effective. Such an error message can aid the MITM attacker that shows up late.
EL: Another question going back. Is it possible to take advantage of pre-established stuff?
MB: First we have initial context, then we have deferred.
BC: (not co-chair) You indicate that it is not the complete solution, right? So my comment is that we've seen other bits of functional proposal, and Eliot reminded me of the database where stuff could be stored (CELP). Does design team want to go there-- a complete functional decomposition with architecture and interfaces and diagrams?
EN: Answer question with question. What is complete? Can store quality metrics. Can choose good ahead of just working, but complete is 'scary'.
BC: Complete not to the level of UML, but so that Francis (Dupont) could implement all of the boxes.
EN: How would hardware? work together with this software? Probably not looked at yet.
BC: Very happy with documents, but all leave open questions right now (possible exception is HBA). How quickly to get rid of these questions? Another three years?
CH: Some of the points similar to mobility. I wonder to what extent there are possibilities to use common code?
GH: This is not a new question. Not a new answer coming: asked in March, again in August.
The response was drafted by Marcelo in arch draft. The response is that fundamental issues in MobV6 depend on 'tethering' with timers, update timers. A bunch of timers would be a showstopper in Multi6, never sure where tether point is. Not a lot translates, but if you disagree, doc is not published, if feel analysis is incorrect 'send words' to the list.
BC: We have 10 minutes to continue then have to move on [chair]
EN: There aren't many things undefined that would prevent you from getting up and running. However, people will likely be concerned with detailed packet design-- up to working group to resolve that. We could build something today, we don't have big architectural questions-- but we have to agree on details.
MB: Some open issues with state management, though. We've identified that.

7. Jari Arkko presents next draft: Failure Det./ Loc. Selection (draft-arkko-multi6dt-failure-detection-00.txt)
============================================================================
Jari Arkko (JA): Background. Some problems are quite similar to MULTI6 and HIP, and what MOBIKE needs. Work ongoing on those two. SCTP is a bit different than what is being worked on here. More static, with apps telling transport what to use. Mobility and host address configuration work as well as multihoming basics.
JA: What is multihoming? Typically multiple prefixes for some of the participants. May be one or both ends. Nodes know their own addresses, but don't know all the locators for peer. Need to learn those through multi6 protocols. Need to learn peer's addresses before a problem occurs. Both peers can lose their addresses, and you can't switch to the other since you haven't told the peer. Where do node's own addresses come from? From other parts of the stack - e.g., DHCP, Neighbor Discovery. Processes are not trivial - DAD, etc.
JA: Where do a node's addresses go? Addresses taken away by same mechanisms.
JA: Addresses - what is their status?
JA: Security: need to believe what configuration mechanisms tell us. If we have an address, and a spoofer tries to convince a node the address is no longer valid - while this is possible, can't do anything in MULTI6 as other parts of the stack believe this information. If lower layers believe link is down or address not usable, MULTI6 can't do anything about that. Even if assign address to interface, link may be down so you may not be able to use it.
[A few address-related definitions]
JA: Available address: address assigned to interface, valid, but may not be able to use it. Locally-operational address: address is available, default router is reachable.
JA: Interface to related modules. Configuration modules (e.g., DHCP) that handle address assignment tasks. Other work in DNA and DHCP to improve characteristics to changing connectivity at the "lower layer"? Answer is no. Not even if you can talk to someone else, it doesn't guarantee you can talk to another node.
JA: Definition of address pair. Address pair: pair of addresses (src,dst) used in communications between two peers. Operational address pair: both addresses locally operational, traffic flows when the pair is used. This is what you need in multihoming.
JA: Symmetric vs. asymmetric address pair reachability. Reachability may not always be two-way. Can use one pair in one direction, but can't be used in the other. Can construct multi6 to handle this, but do we want to add complexity for this?
BC: Do we want to add this to the goals to address this? Not in goals, but may be useful.
CH: Also noted in analysis of egress filtering that it is something that can happen. If we can support without making too complex, then we should do it.
BC: In other words, reachability is not commutative.
JA: Selecting an address pair. How do we know if there is a problem? If lower layers tell us the address went away or can't be used, we know we have a problem.
Can also have explicit tests - periodic ping between pairs, for instance. If this doesn't get through, may have a problem (might be transient). Lack of TCP progress also can be used as an indication of a problem. ICMP may or may not be a problem. Picking another pair. SCTP has some functionality, but no other protocols to pick pairs.
HT: ICE and STUN use some of these types of mechanisms.
JA: Picking another address pair. Need exponential backoff on selecting new pair - e.g., to handle site link going down. Downside is that backoff can cause recovery to take a long time if there are enough addresses. As a result, try address pair that is most likely to work. Nodes may have preferences on addresses they want to use. You can signal partner about your preferences for your addresses within multi6. For rest, can have heuristics. Leave these to implementations. Testing for bidirectional reachability is easy. Unidirectional connectivity is harder.
JA: Finding pairs - unidirectional case. Chart showing case:
 - Peer A sends a poll to Peer B using a particular pair.
 - B sees A has a problem, and starts the same process.
 - B sends a packet to A using the same pair, but that message is lost, so you can't answer.
 - A continues a pair, with a different src address, with the same dst. This is lost.
 - B sends a different packet with a different src to A, and it reaches A.
 - A now knows its packet reached.
 - A then sends a poll to B saying its poll messages are getting through.
JL: Whatever mechanism you decide, please make sure it works through middleboxes. Must assume something that generally works-- if you use TCP for probing, for example, your UDP might not get through.
EN: Depends-- does it make sense to make test packet get through if failover traffic will not.
BC: Does discovery mechanism meet same constraints as data traffic?
JL: Polling needs to work no better than regular traffic, but can work worse.
CH: Avoid e.g. the IPsec problem where IKE might work and IPsec doesn't.
JA: Suggested Design Principles. Multi6 should not reinvent DHCP, and should believe what ND tells us. Own addresses learned locally, peer addresses are communicated. Search procedures need to apply exponential backoff. Multi6 only works as a fail-over. No load balancing (would cause problems to TCP), and no selection of "best" path (harder than a "working" path). No mandated search order, and no application input on "primary" and "backup".
JA: Some open design principles. Do we need to support unidirectional? OK not to support link-locals?
JA: Some architectural issues. Need to tell ULPs that we changed prefixes or addresses. Division of work in multi6 and transport/app layers. Some reachability at multi6 and at transport (e.g., TCP). Congestion information at transport, and application requirements on what is an acceptable connection. For instance, application needs higher throughput even if there is connectivity, and wants multi6 to switch to new connection. End of presentation.
BC: You are hinting at some components not in Marcelo's presentation.
EL: I have two questions. First, I don't understand issues with different levels of scoping, where things break down. Until he does, don't know why it is architectural.
EL: Second is the application requirements - want app to make decision on whether current connection is usable, and if not, get more?
JA: More of a question.
EL: Yes, we do.
CH: I have two questions. On asymmetric, the diagram shows the asymmetric case and walks through it, so it may not be that hard to handle.
CH: Second, on negotiation issue. If we think about firewalls, it would be good if verifying reachability were a side effect of sending TCP packets. Firewalls will be open for application traffic, so if signaling is piggybacked on this, then it will get through.
JA: What if packets only sent in one direction?
CH: Yes, but would still be nice.
CH: I understand there are border conditions.
But in common case, would be nice.
CH: Erik's idea to solve multiplexing is to carry in the flow an index to use for demuxing/rewriting. Might be beneficial to have a symmetric index so you can indicate "which of my sources to reply to".
CH: Also the point on applications and addresses: common scenario is some apps allowed on some interfaces but not on others, so we need to be careful there.
JA: We have, but just haven't included it in this presentation. Can say that addresses only used for some apps.
EN: Contrast between app requirements, and just finding a working path. Way for app to say "try to find something better if you can"?
EN: Was wondering about search order: any work in SCTP on this? Are there heuristics in SCTP search order (differences in high order bits?). Also, if you don't know whether source address or dest. is the problem, do you change both at same time?
JA: SCTP doesn't say what "different" means. Probably is some previous work there, but I don't know.
EN: Reasonable strategies that can be used? If so, we should write them down.
EL: Do we need a draft on application requirements? Don't think we need to go that far. Three issues, though. Don't talk to app. App says "not good enough, get me something else". Or mechanism by which app can indicate preference based on performance in app. That one needs work.
KL: May need section in all the drafts on this topic.
JL: More useful also to talk about what are the forms of the API, e.g., "OK, try another path...", rather than trying to figure out too strictly how to figure out the division.
BC: John just said 1/2 of what I want to say. Don't focus on algorithm, but on what is the interface between the module that determines that and the module that requests it.
CH: Today, socket API allows application to pin itself to interface, by binding it to a specific address. Means application can bypass certain multihoming capabilities. Need to preserve these semantics.
JA: Only works in one direction.
CH: Peer can do what it wants.
EN: Interesting thing in application interaction, but we need to understand what it means to bind to an address.
CH: Bound to an interface, and all traffic goes through the interface.
EN: Maybe we should make this more explicit: more than just address (such as prefix?)

END OF DAY ONE

DAY TWO

Multi6 notes, IETF 61, Nov. 10 2004 - Session 2
-----------------------------------------------
BC: Solicits a name for the minutes... Roy Brabson takes the leaden chalice!
BC/KL: Next steps discussion: chairs have done some work and will present later. Now design team: Erik Nordmark will do referrals draft. draft-nordmark-multi6dt-refer-00.txt
EN: Was also presented in OpenApps meeting to wake them up a bit. What's needed/tbd for them, in multihoming.
EN: [slide: solution approaches]
EN: There are several possible solutions. One aspect the design team has explored uses multiple locators without a separate identifier space. This can be contrasted with HIP, which uses a new identifier space that maps to a locator.
EN: The identifiers are a 128-bit quantity, called a ULID. Underneath, there are multiple 128-bit locators. A new identifier name space will either take a long time to deploy, or implies a hierarchical allocation in DNS.
EN: [slide implications of approaches]
EN: Imply at the API, have to see 128-bit IDs, calling it the ULID. Underneath are multiple 128-bit locators. Can be one of the locators, or can be HIP approach, separate thing. Might, doesn't matter if reachable or not.
EN: [slide, picture of stack/shim issues]
EN: To make it explicit where it might be.
EN: [slide likely outcome]
EN: New ID namespace. Delays introducing it, mapping locator/id. HIP people talking about distributed hash tables. Interesting long term but take a while to get.
EN: Or reuse what we have. AAAA or A6 allocations, in hierarchical manner, ensure only one owner. Manage ID space. Costs of management. Fees but don't want to constrain, apps have to deal with multiple locators when they do stuff.
EN: [slide the good news]
EN: Applications use 'short term handles' from DNS. No changes required. Don't care.
EN: If application uses IP addresses in other ways, have to deal with setup; doc tries to name these things:
- 'long term handles'
- 'callbacks'
- referral
- identity comparison
- don't know to what extent people do this (do people do this? higher level ids sure, cookies, certs..)
EN: [slide possible application approaches]
EN: Use FQDNs as much as possible, but not always possible. Use single IP address. Could use set of locators, or set plus ULID, if different.
EN: [slide recommendations]
EN: If possible, switch to FQDN.
EN: I could say 'use URI' but can contain IP literals.
EN: Use set of locators, ULID to get additional benefits (robustness).
EN: Need new api getlocalallocators() getpeeralocators() setpeerlocators().
EN: [slide open issues]
EN: Where do we do this work? Here or apps people or ?
EN: Some comment. Makes clear not proposing app interpret set of locators. Just a bunch of bits to store and use in referrals.
EN: Questions about applications, should they already have fixed this? Could be both 4 and 6? Done ad-hoc today.
EN: Some clarifications. People commenting, need to specify API to permit FQDN in the scopes where it would make sense. Some people only use the first thing they get from DNS. UDP worries.
EN: Questions?
BC: Not in chair mode. People do comparisons on addresses in the long term. E.g., cookies used by web. Christian H. said in apps area that real serious app suites already handle using multiple addresses, so is it worth replicating at a lower level?
Jim Bound (JB): Reading specs, watching, trying to be silent. Good presentation. Do not tell me again we can't do SCTP because you don't want to change the end node. Changing applications reopens handling multi6 via SCTP.
EN: I agree want to be able to do it, but without being SCTP specific. Have some technical work, non-specific, shim approach.
L2 shim and want to run SCTP need to work well.
JB: I want HBA. Waiting until we decide. Apps have to change.
EN: We don't have to change everything. Only those doing referrals, not already using URIs or something else. This seems like mostly p2p which run into this. Run between clients with referrals. Servers have FQDN, can do things. Don't think we need to change that much.
JB: The greatest use is 2 providers on wireless. Don't have a way to do this right now. Want to. If we start whacking APIs, then that means we can change transport layer.
BC: Need to be careful forcing upper layers to change to get very basic multihoming. If you do, then you might as well make a large change. But, if you can avoid this for simple case, then have a deployable way forward. If this also cuts off alternative solutions, then big mistake.
Andrew McGreggan(sp)(MG): Serious apps go to some effort to get things right. Need to provide methods for unserious, old, broken apps to do things right as well.
Tim Shepard (TS): Any reaction from IPsec or SAAG on this?
EN: More of a discussion on running IPsec on top of shim approach, but has not happened yet.
Bill Sommerfeld (BS): Entire WG on multiaddress/multihoming for MobIKE. Useful to wave in that direction.
TS: Helps people with broken networks. Give in plenary.
BC: Line is empty. We want to spend 15-20m on summary discussion on design team output.
BC: We now need to clear out open issues with design team. Everyone understood HBAs? I wonder if going to HBA solution would be viewed as complex by implementers. Jim? Comment on complexity? Is it something which implementers are going to find too complex?
JB: I don't think so. My Q is, will it scale? By the time I build the key, for 10,000,000 google sessions... We build servers with that number per hour. But it would be easier with SCTP!
MB: We have to make hash verification each time you re-home. Assuming that not likely re-home all communications simultaneously.
If it happens, just have to hash. The goal is to provide better performance than PK for these cases.
EN: Having thought about it, and after some discussion with Christian H on the way we cope with performance, don't think it's verification, think it's state creation, dealing with lots of connections, that's an issue. Say in drafts, have local policy to say when I am going to try and establish the state. What we haven't talked about: client sitting with one connection 'think its great idea to have state with google' and google says 'I don't think so'. Not in spec. Other end needs to deny and push back. I want to distinguish between that, and server not implementing v6.
BC: Flow label or extension header? Wondering. I agree extension header is clean/simple design, except takes 8 bytes out of some packets but not all of them.
IvB: Extension header is dangerous regarding firewalls. And it is additional n*2 -n locators. If there is an additional header between ip and tcp a firewall may decide this is dangerous and block the packet.
PS: There are at least two issues here. Use 8-byte ext header, something current f/w impl don't support. So, want to use something like TLV. Other point is that some look for TCP/UDP header, might not want the ext headers. Bit of a fine line. With regard to flow label or ext-header, don't have big opinion, saying if going to lose some bytes, doesn't matter if it's 8 or 16 or 20 in my opinion.
BC: We are not meant to do protocol design here. This is no simple decision.
TS: I am wondering about the hash based address approach. It seems to allow you these three choices (if I understand it correctly):

    user work factor    attacker work factor
    ----------------    --------------------
    1                   2^59
    2^16                2^75
    2^32                2^91

and it's not clear that any of those are useful. If security is not needed, then perhaps we don't even need to use the HBA scheme. If security is needed, then perhaps this scheme isn't much better than using some public key cryptography.
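The three rows TS lists appear to match the hash-extension trade-off of CGA-style addresses (the Sec parameter of RFC 3972, on which HBA builds): each increment of Sec multiplies both the owner's generation cost and the attacker's cost by 2^16, on top of the 59 hashed interface-identifier bits. A small sketch under that assumption; the function name is hypothetical.

```python
def work_factors(sec: int) -> tuple[int, int]:
    """Return (user, attacker) work factors for a given Sec value.

    Assumes the CGA cost model: generation costs about 2^(16*Sec) hash
    operations and a brute-force attack about 2^(59 + 16*Sec).
    """
    if not 0 <= sec <= 7:
        raise ValueError("Sec is a 3-bit value, 0 through 7")
    user = 2 ** (16 * sec)
    attacker = 2 ** (59 + 16 * sec)
    return user, attacker


for sec in range(3):
    user, attacker = work_factors(sec)
    # bit_length() - 1 is the exponent for an exact power of two
    print(f"Sec={sec}: user 2^{user.bit_length() - 1}, "
          f"attacker 2^{attacker.bit_length() - 1}")
```

The gap between the two columns stays fixed at 2^59 whatever Sec is chosen, which is the point behind the question of whether any of the three choices is useful.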
The high level question is, what are we trying to secure, and is it secure enough?
BC: Threat mitigation. Everything is relative.
PS: I am not sure if relevant or important in previous comment. I don't think we should be asking if somebody wants to route behind 64 bits, or behind 64 and also wants to use MH solution.
MB: Security of HBAs. The limit of security is 64 bits. It doesn't have to do with PK; if you need more security, you need more bits. This brings us back to apps and referrals: if we want to use routable ULIDs all we have is 64. If we want more than this, have to use ULIDs which are non-routable. This doesn't relate to HBAs; CGAs have same problem.
BC: Key management in internet as whole is hard problem.
EN: What threats? It's redirection attacks. Security needs e2e crypto anyway, SSL or whatever. If you do that, e2e IPsec, combine with what's going on underneath and get better protections? Redirect to /dev/null. Just DoS.
BC: I am trying to persuade Marcello there is a MiTM attack. He thinks I'm wrong. I hope I'm wrong.
TS: Key mgt. Even Christian mentioned this, can use PK in ways which don't need mgt: purpose built keys, draft written a while ago.
BC: But without the advantages of HBA.
??: Purpose built keys, guarantees person start talking to will be person finish talking to. Other examples check neighbour discovery.
IvB: Could someone mention that (IIRC) the design team meant HBA to be a good option for situations where additional strong crypto isn't appropriate. So it's not meant to be the pinnacle of strong security.

Ingress Filtering Compatibility for IPv6 Multihomed Sites (Marcello)
====================================================================
Personal submission with Christian H, Richard Draves and Marcello. draft-huitema-multi6-ingress-filtering-00.txt
MB: Scenario description of 'legacy host' in the m-h site.
Approaches: relax ingress filtering (remove it) OR some form of source routing, which posits some boundary domain that allows an exit selection based on the source address in the packet.
MB: A degenerate case is a single site exit router (in which case why multi-home?) or the DMZ as a boundary zone.
IvB: Wouldn't it be possible to make sure all the legacy hosts only talk to one router? I think keeping legacy / new hosts apart would be relatively easy with VLANs.
Francis Dupont (FD): Tried this before. Not easy to set up. Based on an access list, so it isn't dynamic. Need to find something better.
EN: Something we need. Concerned that full SAD is too much. Need to keep this as small as possible. If you are at home, and have two routers you don't control (e.g., one from ADSL, one from cable), need a solution. This means you have to push it into the host.
MB: If you want to support legacy hosts, then changing the host isn't an alternative.
BC: Giving a host a second prefix introduces failure cases.
MB: You need to configure internal routing to get the packet to the correct exit router.
KL: Problem you are trying to solve is simpler than the SAD terminology. Most hosts only have a single default router that they need to reach. It is the default router that gets the traffic to the correct exit router.
MB: If you can change the host, the host can tunnel packets to the correct router.
KL: Not much you can change at the host. The host's next hop is a default router. Short of giving it a full routing table, there isn't much more you can give.
EN: A host would need to know internal vs. external, which it doesn't know today.
EN: Hosts today maintain multiple default routers. The routing in the host doesn't tend to check the source IP address in making the choice of the default router.
BC: Need to list this as an issue in the next revision.
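A minimal sketch of the exit selection discussed above, where the site picks the exit router from the packet's source address so that each ISP's ingress filter sees a source from its own prefix. The prefixes (taken from the IPv6 documentation range), the router names and the function name are all illustrative assumptions, not taken from the draft.

```python
import ipaddress

# Hypothetical site configuration: one delegated prefix per ISP,
# each mapped to the exit router facing that ISP.
EXIT_BY_PREFIX = {
    ipaddress.ip_network("2001:db8:a::/48"): "exit-router-isp-a",
    ipaddress.ip_network("2001:db8:b::/48"): "exit-router-isp-b",
}


def select_exit(src, table=EXIT_BY_PREFIX):
    """Return the exit router whose ISP prefix covers the source address.

    Returns None when no prefix matches; such a packet would be dropped
    by either ISP's ingress filter anyway.
    """
    addr = ipaddress.ip_address(src)
    matches = [net for net in table if addr in net]
    if not matches:
        return None
    # Prefer the most specific prefix if several cover the address.
    return table[max(matches, key=lambda net: net.prefixlen)]
```

In the home scenario EN raises, with two routers the user does not control, nothing in the site can run this lookup on the host's behalf, which is why the choice gets pushed into the host.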
Address Selection in Multihomed Environments (MB)
=================================================
draft-huitema-multi6-addr-selection-00.txt
MB: This is a short presentation.
MB: Address selection in MH env is the same basic problem. Assuming some sort of ingress filter mechanism, when there is an outage in one of the ISPs, packets fail depending on the src addr selected by the host in the MH site. Host has to select src addr containing right address for filter but, if host wants to communicate after outage, it will use one and doesn't consider this. We need to change the mechanism for selecting src addr, or complement it, in order to take this into account.
MB: This draft tries to analyse mechanisms. The goal is not to make changes to external hosts.
MB: [slide possible approaches]
MB: Two types of mechanism. Proactive: let them know which not to use, e.g. deprecate addresses via RADV. Other failure modes.
MB: However need to learn, e.g. routing information from BGP could be used to complement, need more fine grained. Based on proposal of a while ago. Other mechanisms are reactive. Try one, if it fails, pick other until one works.
MB: Question here is who will do the trials? The application, or the IP layer, or try all at once. Open issues.
EN: How to get src address back into packet. Split of conceptual address selection, because of API, done in two places, getaddr() and down in TCP.
BC: This is a good moment to switch speakers.. Arifumi.

Source Address Selection Policy Distribution for Multihoming (Arifumi Matsumoto)
=====================================================================
draft-arifumi-multi6-sas-policy-dist-00.txt
Arifumi Matsumoto (AM): Three drafts closely linked on source address selection policy.
AM: These items may not be in scope of the WG, but we want to have comments about the approach. The longest-match algorithm won't solve two-prefix two provider problem.
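The longest-match limitation can be illustrated with a simplified sketch of the longest-matching-prefix rule from RFC 3484 source address selection (all other rules and the policy table are ignored here). The addresses are made up from the documentation prefix, and the helper names are hypothetical. With one source address from a globally connected provider and one from a closed network, the rule can pick the closed-network source for an Internet destination simply because it shares more leading bits.

```python
import ipaddress


def common_prefix_len(a, b):
    """Number of leading bits two IPv6 addresses have in common."""
    diff = int(ipaddress.IPv6Address(a)) ^ int(ipaddress.IPv6Address(b))
    return 128 - diff.bit_length()


def pick_source(candidates, dst):
    """Pick the candidate source sharing the longest prefix with dst."""
    return max(candidates, key=lambda src: common_prefix_len(src, dst))


GLOBAL_SRC = "2001:db8:8000::1"   # assumed: from the globally connected ISP
CLOSED_SRC = "2001:db8:4000::1"   # assumed: from the closed-network provider
INTERNET_DST = "2001:db8:4fff::1" # assumed: a host reachable only via the global ISP

chosen = pick_source([GLOBAL_SRC, CLOSED_SRC], INTERNET_DST)
print(chosen)  # the closed-network source wins on prefix length alone
```

Bit similarity says nothing about which provider can actually carry the return traffic, which is the gap a distributed selection policy is meant to fill.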
AM: If one provider is not fully connected (e.g. closed network), packets with the wrong src address cannot return to the internet.
AM: Broadband users in JP: 1/2 fall into mix of global and closed network providers (i.e. have both potentially available to them), e.g. closed file share, TV streaming services using closed nets, so this is a real problem.
AM: Our approach: distribute src address selection policy to endpoints. RFC3484 defines address selection process. The policy table specifies matches, and delivery is done through route advertisement. We can use client router to control route visibility.
AM: If end node can handle more than one default route, better to give them all.
AM: Why use DHCP-PD instead of RIPng, OSPF? Information may be the same, but usage is different.
AM: Using OSPF or RIPng would require changes to routing; SAS policy should be more stable.
AM: RA or DHCP? Each has good points.
AM: Summary: propose method to distribute SAS policy to end nodes. Method provides failure avoidance, but not detection or recovery.
AM: This method can be used soon. This is not an entire MH solution but it is a necessary part of many networks.
PS: Could you get back to 'walled garden' slide. Why do you need policy? Wouldn't it go to longest prefix?
Tony Hain (TH): I can conceive of scenarios which can be constructed to cause this. Scaling: in terms of protocols, scaling issues. Have Service Provider push to end devices, don't think it scales. RA from edge router possible; from service provider to every edge device I am not sure.
AM: I don't see issue.
TH: Break the problem in half.
PS: Not sure we are able in all cases to talk to all devices using these mechanisms. Can't know using DHCP.
BC: Since affects 3484, then should follow up on IPv6 WG, not here. DHCP solution would have to be done elsewhere as well. Ask on ML if we should follow up here, or in IPv6 WG.

Next Steps for the WG (Kurtis)
==============================
BC: One remaining agenda item until end of tape..
next steps. See slides following agenda slides...
KL: The chairs have been talking this over with ADs since yesterday. We now have work done by the design team, and there have not been many questions. Do we agree the way forward is the way proposed by DT? (noting missing people)
BC: Jim Bound has left. I need to channel Jim. Earlier on, raised issue of SCTP approach. I think point is, not rejected SCTP, but not in a position to say the whole internet should switch to SCTP. Not a practical proposition. But we do have to remember what Jim said, make sure we talk to SCTP people. We need to make sure what we do, way we do it, affects stack, means SCTP can take advantage of shim, rather than fighting.
BC: We want a discussion before show of hands.
IvB: So what's the question on the table?
PS: I am not sure what's the way forward. I am not sure what's being asked of me here. I think there are missing pieces, though we are going the right direction. Could be useful. Asking for more here, or what's the assumption?
BC: We can show all the slides, and cycle back to the beginning to ask the questions.
JL: Read them all, I want to see work continue. I can guess where going, not going to talk to that. I think design team did good work, and this is a viable way for problem space.
EN: The way forward may be nebulous. Is the question that the WG recommends, IESG sees work in this area or something?
BC: That is why I think it is better to show slides.
KL: (not in chair mode) The question is also, that we have the solution categories, and we said we would evaluate them as equals until some point, then limit solutions on the table. This is part of that discussion.
KL: I am now assuming yes to last question.
KL: We have come a long way. We are at the end of milestones, and end of charter. All docs are either with IESG or close.
KL: We need to produce a solution architecture document.
KL: This is not explicit in charter, but can be interpreted as that.
KL: The proposal from chairs is to complete it in multi6, then take on current documents as WG docs, but not publish as RFCs. Publish the solutions document as informational RFC, or kick over wall to some other WG. Then close multi6 down.
KL: Re-charter before next IETF, before 2005, new WG in internet area. Can either/or complete document, develop protocols, APIs, for solution. We'd need an Internet Area Director to agree...
Margaret Wasserman (MW): 'already asked me the question yesterday and I said yes.'
KL: Question to WG: Do we agree to charter new WG in internet area? Agree to close multi6 when work done? Agree to need for solution or arch document?
GH: Going to say yes yes and yes, but we are better off starting solution arch document now, not a lot of work. It needs to document relationship between thinking of design team and framework.
BC: So do it here, then when have other WG throw to them.
GH: Start now.
KL: Yes, that was the intention.
GH: I will volunteer to work with others.
BC: Procedural question is if we leave the design team, or form an editorial team.
PS: I am curious what was justification to take design team as ed team. Is it a recognition issue?
BC: Wouldn't ask on 00 draft, wait for 01 drafts.
MW: I think I was listed as member, but I was more a fly on wall, read, little care if published as separate or one. I want to see published in some form. More than 6 months to standardize, want doc on justifications, goals, arch.
IvB: Do you guys want us (the design team) to make the current docs publishable, or to go on developing them technically?
MW: Yes.
BC: Question, so is the direction the way we see things going forward? Meta Q is: is this the right question?
GH: It is not in our interests to publish incomplete docs, solutions architecture, nail to wall.
BC: These are work in progress drafts. Some parts justification. Solution needs finishing, justification shouldn't get lost. Rationale documents.
PS: On Geoff's comment, solutions arch doc needed soon.
IvB: Maybe it would make sense to have a non-dt editor create a rough draft out of the parts of the current documents that make sense to publish now?
KL: Now asking other questions.
KL: Is this the right advice to give IESG? Charter, internet area, close multi6, arch document.
PS: I am not 100% sure internet area is right place.
MW: WG doesn't get to decide on areas but it is important to call the question.
IvB: Brian: do you mean now (soon) or in 6 or 12 months or so? Because I feel we should have some time to prepare splitting up between a new wg, stuff that should stay here (if any) and stuff that needs to go to other existing wgs.
BC: I want charter by March IETF, before next IETF. We may discover when we do analysis, little bits don't fit in internet, need ops WG.
PS: I am a bit uncomfortable about saying we should close multi6. I don't think we have missing pieces.
BC: It may be new work we don't know about to be done. AD Question.
David Kessens (DK): I'd like to get the opportunity to close my first WG [laughter]. I'd like to thank everyone. You have done a great job on a difficult topic to work on. Come to the point quick. Longer than thought, ready for recharter, can continue in internet area.
BC: WG chairs glad too.
BC: Do people agree we need solns arch document?
BC: Do we agree need to charter new work, probably new WG, probably internet area [not 100%] [but many]
BC: Do we need to charter new work? [yes]
BC: Question about area, discussed over weeks, personally do not want 9 month discussion.
BC: Middle Q rather silly one. WG finished, stops. We are really asking we're done? [tentative hum]
BC: and haven't finished? [silence]
IvB: Are there no angry draft authors anymore?
BC: In good shape. Now need to encourage design team to do 01 versions, and 00 solutions. Geoff volunteered.
BC: I would personally leave design team in place. True we had bunch of proto solutions drafts. Close to infinity.
Clearly design team has selected major options, excludes things on the table.
IvB: Re interim meeting where we took a whole day to deal with unhappy draft authors (my take).
BC: Clear chosen direction, recommend IESG to charter work in that direction. People who feel strongly wrong direction, tell IESG, or write draft.
BC: Last words from Area Director?
BC: Still some tidying up in March, may have to meet. Will try to make one hour session.
BC: We're done. [close]

Slides

Agenda
Multi6 Design Team report
Multihoming L3 Shim Approach
Hash Based Addresses HBA
Functional decomposition
Address Selection, Failure Detection and Recovery in MULTI6
Multihoming and Applications
Address selection in multihomed environments
Ingress filtering compatibility for IPv6 multihomed sites
Source Address Selection Policy Distribution for Multihoming