Last Modified: 2005-03-01
Date | Milestone |
---|---|
Done | Submit draft on calculation of IGP routes over TE tunnels to IESG for publication as Informational RFC |
Done | Submit initial Internet Draft on IP Fast Reroute Framework |
Jun 04 | Submit initial Internet Draft on Basic IP Fast Reroute mechanism |
Aug 04 | Review various mechanisms for Advanced IP Fast Reroute |
Oct 04 | Submit IP Fast Reroute Framework to IESG for publication as Informational RFC |
Oct 04 | Submit specification on Basic IP Fast Reroute mechanism to IESG for publication as Proposed Standard |
Nov 04 | Select the Advanced IP Fast Reroute mechanism |
May 05 | Submit specification on Advanced IP Fast Reroute mechanism to IESG for publication as Proposed Standard |

RFC | Status | Title |
---|---|---|
RFC3682 | E | The Generalized TTL Security Mechanism (GTSM) |
RFC3906 | I | Calculating IGP Routes Over Traffic Engineering Tunnels |
RTGWG
IETF-62 RTGWG Meeting Minutes

THURSDAY, March 10, 2005, 1300-1500, Afternoon Sessions I

Chairs: Alex Zinin and Bill Fenner

Alex Zinin started the discussion with a WG status update (see slides).

- GTSM document will be LC'ed before Paris IETF; Alex to send comments.
- Basic IP FRR updated.
- Draft MIB move to WG doc.

Alia Atlas gave a presentation on the status of the Loop Free Alternates draft (draft-ietf-rtgwg-ipfrr-spec-base) (see slides for details). Alia gave a rundown of the changes to the draft.

There are some issues with various topologies in OSPF:

- Multi areas: Can get paths back and forth across area boundaries. Can cause the LFA to loop.
- Virtual links: Different router than the backbone topology.
- Alternate ABRs:
- Multiarea ABRs:
- AS External Routes:

Recommend we do not support virtual links or alternate ABRs. If ASBR is in multiple non-backbone areas, then no other ABR is also in more than one of those non-backbone areas. An AS External Route of a specific type (1 or 2) must not be announced with forwarding addresses from multiple non-backbone areas if those non-backbone areas share at least one ABR.

Future changes: Added Other applicability and Strict Downstream Alternates. Strict Downstream Alternates guarantee loop freedom but cannot guarantee SRLG protection.

Alia: Would like opinions to the list please.

Alia: Applicability was for OSPF.

Alex: Comments?

Ted Seely: Was it inter-AS you were talking about?

Alia: Yes, it was AS-external.

Ted: Would any operators use this for inter-AS? Isn't it easier to solve the intra-AS problem first?

Alia: These are external routes, but the discussion is about behavior within a single AS.

Dave Ward: Regarding your applicability statement note, are you planning on having the protocol automatically figure out what's supported and what's not, or have it as a user burden?

Alia: It is a user burden today.

Dave: Many problematic situations can be discovered automatically; we may want to converge on some protocol mechanisms for that, rather than rely solely on the user.

Alia: OK, we can discuss this.

Next, Alex discussed the micro-loop prevention DT update.

Reference: http://psg.com/~zinin/ietf/rtgwg/neverloop/

- draft-bryant-shand-lf-conv-frmwk-00.txt
- draft-zinin-microloop-analysis-00.txt

Alex summarized the micro-loop design team summary that was posted to the RTGWG list. So far the team has eliminated:

- Incremental cost change, due to multiple convergence cycles and long convergence.
- Synched FIB installation, due to requirements for tight sync'ing, strong implementation constraints, service dependency on NTP, and operational constraints.

Brian Haberman: Is it only a service dependency or an architectural problem?

Alex: No architectural problem, since general routing still doesn't depend on NTP; only the FRR piece would depend on it.

Alex went through the various options currently under consideration by the design team. (See slides for details.)

DT: PLSN

Pros:

- Easy to understand
- Constant delay
- Covers all topo changes
- Allows SRLGs
- About 90% of the loops
- Changing topology is likely to improve coverage
- Can be used as a basic method with further extensions

Cons:

- Less than 100%
- Loop traffic may congest loop free.
- Asymmetric link costs require stricter safety condition (DS BF instead of LFA)

Ordered FIB Install

Pros:

- 100% coverage [for micro-loops]
- Asym costs covered
- SRLGs covered at a cost

Cons:

- Distributed
- Longer convergence (can be improved using explicit signaling)

Tunnels

Pros:

- Cover 100%
- Constant convergence time (shorter than the ordered FIB)
- Can handle SRLGs

Cons:

- Requires "Covert announcements" in PRs
- With Basic FRR, requires additional data-plane mechanisms
- Tunnels have different operational and security operations
- Distributed version involves increased complexity

The next step for the Design Team is to come up with the recommendations for the WG.

So far Path Locking Safe Neighbors for basic FRR is a high runner. Hoping the design team will make a recommendation in, one month? Questions?

Alex: Will summarize considerations and send this to the ML and move the discussion there.

Dave Ward: How did the DT come to a conclusion that less than 100% coverage is good enough?

Alex: Generally, there's no clear cut here; reaching 100% coverage comes at a high cost. Basic IP FRR doesn't give us 100% coverage, and PLSN's coverage is very similar to that.

Dave: OK, do you envision using PLSN as the base method and using something else for advanced?

Alex: Both basic FRR and PLSN address local failures and micro-loops using redundancy in physical network topology. As we move towards better coverage, the physical-topology method may be augmented with advanced methods like tunnels. Use PLSN where possible, use advanced on top of that for the rest.

Russ White: What parts of the topology are not covered?

Alex: Mike did simulations that show that places where PLSN doesn't cover microloops are almost the same as the places where IP FRR wouldn't cover local failures.

Mike: Most of the places where you can't repair you also can't prevent loops. There are some cases where loops and failures do not collocate, of course.

[? Not sure if this was still Russ] I think IPFRR and PLSN may be valid.

Alex: The physical meaning is that both basic IPFRR and PLSN use similar constructs for defining neighbors that can be used for local failure repair and micro-loop prevention, so if you have a problem protecting against a local failure, it's likely you'll have loops there too.

Mike Shand: There are certain typical topologies that have problems: rigorous hub and spoke; some rings. If you have a well constructed mesh, it will play much better.

Thomas Eriksson [sp?]: If you play with metrics it does not add much.

Stewart: You might come up with poor topologies. You can reduce coverage with metrics or asymmetric link metrics.

This line of questioning went too fast for the scribe:

[?] Do asymmetric link metrics occur as intentional or accidental?

[Dave Ward?] How do I set up my metrics for traffic and repair?

[?] Useful if more people with operations experience could comment. IGP metric-based TE and IP FRR coverage will probably work against each other. Not going to solve it here and now. Tools should take this into account.

Stewart asked for feedback on whether we're getting sufficient coverage given the restrictions.

Alex: There was a panel at NANOG. The message that Alex took from there, as well as from conversations with other SPs, is that operators are concerned about complexity and would rather see 80-90% coverage with simple methods they can understand and deploy.

Stewart: Did they say if they want to pick the traffic that is covered [by the 80-90%]?

Alex: Telling you what I heard.

Alex: My position: basic IP FRR and advanced FRR will have different applicability compared to each other and MPLS FRR. There will be networks where they won't make any sense because of the topology, and networks where they fit nicely. All about trade-offs.

Stewart Bryant presented a new method called NotVia. This is an improvement on the tunnels method. (See slides for details.)

IP Fast Reroute via NotVia

Advantages of NotVia:

- Repairs all non-partitioning failures. When repair delivers traffic NotVia paths.
- Two guardian routers deliver the traffic NotVia paths.
- NotVia addresses are created per interface.
- Stewart used some diagrams illustrating NotVia.
- Some sample results using NotVia:
- Incremental SPF with early terminations in networks with 40-400 nodes: equivalent of 5-13 full SPFs per node.
- Really a very cheap [computationally] algorithm.

More advantages:

- Works with MPLS LDP. Just push the label for an intermediate to get to P. Labels needed by S source.
- Encapsulation (any IP encapsulation will work).
- One tunnel does the job.
- Not just pt-pt unicast.
- NotVia covers LANs.
- When you have a LAN, you don't know whether the LAN or a component failed. Can diagnose what is going on. Simple case must assume LAN failed.
- Can discover via the not via repair to test the P failure. Powerful technique.
- NotVia works for multicast, although this is a hard problem.
- Loop Free Alternates can be used with NotVia. Can use NotVia as an acceleration for LFA.
- Multicast needs an encapsulation but works [claimed by Stewart and Mike].
- ECMP can be used with NotVia.
- NotVia supports incremental deployment [via capabilities].
- You can exclude routers that are not NotVia capable.

Stewart explained the routing extensions for NotVia:

- Need to advertise NotVia address.
- Must advertise protected component SRG.

Stewart explained link failure for NotVia. S (a source router in the diagram) can give any packet to any router. S can optimize which router is best. NotVia provides for diagnosis of faults in LANs by correlating NotVia paths. Loops don't form [?]. Stewart explained that it supports multi-homed prefixes with two strategies.

Joel Halpern: [referring to diagram] Node B [source node] must have detected the failure already to use the [NotVia path].

Stewart: Yes.

Joel: B sends to P; if P cannot forward, P drops [and does not loop]? [Yes]

Thomas Eriksson [sp?]: How would this work with MPLS?

Some fast discussion and debate among several people on whether all the cases are covered.

Stewart: Does not change the problem; B needs to know about the failure quickly enough.

Summary of NotVia by Stewart:

- Solves the problem at an intuitive level.
- Works with asymmetric links.
- Uses MPLS FRR hardware.
- Single-level encapsulation.
- Repair time is bounded.

Alex: How many NotVia addresses?

Stewart: One per link [in network].

Joel: Isn't it one per neighbor?

Stewart: Yes, one per neighbor.

Joel: Except for SRLG?

Alex: Assumption: B [referring to diagram] will detect the failure soon enough that it is not dropped on the floor.

Stewart: Standard failure detection applies to every problem and every solution here. Same response characteristics here as with basic IP FRR.

(Some more discussion on this.)

[Dave?] BFD time is comparable with all neighbors. Part of all solution in all problem scenarios.

Alia: This is different than Loop Free Alternates. Might not be able to do the repair.

Stewart: Exactly the same [problem as LFA].

Mike: [Agrees] Does not make a difference. If BFD time is different, that is the least of the problems.

George Swallow: Works for any packet? {Scribe missed the response; affirmative, I believe.}

Joel: What about transients? Do you assume the path is available?

Alex: The network is undergoing changes.

Stewart: NotVia is never worse than one detection time.

Alex and Stewart: Read the draft. It is available publicly. Thanks. Read and send comments.

More discussion on NotVia:

Alia: If the link between S and P is a broadcast link, I don't think it works.

Stewart: If it is a broadcast link, it is like the pseudo node.

Joel: Made a comment, and subsequently withdrew it.

Next, Mike Shand presented a summary of agreements from the DT and non-agreements between Alia, Mike and Stewart. (See slides for details.)

All agree: There are four advanced methods.

- U-turn
- IP-TE
- PQ-Tunnels
- NotVia

Less agreement: What value do you assign [to complexity etc.]? It is a value judgment.

Issue: Failure scenarios

All agree methods must do:

- Link failure
- Node failure
- Broadcast link failure
- Local SRLG failure

Some agreement, maybe:

- SRLG failure
- IP unicast failure
- MPLS LDP failure
- IP multicast failure?

Questions on complexity of the encapsulation. Tunneling: Use IP or LDP for encapsulation? Some traffic types probably need tunneling. All tunneling requires label acquisition.

On computation complexity (least agreement in DT): More SPFs; how many is too many? Final computation time? How long does it take to prepare for the next failure? Not all SPFs are equal? Network is vulnerable during the [preparation] time.

Delay convergence is provided. So you might get the computation done in time.

Mike presented a table of the number of SPF calculations (see slides). Notable:

- U-turn: 20 node ~ 10 SPFs
- NotVia: 15 equivalent SPFs

Mike also presented comments on routing extensions, forwarding extensions, and coverage (see slides). What cost completeness, what cost lack of it?

Alex: Are we done?

Andy Smith: Are the methods nodal based or link based? What if the node is congested?

Joel: This is about failures, not congestion management.

Andy: Do you have a delay mechanism for link flapping, e.g. link flap dampening?

Stewart: Absolutely.

Andy: Is it nodal based or link based? Does the algorithm delay announcements for each link or for the node as a whole?

Alex: We're building on the existing IGP behavior. What you're asking is how the IGP will react to a flapping link. Different vendors have different optimizations. There was a draft on that, which didn't survive. Might be useful to reconsider that. Assume the IGP can handle this.

Andy: These are all good ideas, but DoS attacks etc.

Alex: If a router is overloaded and starts dropping hellos, it will be detected as a failure.

Joel: Congestion is better dealt with by other aspects.

Alex: Agree with Joel; this is a general problem for IGPs, IP FRR, MPLS FRR, and should not be solved here. Valid problem, but IPFRR is not the right place.

Don Fedyk: Do we need to allow an option to bail out?

Some discussion for Don to clarify his comment.

Alex: Not required.

? [Does this mean there are] Not a lot of nodal failures.

Ted: How do you tell? Common failure situation is many links.

Stewart: Do you mean SRLGs?

Ted: Yes.

Andy: SRLGs on a nodal basis or across a network?

Alia: Yes. Both local SRLGs and general SRLGs.

[?] If we have a meshy network. Would not change anything if it is. Determinism.

Ted: When something happens in the night, how do you adjust?

[Dave?] Pagers are for that.

Dave Ward: IP FRR is not for that.

[?] What is the path to coming to a conclusion on those things?

Alex: Need to finish the base spec first. Then do the MIBs and the applicability statement. The framework could be the applicability statement. Will need implementation reports.

Chris Hopps [sp?]: Do you want [basic] implementations?

Alex: Basic implementations [first, yes].

Stewart: Concerned about Basic IPFRR; believes it is an accidental approach. It may turn out not to be what we want, and if so the RFC will legitimize it. We may not be able to deploy other, better solutions. It's not entirely clear if Basic is the right intermediate step. Look at it as more of a facilitator. May mean we miss the real solution for the real [situations].

Alex: Talked to a few service providers. Believes the Basic FRR is still valid and useful, and more operationally feasible. The decision made while chartering the WG, to produce basic first, is still valid and shouldn't be revisited now.

Thomas Erricson: Agree, keep down complexity.

Alex: Exactly why the basic first.

Alex: Thanks, read drafts, send comments.

WG meeting concluded.
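
For readers less familiar with the terms used in the Loop Free Alternates discussion above, the following is a minimal Python sketch of how a neighbor might be classified as a loop-free alternate or a strict downstream alternate. The minutes do not state the conditions; the two inequalities used here are the ones commonly given for loop-free alternates (as later published in RFC 5286), and the topology, link costs, node names, and the functions `build_graph`, `shortest_distances`, and `classify_alternate` are all hypothetical, chosen only for illustration.

```python
import heapq

INF = float("inf")


def build_graph(links):
    """Build an adjacency map from (a, b, cost) tuples, assuming symmetric costs."""
    graph = {}
    for a, b, cost in links:
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph


def shortest_distances(graph, source):
    """Plain Dijkstra SPF: shortest-path cost from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, INF):
            continue  # stale heap entry
        for v, cost in graph.get(u, {}).items():
            nd = d + cost
            if nd < dist.get(v, INF):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def classify_alternate(graph, s, n, d):
    """Return (loop_free, downstream) for neighbor n of router s toward destination d.

    Assumed conditions (not taken from the minutes):
      loop-free:  dist(n, d) < dist(n, s) + dist(s, d)
      downstream: dist(n, d) < dist(s, d)
    """
    dist_s = shortest_distances(graph, s)
    dist_n = shortest_distances(graph, n)
    n_to_d = dist_n.get(d, INF)
    loop_free = n_to_d < dist_n.get(s, INF) + dist_s.get(d, INF)
    downstream = n_to_d < dist_s.get(d, INF)
    return loop_free, downstream


# Hypothetical topology: S reaches D through primary next hop E (cost 2).
GRAPH = build_graph([
    ("S", "E", 1), ("E", "D", 1),   # primary path S-E-D
    ("S", "N", 2), ("N", "D", 3),   # N: loop-free but not downstream
    ("S", "M", 5), ("M", "D", 1),   # M: loop-free and downstream
    ("S", "X", 1),                  # X: stub neighbor, would send traffic back to S
])

for neighbor in ("N", "M", "X"):
    print(neighbor, classify_alternate(GRAPH, "S", neighbor, "D"))
```

In this sketch the downstream test is strictly stronger than the loop-free test: every downstream neighbor also passes the loop-free test, but not the reverse (neighbor N). This is only a per-destination distance check; it does not model SRLGs, OSPF areas, or any of the topology restrictions raised in the discussion.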