RRG Meeting Minutes (03/27/09)
==============================

RRG held an all-day meeting on March 27, 2009. This was the 7th meeting RRG has held in conjunction with IETF since its re-charter in February 2007.

1/ By the original plan, a recommendation for a scalable routing architecture was due at the end of this month. We all agreed to postpone that date by one year.

2/ The morning agenda included 3 presentations together with discussions.

(1) David Meyer: Architectural Implications of Locator/ID Separation

Dave reported the observations and lessons learned from LISP trial experimentation. As a Map-and-Encap solution, LISP requires the edge address (called EID in LISP) to be mapped to a routed address (called RLOC in LISP). The LISP trial revealed two classes of problems.

The first is the Locator Path Liveness Problem: an EID is mapped to one of potentially multiple RLOCs, yet it seems difficult to find a simple and efficient solution that lets all the mapping points keep track of the status of RLOCs in real time. This problem is not specific to LISP; it is viewed as a general issue for other solutions that require a mapping layer/function.

The second is the State Synchronization Problem: when an ITR fails, the recovery is supposed to be a purely local act. However, because state is kept at xTRs, failure recovery involves state re-synchronization, and hence involves the remote nodes.

Following Dave's presentation there was some discussion on the exact measure of the solutions, whether N^2 or N*M. Another question raised was whether the failure rate can be low enough that the complexities do not have a noticeable impact in practice. It is also unclear, at this time, how dynamic or static we should make the mapping system.

(2) Dan Jen: Scaling FIBs with Virtual Aggregation: How Much Stretch? How Much FIB Savings?

Virtual Aggregation (VA) has been proposed as one solution to reduce the FIB size, so it is important to understand its cost in other measures; one such measure is path stretch. Dan Jen used data obtained from a tier-1 AS to estimate the answer to this question. If one installs APRs (Aggregation Point Routers) at each major POP, which is a sound engineering approach, then routers can have an 80-90% reduction in FIB size, while the path stretch seems small (32% of routes get no stretch, 38% get 1-8 msec of stretch delay, and 30% get 9-16 msec of stretch delay). Thus, although VA is not a full RRG solution, it can buy us time to roll out other scalability solutions.

Discussion: One clear issue is that popular prefixes (those carrying heavy traffic flows) must not be aggregated, in order to avoid overloading the APRs. Another concern is how representative this result can be, given that ISPs on different continents may have rather different topologies. Yet another concern is the stretch delay: 8 msec can be a big problem for some users/applications. However, the biggest concern is whether VA can get us locked into a local optimum: reduced FIB pressure would allow more PI prefixes to be announced into the global routing system, but given that the RIB size is not reduced at all, some resources would eventually get exhausted.

(3) Varun Khare: Evolution Towards Global Routing Scalability

Varun articulated an evolutionary path to scalable routing: one can start from VA, then build RIB reduction on top of VA, and eventually evolve to a routing system with controllable sizes. One interesting point made in the talk is the subtle difference between this evolutionary path and the solutions that claim to be "incrementally deployable": the latter normally means being able to co-exist with the existing system, without bringing full benefit to first movers until most others have upgraded. The proposed evolution path aims at providing full benefit to the first movers.

Discussions: There was some clarification about the word "evolution": some people mistook it to mean a random walk, which it is not. It is intentional, self-guided evolution, not real-world evolution, which is driven by random permutations. Concern about a local optimum was expressed again: what is needed is a global road map to make progress. "If everybody is optimizing for his local operation, this doesn't solve the global problem."

3/ The afternoon agenda included 2 presentations together with discussions.

(1) Christian Vogt: A report on the Dagstuhl Seminar on "Naming and Addressing in a Future Internet", which was held on March 1-4, 2009.

Christian gave a personal version of the preliminary report on the workshop. This presentation led to lively discussions over a broad space of naming and identifier issues.

(2) Lixia Zhang: Terminology Definitions and Clarifications

Lixia led the discussion on defining terminology. The experience of the last two years of effort shows that it is important to nail down a clearly defined terminology in order to speed up RRG's progress. Through a lively discussion contributed to by many participants, the following consensus was reached on how many different things are needed in the context of discussing routing scalability:

1. IP prefixes that are in the DFZ routing table
   (One should keep in mind that there is more than one DFZ)
2. IP prefixes that are not in the DFZ
   Example: the thing that LISP calls "EID", which is really not an endpoint identifier
3. "Endpoint" identifier
   - Outside the routing system (local or global)
   - Debate is ongoing on whether to call it "Stack ID"
4. At least one additional identifier may need to be defined
   - It would be either identical to #3, or closer to the application than #3.

The agenda was finished by this time. Scott Brim offered to give a talk on "What Identifiers Do We Need".
He tried to take the position that stack-ID, or its equivalent, is not necessary; in today's practice we get around stack-ID by the combination of address+port number. People pointed out that the current practice has various limitations, and that stack-ID can indeed bring significant value to the future architecture.

The meeting adjourned around 4:00 PM.

ACKNOWLEDGMENT
==============

Thanks to Scott Brim and Benno Overeinder for taking notes during the meeting, and to Iljitsch van Beijnum for helping as jabber scribe!