2.7.9 Path MTU Discovery (pmtud)

NOTE: This charter is a snapshot of the 59th IETF Meeting in Seoul, Korea. It may now be out-of-date.

Last Modified: 2004-02-11

Chair(s):
Matt Mathis <mathis@psc.edu>
Transport Area Director(s):
Allison Mankin <mankin@psg.com>
Jon Peterson <jon.peterson@neustar.biz>
Transport Area Advisor:
Allison Mankin <mankin@psg.com>
Mailing Lists:
General Discussion: mtu@psc.edu
To Subscribe: majordomo@psc.edu with
Archive: http://www.psc.edu/~mathis/MTU/mbox.txt
Description of Working Group:
The goal of the PMTUD working group is to specify a robust method for determining the IP Maximum Transmission Unit supported over an end-to-end path. This new method is expected to update most uses of RFC1191 and RFC1981, the current standards track protocols for this purpose. Various weaknesses in the current methods are documented in RFC2923, and have proven to be a chronic impediment to the deployment of new technologies that alter the path MTU, such as tunnels and new types of link layers.

The proposed new method does not rely on ICMP or other messages from the network. It finds the proper MTU by starting a connection using relatively small packets (e.g. TCP segments) and searching upwards by probing with progressively larger test packets (containing application data). If a probe packet is successfully delivered, then the path MTU is raised. The isolated loss of a probe packet (with or without an ICMP can't fragment message) is treated as an indication of an MTU limit, and not a congestion indicator.

The working group will specify the method for use in TCP and SCTP, and will outline what is necessary to support the method in transports such as DCCP. It will particularly describe the precise conditions under which lost packets are not treated as congestion indications. The work will pay particular attention to details that affect robustness and security.

Path MTU discovery has the potential to interact with many other parts of the Internet, including all link, transport, encapsulation and tunnel protocols. Therefore this working group will particularly encourage input from a wide cross section of the IETF to help to maximize the robustness of path MTU discovery in the presence of pathological behaviors from other components.

Input draft: Packetization Layer Path MTU Discovery draft-mathis-plpmtud-00.txt
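
The upward search in the description above can be sketched as a small simulation. This is only an illustration under stated assumptions, not the procedure specified in the draft: the starting size, ceiling, and step schedule are invented for the example, and path_delivers is a hypothetical stand-in for sending one probe packet and learning whether it was delivered.

```python
def search_path_mtu(path_delivers, start=512, ceiling=9000, step=256):
    """Search upward from a small size; return the largest size delivered.

    path_delivers(size) -> bool is a hypothetical stand-in for sending one
    probe packet (carrying application data) and seeing whether it arrives.
    The default sizes are assumptions chosen for the example.
    """
    mtu = start
    while step >= 1 and mtu < ceiling:
        probe = min(mtu + step, ceiling)
        if path_delivers(probe):
            mtu = probe      # probe delivered: raise the path MTU
        else:
            step //= 2       # isolated probe loss: treat it as an MTU limit,
                             # not congestion, and try a smaller increase
    return mtu


def make_path(true_mtu):
    """Hypothetical path that silently drops anything larger than true_mtu."""
    return lambda size: size <= true_mtu


print(search_path_mtu(make_path(1500)))  # prints 1500
```

No ICMP message is consulted anywhere in the sketch; the only signal is whether each probe got through.
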
Goals and Milestones:
Jul 03  Reorganized Internet-Draft. Solicit implementation and field experience.
Dec 03  Update Internet-Draft incorporating implementers experience, actively solicit input from stakeholders - all communities that might be affected by changing PMTUD.
Feb 04  Submit completed Internet-draft and a PMTUD MIB draft for Proposed Standard.
Internet-Drafts:
  • draft-ietf-pmtud-method-01.txt
  • No Request For Comments

    Current Meeting Report

    IETF Path MTU Discovery WG (pmtud)
    Thursday, March 4, 2004 at 15:30 to 17:30
    
    =========================================
    
    The meeting was moderated by the working group chairs, Matt Mathis and Matt 
    Zekauskas.  Simon Leinen and Matt Zekauskas took notes, which were edited 
    into these minutes by the chairs.
    
    AGENDA
    1. Agenda Bashing
    2. PMTUD method document status
    3. (if interest, and advocate):
       draft-welzl-pmtud-options-01.txt
    4. Milestone status
    
    1. Agenda Bashing
       -- The Chairs
    
    Joe Touch: are we going to discuss Richardson IPsec draft?
    
    issue -- on IPsec ML, there is an active 2401bis discussion
        dealing with fragmentation problems
    
       reassemble before decrypt?  after decrypt?
         it's a mess
    
    MM: thinking about RFC, fragmentation even more harmful than thought
    
    MZ: what can we do?
    
    JT: add value to the discussion on the ipsec ML
       issue of fragmentation 2 network, a few rubrics: 
         1. performance
         2. different sources, else fail before resend
              [hangs that result from tunnels?--mjz]
         3. view DNF as a covert channel
               SPI negotiates SET ("ok"), COPY, or CLEAR
    
            argue, !clear, then drop.  prevent covert channel
    
           These folks want complete control over visibility in header...
    
           even if violates semantics to change something in middle...
    
    MM: one soln, IPv4 that emulates IPv6... DF on frags, too.
       
    JT: problem is IPsec in middle on tunnel, not enough room for all the 
    IPsec headers, must fragment.
       
    S. Parthi noted that in Solaris fragmentation is not on the fast path, and 
    is not often used.  But recently, we see a big rise for tunnels & encryption.
    
    MM: there are security boxes that ignore fragmentation, or fragment 
    anyway if DF set;   and most of these are major manufacturers' boxes...
    
    
    2. PMTUD method update
       -- Matt Mathis
    
    See slides.
    
    On the "running code" slide, first send a probe packet, small probe on RTT, 
    then changes.  In this case, the transfer is CPU limited, and the 
    transfer speeds up when MTU rises.
    
    JT: It looks like the MTU went up and down?
    
    You send a single larger MTU probe packet, and spend the next RTT at the old 
    MTU until the probe succeeds (so you don't lose a whole window of data if 
    the MTU is too large).
    
    Joe wondered if the case where a tunnel is introduced and the MTU goes down 
    had been tested.
    
    If there is no ICMP message, TCP will get a timeout. At that point we pull 
    the MTU way back and then restart.
    
    Joe wanted to be sure that that case had been tested... and MM replied that 
    one of the implementations did that.
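    
    The behavior discussed above (one larger probe in flight while ordinary 
    data stays at the old MTU, and a pull-back after a timeout) can be 
    sketched as a toy state object.  This is an illustration only: the class, 
    its method names, and the sizes 512 and 1500 are assumptions made for the 
    example, not taken from the draft or from either implementation.

```python
class PmtuState:
    """Toy per-connection state; names and sizes are illustrative assumptions."""

    def __init__(self, safe_mtu=512, mtu=1500):
        self.safe_mtu = safe_mtu   # assumed conservative fallback size
        self.mtu = mtu             # size used for ordinary data packets
        self.probe_size = None     # size of the single outstanding probe

    def start_probe(self, size):
        # Only one larger packet is in flight; everything else is still sent
        # at self.mtu, so a failed probe costs one packet, not a whole window.
        self.probe_size = size

    def on_probe_acked(self):
        # The probe got through: adopt its size for ordinary data.
        if self.probe_size is not None:
            self.mtu = self.probe_size
            self.probe_size = None

    def on_probe_lost(self):
        # Isolated probe loss: keep the current MTU.
        self.probe_size = None

    def on_timeout(self):
        # Ordinary data stopped getting through and no ICMP arrived (for
        # example, a tunnel appeared on the path): pull the MTU way back and
        # let the upward search restart from there.
        self.mtu = self.safe_mtu
        self.probe_size = None


state = PmtuState()
state.start_probe(4000)
state.on_probe_lost()
print(state.mtu)   # prints 1500: a lost probe leaves the MTU unchanged
state.on_timeout()
print(state.mtu)   # prints 512: a timeout pulls the MTU back
```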
    
    Matt continued with the slide presentation.
    
    Matt Mathis made a call for implementations.  We encourage people to do 
    implementations. To get the next round of bugs out of the document will 
    require people to look at code and see where the document is 
    underspecified.
    
    Matt noted there were two open robustness issues. First, what to do in case 
    of transport when there are repeated timeouts. At one point we thought we 
    would be able to have unified text on this issue. For MTU - you want to 
    pull it down to something safe.  However, you would need to be unified on 
    different timeouts.  This is a bigger problem than we thought it was.
    
    Second, what to do when the path ignores DF.
      
    However, the issue with MTU raising loss rate has been solved in the 
    current draft.
    
    In the case of not honoring DF:
    
    There is evidence it is becoming more common.  However, there is only 
    lore, not based on measurements.
    
    There's a case where this might be fatal with high-speed transfers. At high 
    rates it does not take much time to wrap the 16-bit IPid field.  Suppose 
    there is a single connection at high speed, using IP fragmentation 
    inappropriately (UDP transfer tools, say).  If a low fragment is dropped 
    and the protocol recovers in the usual way, the remaining high fragment 
    might still be waiting in the reassembly queue.
    
    Now the sequence space wraps.
    
    A new low fragment can now be associated with the old high fragment.
    
    If this causes a CRC error, then the packet is dropped (and in fact all 
    packets would be dropped, because from then on the lower and upper halves 
    would not associate correctly).  If the packet does not fail a CRC check, 
    then you could pass incorrect data to applications.
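    
    The mis-association described above can be reproduced with a small 
    simulation.  It is a deliberately simplified sketch: reassembly is keyed 
    on the IP ID alone, every datagram has exactly two fragments, and there is 
    no reassembly timeout, standing in for an ID space that wraps faster than 
    any timeout fires.  The function names and payload labels are invented 
    for the example.

```python
ID_SPACE = 1 << 16  # the IPv4 Identification field is 16 bits wide


def send_datagrams(count, drop_low_of=0):
    """Yield (ip_id, half, payload) for `count` two-fragment datagrams.

    The low fragment of datagram number `drop_low_of` is lost in the network.
    """
    for n in range(count):
        ip_id = n % ID_SPACE           # the ID wraps after 65536 datagrams
        if n != drop_low_of:
            yield ip_id, "low", f"low-{n}"
        yield ip_id, "high", f"high-{n}"


def reassemble(fragments):
    """Naive reassembly keyed only on the IP ID, with no timeout."""
    pending = {}      # ip_id -> {"low": payload, "high": payload}
    delivered = []
    for ip_id, half, payload in fragments:
        slot = pending.setdefault(ip_id, {})
        slot[half] = payload
        if len(slot) == 2:             # both halves present: deliver
            delivered.append((slot["low"], slot["high"]))
            del pending[ip_id]
    return delivered


delivered = reassemble(send_datagrams(ID_SPACE + 1))
mismatched = [(lo, hi) for lo, hi in delivered
              if lo.split("-")[1] != hi.split("-")[1]]
print(mismatched)  # prints [('low-65536', 'high-0')]
```

    The stale high fragment of datagram 0 is joined to the low fragment of 
    the first datagram that reuses ID 0.  Whether the result is dropped or 
    handed to the application then depends on a higher-layer checksum.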
    
    Joe: seems like the party doing the fragmentation has violated semantics 
    -- reusing an IPid within 2MSL is broken
    
    Matt said that yes, but then applications are limited to 100pps.
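    
    The arithmetic behind this exchange can be worked through directly.  The 
    numbers below are not from the meeting: the link speed and packet size are 
    example values, and 240 seconds for 2MSL assumes an MSL of two minutes.  
    Under those assumptions the sustainable rate comes out at a few hundred 
    packets per second, the same order of magnitude as the limit Matt quoted.

```python
ID_SPACE = 2 ** 16   # 16-bit IPv4 Identification field


def wrap_time_seconds(link_bps, packet_bytes):
    """Seconds until the ID field wraps for one flow sending back to back."""
    packets_per_second = link_bps / (packet_bytes * 8)
    return ID_SPACE / packets_per_second


def max_rate_pps(unique_for_seconds):
    """Highest packet rate that never reuses an ID within the interval."""
    return ID_SPACE / unique_for_seconds


# Example values (assumptions): 1 Gbit/s link, 1500-byte packets.
print(round(wrap_time_seconds(1e9, 1500), 3))  # prints 0.786

# Assumed 2MSL of 240 seconds (MSL of two minutes).
print(round(max_rate_pps(240)))                # prints 273
```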
    
    Joe: if we are declaring that people do this, and we have to live with it, 
    then maybe we should declare that the IPid field is not used.
    
    Matt thought it might be better to discourage using 
    fragmentation.
    
    Joe said that if tunnels are introduced you must fragment.
    
    Matt said that if PMTU discovery works, then don't fragment.  This is 
    probably a second document.
    
    On repeated timeouts, MTU is set to 512/1230.  However, imagine a 
    scenario where probing gives an acceptable answer, but when we raise the 
    MTU and send many packets at the new MTU the link fails. Is there 
    something we should do to detect or prevent this?
    
    
    Matt said we still need contributors, especially in the following areas:
    
     *  SCTP 
     *  DCCP
     *  How to address interaction with tunnels in the document.
    
    Joe Touch volunteered to help with the tunnels.
    
    Larry Dunn asked about how much implementation experience we need before 
    pushing the document out.
    
    Matt replied that we can declare doc done before there is extensive 
    implementation experience.  There is a little bit in the document that is 
    standard, the rest is heuristics that we can try changing. So, we can gain 
    experience, and then tweak the document in two years based on the 
    experience.
    
    
    At this point, we skipped to
    
    4. Milestones
       --The chairs
    
    We noted that the first milestone was complete, but the last one had an 
    item that we had not at all addressed: MIBs.
    
    We cover items that must be instrumented for visibility in the method 
    document.  For example, there are APIs for Unix that no longer support MTU 
    testing.  They support application fragmentation, and kernel 
    fragmentation (where the kernel does PMTUD for you) but no "send this big 
    packet even though you think it's too big".
    
    Furthermore, the MIBs are typically transport-dependent, so the most we 
    could specify might be a MIB fragment that is inserted into other MIBs.
    
    Joe Touch mentioned that there is work in another working group that is 
    relevant...  API/MIB info for link.  They did not think about the issue of 
    override, however.  It is going to be a BCP, we should comment on that 
    document.
    
    Joe also went back to the Richardson draft; it provides interim 
    solutions for an IPsec gateway.  It's not about IPsec, exactly, but 
    fragmentation after processing.  PMTUD is related to the IPsec working 
    group, and they need to know about this proposed technology.  We need to go 
    to that group and make them aware of it. 2401bis has been long coming -- we 
    should look at it before it goes to last call.
    
    Sunil from Sun asked if there is a testbed for this kind of thing?  
    Someplace to show the benefits of PMTUD working -- to be able to 
    quantify the gain.  Both Matts said to come talk; there are 
    networks that support > 1500 byte MTUs.
    
    Finally, we touched on the Welzl draft.  There were no advocates in the 
    audience.
    
    Joe Touch noted that if the options are only partially implemented it 
    would not be productive to proceed forward, because you cannot ratchet the 
    MTU down.  You must search up.
    
    Matt Mathis said that indeed the document does not stand alone, but must be 
    used with other methods.
    
    Joe Touch thought that we needed to heed the rule to be gentle with what 
    you do, liberal in what you accept.
    
    With no other comments the meeting adjourned early.
    
    

    Slides

    Agenda
    Path Maximum Transmission Unit Discovery