2.4.2 Benchmarking Methodology (bmwg)

NOTE: This charter is a snapshot of the 58th IETF Meeting in Minneapolis, Minnesota USA. It may now be out-of-date.

Last Modified: 2003-10-16

Chair(s):
Kevin Dubray <kdubray@juniper.net>
Al Morton <acmorton@att.com>
Operations and Management Area Director(s):
Randy Bush <randy@psg.com>
Bert Wijnen <bwijnen@lucent.com>
Operations and Management Area Advisor:
Randy Bush <randy@psg.com>
Mailing Lists:
General Discussion: bmwg@ietf.org
To Subscribe: bmwg-request@ietf.org
In Body: subscribe your_email_address
Archive: ftp://ftp.ietf.org/ietf-mail-archive/bmwg/
Description of Working Group:
The major goal of the Benchmarking Methodology Working Group is to make a series of recommendations concerning the measurement of the performance characteristics of various internetworking technologies; further, these recommendations may focus on the systems or services that are built from these technologies.

Each recommendation will describe the class of equipment, system, or service being addressed; discuss the performance characteristics that are pertinent to that class; clearly identify a set of metrics that aid in the description of those characteristics; specify the methodologies required to collect said metrics; and lastly, present the requirements for the common, unambiguous reporting of benchmarking results.

To better distinguish the BMWG from other measurement initiatives in the IETF, the scope of the BMWG is limited to technology characterization using simulated stimuli in a laboratory environment. Said differently, the BMWG does not attempt to produce benchmarks for live, operational networks. Moreover, the benchmarks produced by this WG shall strive to be vendor independent or otherwise have universal applicability to a given technology class.

Because the demands of a particular technology may vary from deployment to deployment, a specific non-goal of the Working Group is to define acceptance criteria or performance requirements.

An ongoing task is to provide a forum for discussion regarding the advancement of measurements designed to provide insight into the operation of internetworking technologies.

Goals and Milestones:
Done  Expand the current Ethernet switch benchmarking methodology draft to define the metrics and methodologies particular to the general class of connectionless, LAN switches.
Done  Edit the LAN switch draft to reflect the input from BMWG. Issue a new version of document for comment. If appropriate, ascertain consensus on whether to recommend the draft for consideration as an RFC.
Done  Take controversial components of multicast draft to mailing list for discussion. Incorporate changes to draft and reissue appropriately.
Done  Submit workplan for initiating work on Benchmarking Methodology for LAN Switching Devices.
Done  Submit workplan for continuing work on the Terminology for Cell/Call Benchmarking draft.
Done  Submit initial draft of Benchmarking Methodology for LAN Switches.
Done  Submit Terminology for IP Multicast Benchmarking draft for AD Review.
Done  Submit Benchmarking Terminology for Firewall Performance for AD review
Done  Progress ATM benchmarking terminology draft to AD review.
Done  Submit Benchmarking Methodology for LAN Switching Devices draft for AD review.
Done  Submit first draft of Firewall Benchmarking Methodology.
Done  First Draft of Terminology for FIB related Router Performance Benchmarking.
Done  First Draft of Router Benchmarking Framework
Done  Progress Frame Relay benchmarking terminology draft to AD review.
Done  Methodology for ATM Benchmarking for AD review.
Done  Terminology for ATM ABR Benchmarking for AD review.
Done  Terminology for FIB related Router Performance Benchmarking to AD review.
Done  Firewall Benchmarking Methodology to AD Review
Done  First Draft of Methodology for FIB related Router Performance Benchmarking.
Done  First draft Net Traffic Control Benchmarking Methodology.
Done  Methodology for IP Multicast Benchmarking to AD Review.
Mar 03  Resource Reservation Benchmarking Terminology to AD Review
Done  First I-D on IPsec Device Benchmarking Terminology
Apr 03  Net Traffic Control Benchmarking Terminology to AD Review
Apr 03  Methodology for FIB related Router Performance Benchmarking to AD review.
Apr 03  EGP Convergence Benchmarking Terminology to AD Review
Jul 03  Resource Reservation Benchmarking Methodology to AD Review
Jul 03  Basic BGP Convergence Benchmarking Methodology to AD Review.
Dec 03  Net Traffic Control Benchmarking Methodology to AD Review.
Dec 03  IPsec Device Benchmarking Terminology to AD Review
Internet-Drafts:
  • draft-ietf-bmwg-mcastm-13.txt
  • draft-ietf-bmwg-dsmterm-08.txt
  • draft-ietf-bmwg-benchres-term-03.txt
  • draft-ietf-bmwg-conterm-05.txt
  • draft-ietf-bmwg-ospfconv-term-06.txt
  • draft-ietf-bmwg-ospfconv-intraarea-06.txt
  • draft-ietf-bmwg-ospfconv-applicability-03.txt
  • draft-ietf-bmwg-ipsec-term-02.txt
  • draft-ietf-bmwg-igp-dataplane-conv-meth-01.txt
  • draft-ietf-bmwg-igp-dataplane-conv-term-01.txt
  • draft-ietf-bmwg-igp-dataplane-conv-app-01.txt
  • draft-ietf-bmwg-acc-bench-term-01.txt
  • draft-ietf-bmwg-acc-bench-framework-00.txt
Request For Comments:
    RFC      Status  Title
    RFC1242 I Benchmarking Terminology for Network Interconnection Devices
    RFC1944 I Benchmarking Methodology for Network Interconnect Devices
    RFC2285 I Benchmarking Terminology for LAN Switching Devices
    RFC2432 I Terminology for IP Multicast Benchmarking
    RFC2544 I Benchmarking Methodology for Network Interconnect Devices
    RFC2647 I Benchmarking Terminology for Firewall Performance
    RFC2761 I Terminology for ATM Benchmarking
    RFC2889 I Benchmarking Methodology for LAN Switching Devices
    RFC3116 I Methodology for ATM Benchmarking
    RFC3133 I Terminology for Frame Relay Benchmarking
    RFC3134 I Terminology for ATM ABR Benchmarking
    RFC3222 I Terminology for Forwarding Information Base (FIB) based Router Performance
    RFC3511 I Benchmarking Methodology for Firewall Performance

    Current Meeting Report

    Benchmarking Methodology WG (bmwg)
    
    
    
    Wednesday, November 12, 2003, 1300-1500
    =======================================
    
    
    
    CHAIRS Kevin Dubray <kdubray@juniper.net>
               Al Morton <acmorton@att.com>
    
    
    
    Reported by Al Morton and Kevin Dubray, using information generously 
    compiled by Scott Poretsky and Tony DeLaRosa as official 
    note-takers.
    
    
    
    About 25 people attended the BMWG session.
    
    
    The session's agenda was approved as presented.
    
    
    1.   Working Group Status (Morton)
    
    
    The status of various BMWG I-Ds and proposals was reported as follows:
    
    
    AD/IESG Review
         <draft-ietf-bmwg-conterm-05.txt>, revised, under review.
         <draft-ietf-bmwg-mcastm-13.txt>, OPS Directorate comments.
         <draft-ietf-bmwg-ospfconv-term-06.txt>, OPS Directorate comments.
         <draft-ietf-bmwg-ospfconv-intraarea-06.txt>, same.
         <draft-ietf-bmwg-ospfconv-applicability-03.txt>, same.
    I-D Last Call
         <draft-ietf-bmwg-fib-meth-01.txt>, call ended 3/14.
         <draft-ietf-bmwg-dsmterm-08.txt>, call ended 11/7, with comment.
    I-Ds
         <draft-ietf-bmwg-ipsec-term-02.txt>, draft of 10/2003, comments at IETF 57.
         <draft-ietf-bmwg-benchres-term-04.txt>, back in WG, comments at IETF 57.
         <draft-ietf-bmwg-acc-bench-term-01.txt>, revised on comments.
         <draft-ietf-bmwg-acc-bench-framework-00.txt>, new.
         <draft-ietf-bmwg-igp-dataplane-conv-term-01.txt>, revised.
         <draft-ietf-bmwg-igp-dataplane-conv-meth-01.txt>, revised.
         <draft-ietf-bmwg-igp-dataplane-conv-app-01.txt>, revised.
    Expired BMWG I-Ds
         <draft-ietf-bmwg-bgpbas-01.txt>, pending terminology progress.
         <draft-ietf-bmwg-benchres-method-00.txt>, pending terminology progress.
    New work proposals
         <draft-kimura-protection-term-02.txt>
         <draft-poretsky-mpls-protection-meth-01.txt>
    
    
    The old BMWG I-D on FIB benchmarking methodology (expired) seems to be 
    dying a quiet death.  It was one of the original I-Ds to address 
    convergence.  The WG needs to decide whether this work is critical to the 
    body of convergence work or not.
    
    
    
    ----------------------------------------
    2.   Benchmarking Network-layer Traffic Control Mechanisms:
           Terminology -- Communicate latest changes & results of WG Last Call 
    ending Nov 7th. (J. Perser et al.)
    
    
           
    http://www.ietf.org/internet-drafts/draft-ietf-bmwg-dsmterm-08.txt
    http://www.ietf.org/mail-archive/working-groups/bmwg/current/
    
    
    Jerry Perser presented slides (see proceedings) outlining the changes 
    reflected in the latest version of the draft as well as issues 
    surrounding the term "Channel Capacity".  Jerry briefly articulated the 
    notion of "forwarding capacity," as a possible successor to the channel 
    capacity term.
    
    
    Jerry hopes to get the wording incorporated into the I-D quickly.  A last 
    call will be issued subsequent to the I-D's announcement.
    
    
    
    ----------------------------------------
    3.  IPsec Device Benchmarking Terminology I-D. New developments in 02.
          Finalize recent list discussions on the packet mix 
    possibilities; this topic has relevance to many other efforts. (A. 
    Morton)
    
    
          
    http://www.ietf.org/internet-drafts/draft-ietf-bmwg-ipsec-term-02.txt
    http://www.ietf.org/mail-archive/working-groups/bmwg/current/
    
    
    Al thanked Michele Bustos, Tim Van Herck and Merike Kaeo for preparing 
    slide material.  The question at the top of the docket for this draft was 
    "IMIX", or the specifying of an offered load composed of something other 
    than a series of packets having the same size. Three possible 
    alternatives for "mixes" were presented to the working group.
    
    
    Commentary was offered that while this discussion was beneficial, it 
    doesn't appear exclusively bound to IPsec benchmarking.  It was 
    suggested that the issue be handled outside the scope of the IPsec 
    benchmarking, in a more general BMWG context.  Heads nodded.
    
    
    There was concern expressed that any single mix of packet sizes would not be 
    typical for all classes of users. For example, the NLANR notion of IMIX 
    might not be typical for users of, say, VPNs.  It was countered that the 
    specification could allow a generalized packet mix (hereafter referred to as a 
    "mix") to be specified locally by the testing body, with strict 
    requirements on mix reporting.
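    The locally specified option above can be made concrete with a small
    sketch. This is purely illustrative and not from any BMWG draft: the
    MixComponent fields, the validation rule, the reporting layout, and the
    example sizes and weights are all assumptions.

```python
# Hypothetical sketch of a locally defined packet-size "mix" with
# strict reporting; every name and number here is an assumption.
from dataclasses import dataclass

@dataclass(frozen=True)
class MixComponent:
    size_bytes: int   # frame size for this component of the mix
    weight: float     # fraction of offered packets at this size

def validate_mix(mix):
    """A mix is reportable only if its weights sum to 1.0."""
    total = sum(c.weight for c in mix)
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"mix weights sum to {total}, expected 1.0")
    return mix

def report_mix(mix):
    """Render the mix in a fixed tabular form so results can be compared."""
    lines = ["size(B)  weight"]
    for c in sorted(mix, key=lambda c: c.size_bytes):
        lines.append(f"{c.size_bytes:7d}  {c.weight:.3f}")
    return "\n".join(lines)

# An illustrative three-size distribution (not a standardized mix):
example = validate_mix([
    MixComponent(64, 0.58),
    MixComponent(570, 0.33),
    MixComponent(1518, 0.09),
])
```

    The point of the strict reporting requirement is that two testing
    bodies using different local mixes can still publish comparable,
    fully documented results.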
    
    
    It was pointed out that local specification of a mix may 
    short-circuit the notion of "neutrality" that is very desirable in 
    benchmarks.  That is to say, a "mix", while adequately documented, could 
    still be biased by its composition.
    
    
    So, it was argued, viable solutions might be: a) give everyone a single 
    mix, or b) give no one a mix (as BMWG has done to date).
    
    
    If, on the other hand, locally defined mixes were deemed acceptable, the 
    group believed that strict reporting constraints must accompany the 
    mix option. It was pointed out that a mix composed of an 
    increasing sweep of packet sizes is a subset of the locally defined "mix" 
    option.
    
    
    It was again articulated this discussion should happen independent of the 
    IPsec terminology work.
    
    
    
    
    ----------------------------------------
    4.  IGP Data plane convergence benchmark I-Ds.
         Changes from 00 to 01 to address issues from mailing list.
         Ready for Last Call?  Authors believe these are ready.
         (S.Poretsky)
    
    
    
    http://www.ietf.org/internet-drafts/draft-ietf-bmwg-igp-dataplane-conv-term-01.txt
    http://www.ietf.org/internet-drafts/draft-ietf-bmwg-igp-dataplane-conv-meth-01.txt
    http://www.ietf.org/internet-drafts/draft-ietf-bmwg-igp-dataplane-conv-app-01.txt
    
    
    Scott presented his slides which outlined changes to the I-Ds and new 
    terms added.
    
    
    One of the new terms, "Packet Sampling Interval", essentially sets the time 
    resolution for re-convergence measurements. There was a comment that the 
    draft should recommend appropriate intervals relative to the measured 
    convergence time. Scott described the existing recommendation of a 
    0.1 second Packet Sampling Interval. The 01 terminology and methodology 
    drafts provide some detail, but there was interest expressed in more 
    explicit recommendations in this area.
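    A rough illustration of why the interval matters (an assumption-laden
    sketch, not the drafts' procedure): a loss-derived convergence time
    computed from per-interval sample counts is necessarily quantized to
    the Packet Sampling Interval.

```python
# Illustrative only: the function name and the sampling model are
# assumptions, not definitions from the I-Ds.

def convergence_time(samples, interval, offered_rate_pps):
    """samples: packets received in each successive sampling interval.
    Returns the seconds during which the DUT forwarded below full rate;
    the result can never be finer-grained than `interval` itself."""
    expected_per_interval = offered_rate_pps * interval
    degraded = sum(1 for s in samples if s < expected_per_interval)
    return degraded * interval

# With a 0.1 s interval and a 1000 pps offered load, three degraded
# intervals (20, 0, 60 packets) imply roughly 0.3 s of convergence time:
t = convergence_time([100, 100, 20, 0, 60, 100, 100], 0.1, 1000)
```

    This is why a recommendation tying the interval to the expected
    convergence time is useful: a 0.1 s interval is meaningless for an
    event that completes in 50 ms.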
    
    
    Much time was spent on the slide illustrating the various terms that 
    describe the dataplane's response to failure/re-convergence (e.g., 
    Convergence Event Instant, Convergence Event Transition, Convergence 
    Recovery Transition, Convergence Recovery Instant, etc.) and their 
    relation to each other. (Slide 4 in this talk; see proceedings.)
    
    
    A question was posed on how to capture the graph of Slide 4 in the 
    methodology.  One brand of test equipment produced this plot, but not all 
    test equipment will produce a graph like this, and graphing the results is an 
    exercise left to the reader. Scott replied that all the required 
    components are in the reporting section of the Methodology I-D and 
    defined in the Terminology I-D (where a tabular format is defined).
    
    
    There was some discussion regarding how to capture the specific 
    times of rate stability (or instability).  Scott said the 
    methodology maintains a continuous survey, subject to two major 
    questions: 1) has full convergence been achieved, and 2) how long is 
    that convergence sustained?
    
    
    Another question asked how a reduction in frame rate following 
    convergence would affect the convergence time. Scott replied that the 
    convergence time approaches infinity in this case (perhaps the DUT never 
    achieves Full Convergence).
    
    
    Regarding the convergence recovery instant, how does one determine the 
    precise time at which there are no longer packet losses?  It was thought 
    that the return to "full frame rate" would reflect that instant (the 
    definition of Rate-Derived Convergence Time will be revised). It was 
    offered that the indicators for convergence appear to need better 
    definition, since one needs multiple measurements over time to 
    declare stability.
    
    
    In summary, use of Forwarding Rate in the definitions and discussion of 
    time "Instants" may be preferable to Packet Loss, since these Instants are 
    indispensable to Rate-Derived Convergence Time. Also, there was 
    agreement to define a time period for evaluation of convergence 
    stability following the "Convergence Recovery Instant".
    
    
    The meeting was reminded by an attendee that the BMWG should strive to 
    produce definitions that are understandable without pictures (this is 
    generally good advice).
    
    
    
    
    ----------------------------------------
    5.  Terminology for Benchmarking Core Router Accelerated Life Testing.
         Changes to Term from 00 to 01. New Framework Draft 00.
         Application of Term and Framework to build the forthcoming 
         Methodology. Issues? Comments? Concerns? (S. Poretsky et al.)
    
    
    
    http://www.ietf.cnri.reston.va.us/internet-drafts/draft-ietf-bmwg-acc-bench-term-01.txt
    http://www.ietf.cnri.reston.va.us/internet-drafts/draft-ietf-bmwg-acc-bench-framework-00.txt
    
    
    The main comments on this topic sought better definitions of the terms and 
    benchmarks.  Unexpected Loss, a form of error that would trigger the end of a 
    test interval, needs to be clearly differentiated from other losses that are 
    expected (on the same interfaces).  There was a suggestion to go beyond 
    loss, and define errors in terms of packet delay and reordering. From a 
    management interface perspective, very long delay on an SSH 
    connection is as bad as loss (can't manage the box).
    
    
    The current benchmarks assess relative performance as a time 
    interval of correct operation, but have a Pass/Fail criterion rather than a 
    quantitative characterization. The benchmarking methodology needs to 
    characterize DUT performance along a dimension that allows easy 
    comparison with other vendors' products, so the approach should be to use 
    stressful configurations that produce errors in a short amount of time.  
    Discussion revealed that some participants use this sort of testing in 
    RFPs, and also to compare vendors' products under user-specific 
    conditions; for this, the terminology and framework may be enough. 
    But a very explicit connection to the traditional black-box benchmarks 
    should be added to these drafts.
    
    
    
    ----------------------------------------
    6.  Revised Milestones. (Chairs)
    The proposed milestones were presented without remarkable 
    commentary.
    Revise:
    Dec 03  Resource Reservation Benchmarking Terminology to AD Review
    Dec 03  Net Traffic Control Benchmarking Terminology to AD Review
    
    
    Dec 03  Support EGP Convergence Benchmarking Terminology through AD 
    Review
    Dec 03  Support Multicast Benchmarking Methodology through AD Review
    Mar 04  Support OSPF Convergence Benchmarking Drafts through AD Review
    
    
    New:
    Dec 03  IPsec Device Benchmarking Terminology to AD Review
    Dec 04  Net Traffic Control Benchmarking Methodology to AD Review.
    Dec 04  IPsec Device Benchmarking Methodology to AD Review
    Dec 03  IGP/Data-Plane Terminology I-D to AD Review
    Mar 04  IGP/Data-Plane Methodology and Applicability I-Ds to AD Review
    Dec 03  Router Accelerated Test Terminology I-D to AD Review
    Jul 04  Router Accelerated Test Method. and Applicability I-Ds to AD 
    Review
    
    
    Remove: (Pending FIB discussion/resolution)
    Apr 03  Methodology for FIB related Router Performance Benchmarking to AD 
    review.
    Jul 03  Resource Reservation Benchmarking Methodology to AD Review
    Jul 03  Basic BGP Convergence Benchmarking Methodology to AD Review.
    
    
    
    ----------------------------------------
    7.  New Work Proposals on Protection Switching Methodology (now there are 2)
         Should BMWG take a broad, or technology-specific approach to this 
    work?
          - This is a placeholder for advanced discussion of this topic
          - There *will* be discussion on the list before the meeting to 
    better articulate the questions posed to attendees and the list.
    
    
     Earlier proposal: Automatic Protection Switching Benchmark 
     Terminology
     The WG has discussed this item at several previous meetings.
     The goal has been articulated as follows:
     
     
     The objective of this effort is to produce a terminology draft and a 
     set of methodology drafts that specify performance benchmarks for 
     sub-IP layer resiliency and protection technologies. There is a 
     common terminology draft and multiple methodology drafts for the 
     technologies.  The methodology drafts will include (but are not limited 
     to) Automatic Protection Switching (APS) for SONET/SDH, Fast Reroute for 
     Multi-Protocol Label Switching (MPLS), and Resilient Packet Ring (RPR) 
     as standardized in IEEE. (T.Kimura, J.Perser)
    
    
    
    http://www.ietf.org/internet-drafts/draft-kimura-protection-term-02.txt
    
    
    
     New proposal: Draft on MPLS Protection Benchmarking Methodology
     The goal has been articulated as follows:
    
    To develop a benchmarking methodology for MPLS protection mechanisms 
    including Headend Reroute, Standby LSP, Fast Reroute Detour Mode, and Fast 
    Reroute Bypass Mode.  Test cases will benchmark the DUT in all LSR roles 
    including Ingress, Midpoint, Egress, PLR, and Merge Node.  Test cases are 
    provided for Link and Node protection.  The most common causes of 
    failover, such as local administrative shutdown, local link failure, and 
    remote link failure are considered.    The Benchmark for each test case is 
    calculated from the measured packet loss during a failover event.  
    Benchmarks can be used to compare failover performance of different 
    Label-Switched Routers and evaluate the different MPLS protection 
    mechanisms.  The methodology uses existing MPLS and MPLS protection 
    mechanism terminology defined in current IETF RFCs.  (S.Poretsky, et al.)
    
    
    
    http://www.ietf.org/internet-drafts/draft-poretsky-mpls-protection-meth-01.txt
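    The proposal's loss-derived benchmark calculation can be sketched in a
    few lines. The formula shown, packets lost divided by a constant
    offered rate, is a common convention and an assumption here; the
    draft itself may specify the calculation differently.

```python
# Assumed loss-derived failover calculation: at a known constant
# offered load, each lost packet represents a known slice of time
# during which traffic was not forwarded.

def failover_time(packets_lost, offered_rate_pps):
    """Seconds of traffic interruption implied by the measured loss,
    assuming a constant offered load in packets per second."""
    if offered_rate_pps <= 0:
        raise ValueError("offered rate must be positive")
    return packets_lost / offered_rate_pps

# 5000 packets lost at an offered load of 100,000 pps implies a
# 0.05 s (50 ms) failover event.
t = failover_time(5000, 100_000)
```

    A benchmark expressed this way lets failover performance be compared
    across Label-Switched Routers and across the different MPLS
    protection mechanisms, which is exactly the stated purpose above.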
    
    
    Discussion of these two new BMWG proposals vacillated between the need for 
    MPLS-specific failover benchmarks vs. generic IP service protection 
    provided by underlying transport mechanisms. The two approaches appear to be 
    solidifying as separate entities. One participant identified his need to 
    understand the performance of MPLS network elements in their native mode of 
    operation, at the Labeled Packet layer and in terms of LSPs or VPNs.  
    Clearly, the original proposal/direction to characterize protection 
    effects at the IP layer retains merit, and the terminology draft now 
    presents a fairly advanced framework.
    
    
    There is a clear need to recruit more participation in both these 
    protection-related work items (thus far discussion has been among the 
    authors and the WG chairs).  Thus the call(s) for support will include 
    requests for participants who will actively review the work, having 
    identified themselves as providing some valuable perspective or 
    expertise.
    
    
    
    ----------------------------------------
    8. Trends in BMWG
    
    
    The chairs managed to insert many comments on WG trends during the 
    meeting, including apparent lack of readership in certain areas beyond the 
    author list(s). There may be a need for a change in the instructions for 
    Last Calls, where even agreeable readers must respond with some 
    commentary, and drafts do not progress until sufficient review has been 
    completed.
    
    

    Slides

    Agenda
    Benchmarking Network-layer Traffic Control Mechanisms
    Accelerated Stress Benchmarking
    IGP Data Plane Convergence Benchmarking
    IMIX - The Controversy
    Benchmarking Terminology for Protection Performance
    Benchmarking Methodology for MPLS Protection Mechanisms