2.4.1 Benchmarking Methodology (bmwg)

NOTE: This charter is a snapshot from the 61st IETF Meeting in Washington, DC, USA. It may now be out of date.

Last Modified: 2004-09-22

Chair(s):

Kevin Dubray <kdubray@juniper.net>
Al Morton <acmorton@att.com>

Operations and Management Area Director(s):

Bert Wijnen <bwijnen@lucent.com>
David Kessens <david.kessens@nokia.com>

Operations and Management Area Advisor:

David Kessens <david.kessens@nokia.com>

Mailing Lists:

General Discussion: bmwg@ietf.org
To Subscribe: bmwg-request@ietf.org
In Body: subscribe your_email_address
Archive: http://www.ietf.org/mail-archive/web/bmwg/index.html

Description of Working Group:

The major goal of the Benchmarking Methodology Working Group is to make
a series of recommendations concerning the measurement of the
performance characteristics of various internetworking technologies;
further, these recommendations may focus on the systems or services
that are built from these technologies.

Each recommendation will describe the class of equipment, system, or
service being addressed; discuss the performance characteristics that
are pertinent to that class; clearly identify a set of metrics that aid
in the description of those characteristics; specify the methodologies
required to collect said metrics; and lastly, present the requirements
for the common, unambiguous reporting of benchmarking results.

To better distinguish the BMWG from other measurement initiatives in
the IETF, the scope of the BMWG is limited to technology
characterization using simulated stimuli in a laboratory environment.
Said differently, the BMWG does not attempt to produce benchmarks for
live, operational networks. Moreover, the benchmarks produced by this
WG shall strive to be vendor independent or otherwise have universal
applicability to a given technology class.

Because the demands of a particular technology may vary from
deployment to deployment, a specific non-goal of the Working Group is
to define acceptance criteria or performance requirements.

An ongoing task is to provide a forum for discussion regarding the
advancement of measurements designed to provide insight on the operation
of internetworking technologies.

Goals and Milestones:

Done  Expand the current Ethernet switch benchmarking methodology draft to define the metrics and methodologies particular to the general class of connectionless, LAN switches.
Done  Edit the LAN switch draft to reflect the input from BMWG. Issue a new version of document for comment. If appropriate, ascertain consensus on whether to recommend the draft for consideration as an RFC.
Done  Take controversial components of multicast draft to mailing list for discussion. Incorporate changes to draft and reissue appropriately.
Done  Submit workplan for initiating work on Benchmarking Methodology for LAN Switching Devices.
Done  Submit workplan for continuing work on the Terminology for Cell/Call Benchmarking draft.
Done  Submit initial draft of Benchmarking Methodology for LAN Switches.
Done  Submit Terminology for IP Multicast Benchmarking draft for AD Review.
Done  Submit Benchmarking Terminology for Firewall Performance for AD review
Done  Progress ATM benchmarking terminology draft to AD review.
Done  Submit Benchmarking Methodology for LAN Switching Devices draft for AD review.
Done  Submit first draft of Firewall Benchmarking Methodology.
Done  First Draft of Terminology for FIB related Router Performance Benchmarking.
Done  First Draft of Router Benchmarking Framework
Done  Progress Frame Relay benchmarking terminology draft to AD review.
Done  Methodology for ATM Benchmarking for AD review.
Done  Terminology for ATM ABR Benchmarking for AD review.
Done  Terminology for FIB related Router Performance Benchmarking to AD review.
Done  Firewall Benchmarking Methodology to AD Review
Done  First Draft of Methodology for FIB related Router Performance Benchmarking.
Done  First draft Net Traffic Control Benchmarking Methodology.
Done  Methodology for IP Multicast Benchmarking to AD Review.
Done  Resource Reservation Benchmarking Terminology to AD Review
Done  First I-D on IPsec Device Benchmarking Terminology
Apr 03  Net Traffic Control Benchmarking Terminology to AD Review
Apr 03  Methodology for FIB related Router Performance Benchmarking to AD review.
Done  EGP Convergence Benchmarking Terminology to AD Review
Done  Resource Reservation Benchmarking Methodology to AD Review
Jul 03  Basic BGP Convergence Benchmarking Methodology to AD Review.
Dec 03  Net Traffic Control Benchmarking Methodology to AD Review.
Dec 03  IPsec Device Benchmarking Terminology to AD Review

Internet-Drafts:

  • draft-ietf-bmwg-conterm-06.txt
  • draft-ietf-bmwg-ospfconv-term-10.txt
  • draft-ietf-bmwg-ospfconv-intraarea-10.txt
  • draft-ietf-bmwg-ospfconv-applicability-07.txt
  • draft-ietf-bmwg-ipsec-term-04.txt
  • draft-ietf-bmwg-igp-dataplane-conv-meth-04.txt
  • draft-ietf-bmwg-igp-dataplane-conv-term-04.txt
  • draft-ietf-bmwg-igp-dataplane-conv-app-04.txt
  • draft-ietf-bmwg-acc-bench-term-04.txt
  • draft-ietf-bmwg-acc-bench-meth-01.txt
  • draft-ietf-bmwg-hash-stuffing-01.txt

Request For Comments:

RFC       Status  Title
RFC 1242  I       Benchmarking Terminology for Network Interconnection Devices
RFC 1944  I       Benchmarking Methodology for Network Interconnect Devices
RFC 2285  I       Benchmarking Terminology for LAN Switching Devices
RFC 2432  I       Terminology for IP Multicast Benchmarking
RFC 2544  I       Benchmarking Methodology for Network Interconnect Devices
RFC 2647  I       Benchmarking Terminology for Firewall Performance
RFC 2761  I       Terminology for ATM Benchmarking
RFC 2889  I       Benchmarking Methodology for LAN Switching Devices
RFC 3116  I       Methodology for ATM Benchmarking
RFC 3133  I       Terminology for Frame Relay Benchmarking
RFC 3134  I       Terminology for ATM ABR Benchmarking
RFC 3222  I       Terminology for Forwarding Information Base (FIB) based Router Performance
RFC 3511  I       Benchmarking Methodology for Firewall Performance
RFC 3918  I       Methodology for IP Multicast Benchmarking

(Status: I = Informational)

Current Meeting Report

Benchmarking Methodology WG (bmwg)
Thursday, November 11 at 0900-1130
=====================================

CHAIRS: Kevin Dubray <kdubray@juniper.net>
        Al Morton <acmorton@att.com>

This meeting report was prepared by the chairs, based on detailed minutes
collected by Phil Chimento as official note taker. There were 30 people in
attendance, and Al Morton chaired the meeting.

AGENDA:

1. Working Group Status
-----------------------

Co-chair Al Morton highlighted areas of progress and areas needing
attention. The OSPF Convergence Benchmarking drafts and the EGP convergence
terminology have been approved, and are now in the RFC Editor's queue.
Drafts in Last Call not discussed below include the Resource Reservation
Terminology (new version expected soon for Last Call) and the Diffserv
Traffic Control Terminology (Jeff Dunn and Cynthia Martin volunteered to
assume the editor role with Scott Poretsky, since progress had stalled).
Howard Berkowitz indicated that development on the EGP methodology was
resuming with input from Sue Hares (who has IRTF work in progress on BGP
convergence). The summary of new work items indicated that the Hash and
Stuffing proposal closed without objections, and that the current plan was
for the Network Convergence draft to proceed as an individual submission.
Thomas Eriksson reminded the group of his LDP convergence proposal, and
noted that a new draft was available.

During the meeting, there was a short discussion on future scheduling.
Cynthia Martin pointed out the need for work between meetings, and suggested
another 2.5-hour session at Minneapolis. In addition, we need to avoid
overlap with the IPv6 session in the future, and there was a request not to
meet at 0900 Monday morning.

2. Revised Milestones
---------------------

The milestones have been revised on BMWG's charter page. The chairs will
work with AD David Kessens to add a milestone for new work on Address Hash
and Bit Stuffing.

3. IPsec Terminology Update (M. Kaeo)
-------------------------------------

The editors were distracted during the development and missed the deadline,
but a new version of the IPsec terminology will be done soon, followed by
the initial methodology. Merike said they would like to see some review of
the methodology before a WGLC on the terminology draft.

4. IGP Data Plane Convergence Benchmark I-Ds (S. Poretsky)
----------------------------------------------------------

http://www.ietf.org/internet-drafts/draft-ietf-bmwg-igp-dataplane-conv-term-04.txt
http://www.ietf.org/internet-drafts/draft-ietf-bmwg-igp-dataplane-conv-meth-04.txt
http://www.ietf.org/internet-drafts/draft-ietf-bmwg-igp-dataplane-conv-app-04.txt

Scott described changes from 03 to 04 to address comments raised at the last
meeting. The method should be agnostic to layers below IP, so the preference
for SONET was removed. There were several recent list comments based on some
early implementation experience, and other comments seek to clarify
definitions and the status of equations 1 and 2. Al Morton commented on
methodology draft section 3.2.3, where we should state that forwarding rate
is measured on the next-best interface or the restored interface. Scott will
incorporate these comments in the next revision for another WGLC.
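
To make the data plane approach concrete: with a constant offered load, the
packets lost during the failover translate directly into an outage time.
Below is a minimal sketch of that loss-derived calculation; the function and
counter names are illustrative assumptions, not taken from the draft.

```python
def loss_derived_convergence_time(offered_pps: float,
                                  tx_packets: int,
                                  rx_packets: int) -> float:
    """Estimate convergence time (seconds) from total packet loss.

    At a constant offered load, each second the DUT is not forwarding
    costs `offered_pps` packets, so lost / offered_pps approximates the
    time spent converging onto the next-best (or restored) interface.
    """
    lost = tx_packets - rx_packets
    return lost / offered_pps

# Example: 100,000 pps offered; 30,000,000 sent; 29,750,000 received on
# the next-best interface -> 250,000 lost -> 2.5 s of convergence time.
print(loss_derived_convergence_time(100_000, 30_000_000, 29_750_000))
```
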
5. Techniques for Benchmarking Core Router Accelerated Life Testing
   (S. Poretsky)
--------------------------------------------------------------------

http://www.ietf.cnri.reston.va.us/internet-drafts/draft-ietf-bmwg-acc-bench-term-04.txt
http://www.ietf.cnri.reston.va.us/internet-drafts/draft-ietf-bmwg-acc-bench-meth-01.txt

Scott has added new test cases covering a new EBGP peer and a change in BGP
policy, and has specified SYN flood as the DoS attack in the methodology.
Jeff Dunn suggested a new test case with route flap dampening enabled.
Howard Berkowitz asked if there was an implicit assumption that peers have
the same BGP implementation; there are radical differences in convergence
time across implementations, because the data structures in the routers are
very different and affect the time. This should be noted in the methodology.
George Jones asked whether this document had cases where you throw tons of
traffic at the device and see whether authentication and other management
functions still work. Scott answered yes, and offered to work on these. Al
suggested that the Start-up, Instability, and Recovery Phase slide would be
useful in the draft, and Scott said he would try to add it. Also, with the
methodology still in development, it would be best to wait on a last call
for the terminology draft until there is more stability.

6. New Work on Hash and Stuffing (T. Player)
--------------------------------------------

http://www.ietf.cnri.reston.va.us/internet-drafts/draft-ietf-bmwg-hash-stuffing-01.txt

Timmons gave a brief overview of changes in the 01 version, including a
definition of randomness in bit patterns. The other changes avoid accidental
use of multicast addresses, and include considerations for MPLS labels. This
draft did not have much readership this time, and Al suggested that editors
post interim drafts and avoid the avalanche just before the meeting.
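
For context, the multicast pitfall comes from the individual/group bit: the
least significant bit of a MAC address's first octet marks a group address,
which a switch will flood rather than forward, skewing results. The sketch
below shows one way a tester might randomize addresses while avoiding it;
this is an illustration, not text or code from the draft.

```python
import random

def random_unicast_mac() -> str:
    """Random MAC address that is unicast and locally administered.

    Bit 0 (individual/group) of the first octet is cleared so the
    address is never multicast; bit 1 (universal/local) is set so it
    cannot collide with a vendor-assigned OUI.
    """
    octets = [random.randrange(256) for _ in range(6)]
    octets[0] = (octets[0] & 0xFC) | 0x02  # clear I/G bit, set U/L bit
    return ":".join(f"{o:02x}" for o in octets)

print(random_unicast_mac())  # e.g. "3e:1f:a2:07:9c:d4"
```
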
7. Resumption of Work on the FIB Methodology with New Editors (J. Dunn)
-----------------------------------------------------------------------

http://www.ietf.org/internet-drafts/draft-ietf-bmwg-fib-meth-02.txt
(The tombstone may still be here; watch for updates.)

Jeff described the changes in this new, heavily revised version. Co-editor
Cindy Martin added that this was an expired draft, and that assumptions had
been made in the document about the interactions between the FIB and RIB
which were not true. There was a short discussion of vendors building to the
tests. Scott Bradner commented that it is almost impossible to avoid people
building to the test, so you have to construct the tests so that building to
the test doesn't matter. Bradner was also concerned about the idea of
mimicking traffic, though you can mimic topology. Jeff asked for more input
on test cases, and posited an I-D on RIB-to-FIB convergence. Scott Poretsky
reminded the group that we already have control plane and data plane
convergence work to cover this, and that we cannot approach "white box"
measurements (seconded by others). Howard Berkowitz pointed out that
"graceful" recovery mechanisms mean that routers operate for some time with
a corrupted FIB, and the draft should take this into account (Jeff agreed).
Al Morton asked to consider revising the terminology RFC, since the new
methodology intends to include the effects of route aggregation and the
current terminology indicates that aggregation would be avoided.

8. IPv6 Benchmarking Introduction (J. Dunn)
-------------------------------------------

Jeff gave a short summary of IPv6 concepts and testing requirements. There
was a discussion of the charter, since it does not mention IPv6 explicitly.
Scott Bradner pointed out that BMWG's original charter preceded IPv6 by
about 5 years, and yet there are 10 BMWG I-Ds that mention IPv6. In the
IETF, charters are contracts defining what a WG will work on. David Kessens
supported putting IPv6 explicitly in the charter, to encourage more people
to work on IPv6. Scott Bradner investigated the nature of dual stack testing
with several questions, and it was resolved to test with both IPv4 and IPv6
in a single configuration at the same time. Scott also suggested that the
most efficient way to add IPv6 to the existing work was to write a document
with the new packet types, and then say in the document "add these to the
procedures in 2544": update but don't replace. A discussion of the
hop-by-hop and extension headers followed. In most circumstances, only the
hop-by-hop header would be relevant, because the other extension headers are
processed only by the destination host. Jeff is looking for folks to help
out with this effort.
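
To illustrate the "new packet types, old procedures" idea: a sketch of
building fixed-size IPv6 test frames, with an optional hop-by-hop extension
header, assuming the Scapy packet library. The addresses, frame sizes, and
interface name are placeholders, not anything specified by the WG.

```python
from scapy.all import Ether, IPv6, IPv6ExtHdrHopByHop, UDP, Raw, sendp

def build_ipv6_test_frame(size: int, hop_by_hop: bool = False):
    """One fixed-size IPv6/UDP test frame, optionally carrying the
    hop-by-hop extension header (the only one routers along the path
    must examine; the rest are handled end to end)."""
    frame = Ether() / IPv6(src="2001:db8::1", dst="2001:db8::2")
    if hop_by_hop:
        frame /= IPv6ExtHdrHopByHop()
    frame /= UDP(sport=7, dport=7)
    pad = size - len(frame)          # zero-fill up to the target size
    return frame / Raw(b"\x00" * max(pad, 0))

# Offer a small stream on a test port, per the usual 2544-style set-up.
frames = [build_ipv6_test_frame(128, hop_by_hop=True) for _ in range(1000)]
sendp(frames, iface="eth1", verbose=False)  # interface name is hypothetical
```
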
9. New Work on WLAN Benchmarking Methodology (S. Bradner, D. Stanley)
---------------------------------------------------------------------

Related draft:
http://www.ietf.org/internet-drafts/draft-alexander-wlan-meth-00.txt

Tom Alexander and Scott Bradner prepared a methodology draft for 802.11
devices and systems, and asked the bmwg list if there was interest in this
work. Scott began his presentation saying that substantial interest had been
expressed, but that he was not proposing this draft become a BMWG work item
at this time. There is potential overlap with the IEEE 802.11 Task Group T
(TG T) on Wireless Performance Prediction. However, Tom Alexander is a
member of the task group, and he contacted Scott with the plan to prepare an
Internet-Draft for BMWG. Scott gave a brief overview of the draft, which he
characterized as a first cut. This methodology builds on the existing RFCs
for Network Interconnection Devices and LAN Switching Devices (RFCs 1242,
2544, 2285, and 2889). There are three test set-ups, including a set-up for
wireless clients. One of the goals of the test conditions is to minimize the
radio dependencies. Scott closed his presentation (in much less than 20
minutes) by offering three possibilities: the work could be done in IEEE
802.11, as a cooperative effort between 802.11 and BMWG, or entirely in
BMWG.

Scott introduced Dorothy Stanley as the IEEE 802.11 Liaison to the IETF.
Dorothy presented a slide summarizing all the 802.11 task groups, and
highlighted several (see slides). Task Group T is the IEEE 802.11 group that
has been formed to look at performance; it will produce a recommended
practice. Al Morton asked about the scope of the Wireless Performance
Prediction task group, and Dorothy responded by displaying a passage from
the draft IEEE Project Authorization Request: "The scope of the project is
to provide a set of performance metrics, measurement methodologies, and test
conditions to enable measuring and predicting the performance of 802.11 WLAN
devices and networks at the component and application level." In response to
David Newman's questions, it was clarified that component level refers to
the radio equipment aspects, and that the application level refers to user
applications; the IP layer was not specifically called out. (Document 1157
outlines TG T work, and includes use of user opinion models to interpret
results, including ITU-T Rec. G.107 for voice quality.) David pointed out
that the draft did not correlate 802.11 events with those at the network
layer and above.

The scope did not seem to address re-association time measurement. Dorothy
indicated that the "R" task group had defined this time interval (from the
last packet sent or received on the old access point to the first packet on
the new access point). Dorothy supplied a URL for all IEEE 802.11 documents:
http://www.802wirelessworld.com/index.jsp (you may need to register to
obtain documents; the guest login appears restricted).

David Kessens indicated his hesitancy about taking up this document in the
IETF. Many of the measurements depend on the physical layer, and you may be
mostly measuring the radio. However, the capacity aspects are not radio, and
they are useful information. It is not clear whether the entire draft is
within the BMWG scope, but some of it is.

Bernard Aboba, IEEE 802 Liaison to the IETF, quickly showed three slides to
indicate some of the challenges present in the context of 802.11
measurements and benchmarking. Implementation of rate adaptation and roaming
methods are key points. IEEE has not defined how to do rate adaptation or
even what the results should be. Rate adaptation derives feedback from frame
loss. Bernard's first slide showed RSSI vs. delivery ratio, derived from an
MIT study. There was an enormous amount of scatter: packet loss ratio can be
anything from 20% to 100% under some conditions. They attribute a lot of
this variability to multipath. As a result, rate adaptation can be very
noisy, with lots of weird behavior, and the theoretical curves are very
different. It was noted that the draft addresses tunneling from one access
point to another.

Bernard supplied the following URLs with more background information. The
slides on FER vs. S/N theory and observation are from "Link-Level
Measurements from an 802.11b Mesh Network" by Aguayo, Bicket, Biswas and
Morris of MIT:
http://www.pdos.lcs.mit.edu/roofnet/sigcomm-talk.ppt
http://www.pdos.lcs.mit.edu/roofnet/ExOR-HotNets.ppt
The best citations on roaming interval theory and measurement are the
following presentations to 802.11r by Areg Alimian and Bernard Aboba:
http://www.drizzle.com/~aboba/IEEE/11-04-0377-01-frfh-analysis-roaming-techniques.ppt
http://www.drizzle.com/~aboba/IEEE/11-04-0378-00-roaming-intervals-measurements.ppt

Scott added that repeatability is extremely important in benchmarking. He
had run tests that were repeatable to within about 1% a year later, but
found that the buffer capacity testing in RFC 1944 was not very useful (with
17% variability in measurements, it is difficult to compare devices). When
asked his opinion on the best venue for the work, Bernard gave three key
considerations: Where is the demonstrated expertise? Where are the people
who will do the work? Where is the best community to support the work? Scott
related Tom Alexander's opinion that BMWG expertise was needed to complete
the work, and that they had tried to match the current draft to BMWG's
charter as much as possible. It may be appropriate to split the work,
appoint liaisons, etc. Overlap can be avoided by working out the assignments
among the people directly involved.

David Newman moved the discussion toward the issues that are usually
addressed by BMWG, at the IP level and above. 802.11 roaming events can
cause a transport gap of 10 to 15 seconds; part of this is TCP wake-up.
David added that it would be better to use IP packet length in the draft.
Also, the existing terms of Throughput and Latency (at Throughput Load) were
not so meaningful in the lossy wireless context. Bit error ratio testing was
discussed briefly, but was thought to be out of scope.
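
Returning to the roaming interval defined above (last packet on the old
access point to first packet on the new one): given per-AP capture
timestamps, the benchmark reduces to simple arithmetic. A minimal sketch,
assuming such captures exist; how they are collected is left to the test
set-up.

```python
def roaming_interval(old_ap_times: list[float],
                     new_ap_times: list[float]) -> float:
    """Roaming interval per the 802.11 TGr-style definition: time from
    the last packet seen on the old access point to the first packet
    seen on the new one. Inputs are capture timestamps in seconds.
    """
    return min(new_ap_times) - max(old_ap_times)

# Example: last frame on the old AP at t=12.304 s, first frame on the
# new AP at t=12.871 s -> a 0.567 s roaming interval.
print(roaming_interval([11.9, 12.1, 12.304], [12.871, 12.9]))
```
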
Scott concluded that we need to work out a division of labor on this topic,
and we will hear from the IEEE 802.11 TG T after their meeting (week of
11/15). Al, as chair, asked that both liaisons reflect the comments in this
session back to their respective committees. Specifically, he raised the
question of whether any committee can produce repeatable benchmarks in this
area. Scott said that one goal of the current draft was to minimize the
radio-specific aspects of the testing, and hence minimize the variability of
the results. The Signal to Noise Ratio section (3.5.9) is primarily where
the topic is addressed (there are other key points throughout section 3.5,
Test Conditions). The BMWG would like to know whether the TG T members agree
that the repeatability goal was achieved in Tom and Scott's draft, or what
modifications or additional stipulations would be necessary to achieve it.
Bernard added that a review of the draft could be requested; a complete
review is not being asked for unless the TG T committee decides that BMWG is
the right place for some aspects of the work.

10. "Old" Work Proposal on Protection Switching Methodology (S. Poretsky)
-------------------------------------------------------------------------

Drafts related to this proposal:
MPLS Protection Benchmarking Methodology:
http://www.ietf.org/internet-drafts/draft-poretsky-mpls-protection-meth-03.txt
Automatic Protection Switching Benchmark Terminology (expired):
http://www.watersprings.org/pub/id/draft-kimura-protection-term-02.txt

The WG has discussed this item at several previous meetings. The generic
sub-IP and MPLS proposals can now be re-combined into one using a common
terminology. The methodology was submitted without terminology and used MPLS
terminology; the terminology document was submitted as a silver bullet
covering all types of protection (including SONET, etc.). Now there will be
a single terminology draft and a methodology for MPLS protection
benchmarking. Scott described the recent spike of interest and review
comments based on trial implementation; all of that discussion should be
moved to the bmwg list. About 6 people had read the drafts. Al Morton asked
the folks who had read the drafts to hum if they supported the work (good
hum), and no readers opposed (silence). Al also asked if there was any
enthusiasm among non-readers to review the drafts, based on the presentation
(no response). There was clear support from the folks who have knowledge of
the area. Updates will be submitted in the coming weeks; we will wait to
look at the revised drafts and circulate a work proposal on the list.
Scott's plan is to update the terminology draft with new terms, and to
update the methodology with the new terms and comments.

11. Quick Wrap-up
-----------------

Action items from this meeting:
- Note to Internet-Drafts to reactivate the FIB draft
- Charter update to include IPv6
- Milestone addition for Hash and Stuffing
- Looking for updates on almost every draft
- Awaiting results of the IEEE 802.11 TG T discussions
- New work proposal on Protection Switching on the list

Mailing list archive:
---------------------
http://www.ietf.org/mail-archive/working-groups/bmwg/current/

Current status of WG drafts:
----------------------------

WG Last Call:
<draft-ietf-bmwg-fib-meth-02.txt> New editors, new draft @ IETF-61
<draft-ietf-bmwg-igp-dataplane-conv-term-04.txt> Revised on WG input
<draft-ietf-bmwg-igp-dataplane-conv-meth-04.txt> Revised
<draft-ietf-bmwg-igp-dataplane-conv-app-04.txt> Revised
<draft-ietf-bmwg-dsmterm-09.txt> Call ended w/ editorials, EXPIRED, new editor

I-Ds:
<draft-ietf-bmwg-ipsec-term-03.txt> Draft 7/04, new draft very soon, then LC
<draft-ietf-bmwg-benchres-term-04.txt> (Expired) new draft soon, then LC
<draft-ietf-bmwg-acc-bench-term-04.txt> Revised
<draft-ietf-bmwg-acc-bench-meth-01.txt> Revised on WG input
<draft-ietf-bmwg-hash-stuffing-01.txt> Revised on WG input

Expired BMWG I-Ds:
<draft-ietf-bmwg-bgpbas-01.txt> Authors re-assembling
<draft-ietf-bmwg-benchres-method-00.txt> Pending terminology progress

New work proposals:
- Hash and Stuffing: needs to be added to the charter
- WLAN benchmarking methodology: deferred to IEEE 802.11 TG T
- Protection benchmarking: one proposal; float proposal on the list
- Considerations for Measuring Network Convergence: individual submission

RFC Editor queue:
<draft-ietf-bmwg-conterm-06.txt>
<draft-ietf-bmwg-ospfconv-term-10.txt>
<draft-ietf-bmwg-ospfconv-intraarea-10.txt>
<draft-ietf-bmwg-ospfconv-applicability-07.txt>

New RFC: RFC 3918, which was <draft-ietf-bmwg-mcastm-14.txt>

Slides

Agenda/Status/Milestones
IGP Data Plane Convergence Benchmarking
Accelerated Stress Benchmarking
Hash and Stuffing: Overlooked Factors in Network Device Benchmarking
Methodology for Forwarding Information Base (FIB) based Router Performance
IPv6 in the BMWG
Benchmarking Methodology for Wireless LAN Devices
IEEE 802.11 Summary
Benchmarking Protection Mechanisms