2.4.1 Benchmarking Methodology (bmwg)

NOTE: This charter is a snapshot of the 63rd IETF Meeting in Paris, France. It may now be out-of-date.

Last Modified: 2005-06-27

Chair(s):

Kevin Dubray <kdubray@juniper.net>
Al Morton <acmorton@att.com>

Operations and Management Area Director(s):

Bert Wijnen <bwijnen@lucent.com>
David Kessens <david.kessens@nokia.com>

Operations and Management Area Advisor:

David Kessens <david.kessens@nokia.com>

Mailing Lists:

General Discussion: bmwg@ietf.org
To Subscribe: bmwg-request@ietf.org
In Body: subscribe your_email_address
Archive: http://www.ietf.org/mail-archive/web/bmwg/index.html

Description of Working Group:

The major goal of the Benchmarking Methodology Working Group is to make
a series of recommendations concerning the measurement of the
performance characteristics of various internetworking technologies;
further, these recommendations may focus on the systems or services
that are built from these technologies.

Each recommendation will describe the class of equipment, system, or
service being addressed; discuss the performance characteristics that
are pertinent to that class; clearly identify a set of metrics that aid
in the description of those characteristics; specify the methodologies
required to collect said metrics; and lastly, present the requirements
for the common, unambiguous reporting of benchmarking results.

To better distinguish the BMWG from other measurement initiatives in
the IETF, the scope of the BMWG is limited to technology
characterization using simulated stimuli in a laboratory environment.
Said differently, the BMWG does not attempt to produce benchmarks for
live, operational networks. Moreover, the benchmarks produced by this
WG shall strive to be vendor independent or otherwise have universal
applicability to a given technology class.

Because the demands of a particular technology may vary from
deployment to deployment, a specific non-goal of the Working Group is
to define acceptance criteria or performance requirements.

An ongoing task is to provide a forum for discussion regarding the
advancement of measurements designed to provide insight on the
operation of internetworking technologies.

Goals and Milestones:

Done  Expand the current Ethernet switch benchmarking methodology draft to define the metrics and methodologies particular to the general class of connectionless, LAN switches.
Done  Edit the LAN switch draft to reflect the input from BMWG. Issue a new version of document for comment. If appropriate, ascertain consensus on whether to recommend the draft for consideration as an RFC.
Done  Take controversial components of multicast draft to mailing list for discussion. Incorporate changes to draft and reissue appropriately.
Done  Submit workplan for initiating work on Benchmarking Methodology for LAN Switching Devices.
Done  Submit workplan for continuing work on the Terminology for Cell/Call Benchmarking draft.
Done  Submit initial draft of Benchmarking Methodology for LAN Switches.
Done  Submit Terminology for IP Multicast Benchmarking draft for AD Review.
Done  Submit Benchmarking Terminology for Firewall Performance for AD review
Done  Progress ATM benchmarking terminology draft to AD review.
Done  Submit Benchmarking Methodology for LAN Switching Devices draft for AD review.
Done  Submit first draft of Firewall Benchmarking Methodology.
Done  First Draft of Terminology for FIB related Router Performance Benchmarking.
Done  First Draft of Router Benchmarking Framework
Done  Progress Frame Relay benchmarking terminology draft to AD review.
Done  Methodology for ATM Benchmarking for AD review.
Done  Terminology for ATM ABR Benchmarking for AD review.
Done  Terminology for FIB related Router Performance Benchmarking to AD review.
Done  Firewall Benchmarking Methodology to AD Review
Done  First Draft of Methodology for FIB related Router Performance Benchmarking.
Done  First draft Net Traffic Control Benchmarking Methodology.
Done  Methodology for IP Multicast Benchmarking to AD Review.
Done  Resource Reservation Benchmarking Terminology to AD Review
Done  First I-D on IPsec Device Benchmarking Terminology
Done  EGP Convergence Benchmarking Terminology to AD Review
Done  Resource Reservation Benchmarking Methodology to AD Review
Dec 04  IPsec Device Benchmarking Terminology to AD Review
Apr 05  IGP/Data-Plane Terminology I-D to AD Review
Apr 05  IGP/Data-Plane Methodology and Considerations I-Ds to AD Review
Apr 05  Router Accelerated Test Terminology I-D to AD Review
Apr 05  Net Traffic Control Benchmarking Terminology to AD Review
Apr 05  Methodology for FIB related Router Performance Benchmarking to AD review.
Jul 05  Basic BGP Convergence Benchmarking Methodology to AD Review.
Oct 05  Router Accelerated Test Method. and Considerations I-Ds to AD Review
Oct 05  Hash and Stuffing I-D to AD Review
Dec 05  Net Traffic Control Benchmarking Methodology to AD Review.

Internet-Drafts:

  • draft-ietf-bmwg-dsmterm-11.txt
  • draft-ietf-bmwg-benchres-term-06.txt
  • draft-ietf-bmwg-fib-meth-03.txt
  • draft-ietf-bmwg-ipsec-term-06.txt
  • draft-ietf-bmwg-igp-dataplane-conv-meth-07.txt
  • draft-ietf-bmwg-igp-dataplane-conv-term-07.txt
  • draft-ietf-bmwg-igp-dataplane-conv-app-07.txt
  • draft-ietf-bmwg-acc-bench-term-06.txt
  • draft-ietf-bmwg-acc-bench-meth-03.txt
  • draft-ietf-bmwg-hash-stuffing-03.txt
  • draft-ietf-bmwg-acc-bench-meth-ebgp-00.txt
  • draft-ietf-bmwg-acc-bench-meth-opsec-00.txt

    Request For Comments:

    RFC      Status  Title
    RFC1242 I Benchmarking Terminology for Network Interconnection Devices
    RFC1944 I Benchmarking Methodology for Network Interconnect Devices
    RFC2285 I Benchmarking Terminology for LAN Switching Devices
    RFC2432 I Terminology for IP Multicast Benchmarking
    RFC2544 I Benchmarking Methodology for Network Interconnect Devices
    RFC2647 I Benchmarking Terminology for Firewall Performance
    RFC2761 I Terminology for ATM Benchmarking
    RFC2889 I Benchmarking Methodology for LAN Switching Devices
    RFC3116 I Methodology for ATM Benchmarking
    RFC3133 I Terminology for Frame Relay Benchmarking
    RFC3134 I Terminology for ATM ABR Benchmarking
    RFC3222 I Terminology for Forwarding Information Base (FIB) based Router Performance
    RFC3511 I Benchmarking Methodology for Firewall Performance
    RFC3918 I Methodology for IP Multicast Benchmarking
    RFC4061 I Benchmarking Basic OSPF Single Router Control Plane Convergence
    RFC4062 I OSPF Benchmarking Terminology and Concepts
    RFC4063 I Considerations When Using Basic OSPF Convergence Benchmarks
    RFC4098 I Terminology for Benchmarking BGP Device Convergence in the Control Plane

    Current Meeting Report

    Benchmarking Methodology WG (bmwg)
    TUESDAY, August 2, 1030-1230
    ==============================
    CHAIRS: Kevin Dubray <kdubray@juniper.net>
            Al Morton <acmorton@att.com>

    Approximately 25 people attended the BMWG session at the 63rd IETF.
    Matt Zekauskas volunteered to take raw meeting notes. [ed.: Thanks,
    again(!), Matt!]

    The agenda was offered as:

    0. Agenda
    1. WG Status and Liaison Notes (Chairs, 20 min)
    2. Milestones Status (Chairs, 5 min)
    3. Resource Reservation Terminology I-Ds (Korn, 15 min)
    4. IPsec Benchmarking Terms (and intro to methods) (Kaeo, 15 min)
    5. Core Router Accelerated Life Testing I-Ds (Poretsky, 15 min)
    6. Revised Charter Text (Chairs, 15 min)
    7. New Work Proposals

    0. Agenda

    The agenda was approved as proposed by the chairs. (The chairs'
    presentation, which contains the agenda and work status report, can be
    found in the proceedings; follow the "Agenda/Status/Charter" link.) The
    chairs reserved the right to move some agenda items around in the
    discussion sequence.

    1. WG Status and Liaison Notes

    The chairs presented the WG activity as a function of I-D activity. (A
    detailed accounting can be found in the chairs' slides, slides 3, 4,
    and 5, in the proceedings.) Regarding some specific I-Ds:

    * Network-layer Traffic Control Mechanisms Terms,
      <draft-ietf-bmwg-dsmterm-11.txt>. Working Group Last Call (WGLC) was
      requested, as S. Poretsky believes all nits are addressed. Scott is
      looking for help to start the methodology document.

    * Hash and Stuffing: Overlooked Factors in Network Device Benchmarking,
      <draft-ietf-bmwg-hash-stuffing-03.txt>. The editors are requesting
      WGLC.

    * Benchmarking IGP Data Plane Convergence I-Ds,
      <draft-ietf-bmwg-igp-dataplane-conv-term-07.txt>,
      <draft-ietf-bmwg-igp-dataplane-conv-meth-07.txt>,
      <draft-ietf-bmwg-igp-dataplane-conv-app-07.txt>. While the WGLC
      yielded no direct commentary to the list, it did yield one supporting
      comment from a methodology user, and a comment on the consistency of
      THPT/FwdRate from Timmons Player. There was considerable discussion
      of this off-list, but the principals seem to be closing in on
      satisfactory text. (The chairs remarked on the need to have
      discussions like this on the BMWG mailing list.)

    The BGP convergence methodology draft and the resource reservation
    methodology draft were cited as expired; the former is rumored to have
    been started, and the latter will get started upon approval of its
    terminology peer.

    Since the group last met, four BMWG I-Ds were promoted to RFCs: RFCs
    4061, 4062, and 4063 (OSPF convergence) and RFC 4098 (BGP convergence
    terminology).

    * 802.11T Liaison. The chairs noted that a detailed report on the
      802.11T WG from Tom Alexander was posted to the reflector on 1 Aug
      05. Of particular note, a couple of proposals have been approved;
      these are hoped to be integrated into a draft.

    3. Resource Reservation Terminology I-Ds

    Andras Korn was present to discuss the latest version of "Benchmarking
    Terminology for Resource Reservation Capable Routers,"
    <draft-ietf-bmwg-benchres-term-06.txt>. Andras' slides can be found via
    the "Terms for Resource Res" link in the proceedings.

    Andras described the latest changes in two classes: editorial and
    technically significant. Most editorial changes were to improve
    wording. The technical modifications brought in the concept of loading
    (router and traffic), while expanding the notions of loss (and
    loss-free conditions). Traffic units were rethought. A new section,
    3.5, was added to better address load and scalability limits.
    The principals tried to clarify the complex boundary associated with
    load conditions and loss-free states. The notion of "reasonable time"
    is a difficult concept to define, but it is very important, especially
    in the methodological considerations. Andras noted that this work has
    been going on for over five years; directed input from the group can
    help bring this phase to closure. Al Morton cautioned that when work on
    the methodology starts, care should be taken that "reasonable time"
    does not become an acceptance criterion; the "waiting time" might
    instead be set on the long side, with the time-to-response recorded,
    and so on. Andras countered that while we don't determine acceptance
    standards, the notion of correct behavior is crucial. Eliot Gerber
    noted that assessing "loss-free" conditions is important and questioned
    its uniqueness to this effort. Al added that, given the significant
    changes, another WGLC will be conducted after the summer holiday
    period.

    4. Terms (and Methods) for Benchmarking IPsec Devices (T. Van Herck)

    Slides by Tim Van Herck, Merike Kaeo, and M. Bustos can be found in the
    proceedings under the "IPsec Terms & Method" link. A driving motivation
    was stated to be defining tests that identify implementation limits
    versus, say, interoperability. A methodology document was in the works,
    but input generated by the WGLC on the IPsec terminology document
    demonstrated that reconsideration of the tunnel definitions was
    warranted. In addition, modifications to the terminology document were
    made to better address IPv6 and to extend the scope beyond gateways to
    some host testing.

    Merike explained that while addressing the IPv6 definitions, it became
    necessary to accommodate manual keying. Subsequently, it was decided to
    rethink the tunnel terminology, too, and support for other modes
    (transport and tunnel) was added as well. An informal poll showed that
    about six attendees had read the current I-D. The editors feel that
    Phase 1 rekey frame loss could be supported, but that it's not really
    needed. Similarly, support for IKEv2 isn't a priority: v1 and v2 are
    substantially different, and v2 would cause a substantial rewrite of
    the terminology document. Moreover, because v2 is not backward
    compatible with v1 and addresses different application classes, it may
    be better to address v2 in a separate, follow-on benchmarking document.
    Support for fragmentation will be added, probably limited to IPv4, but
    v6 may be needed, too. Do we need a reassembly metric as well?
    Reporting requirements were tightened to account for Xauth/Modcfg.
    Traffic measurement units have been respecified as frames; some
    discussion of this vis-a-vis frame sizes might be necessary. Again,
    lots of tunnel terminology is being reworked. Merike asked whether it
    is worth measuring "time to first packet": the time for tunnel setup
    plus propagation delay to get a cleartext packet from the ingress port
    to the encrypted egress port. (An illustrative sketch of this figure
    appears after these minutes.)

    Merike reports that the 00 version of the methodology I-D is still
    solidifying. She is expecting a first draft in several weeks, and is
    hoping for substantial feedback from the WG. The editors request that
    the WG give input on target test topologies (currently four are
    defined: single DUT, failover, and IPsec-aware and -unaware scenarios),
    test setup, or otherwise cite improvements or areas of omission. Merike
    asked the chairs whether the editors could submit the terminology and
    methodology documents for WGLC simultaneously in a couple of months.
    Mr. Morton believed that was doable.

    5. Core Router Accelerated Life Testing I-Ds

    The Terminology and Methodology I-Ds have been revised.
    The Methodology has been split into three drafts (general, EBGP, and
    security). The editor, S. Poretsky (in absentia), wishes to test
    readiness for WGLC on the terms and general methodology drafts.
    Specifically, the following I-Ds were addressed:

    * draft-ietf-bmwg-acc-bench-term-06.txt
    * draft-ietf-bmwg-acc-bench-meth-03.txt
    * draft-ietf-bmwg-acc-bench-meth-ebgp-00.txt
    * draft-ietf-bmwg-acc-bench-meth-opsec-00.txt

    Slides for this presentation can be found in the BMWG proceedings under
    the title "Several Drafts". Al presented this topic on Scott's behalf,
    since Scott was unable to attend at the last minute. It was clarified
    that the discussion at the last meeting resulted in a general
    recommendation to split the methodology draft into several functional
    areas (and not just a recommendation from the co-chairs). With many
    first-time BMWG attendees present, no one had read these drafts except
    the chairs. Scott's slide described the two new drafts. The Operational
    EBGP methodology contains procedures for many different scenarios, and
    the Operational Security draft has had some input from the IETF opsec
    WG.

    Al had one specific comment on the new EBGP methodology draft: in
    several places, the Results section gives "expected" levels of packet
    loss and other expectations that may constitute acceptance criteria (as
    currently worded). It would be best to reword these sections to simply
    identify the key performance characteristics, noting the performance
    levels that some devices under test will achieve (zero packet loss).
    For example, in Section 4.2, Results:

    OLD: It is expected that there will be zero packet loss as the DUT
    learns the new routes. Other DUT operation should be stable without
    session loss or sustained packet loss.

    NEW: Key benchmarks will be packet loss and session loss as the DUT
    learns the new routes. Some DUTs will achieve zero packet loss and
    stable operation without session loss.

    Also, it would be good to see some evidence that these benchmarks can
    be used to make meaningful vendor comparisons, through example results
    shared with the working group. It's not mandatory to do this, but those
    who have doubts may find such evidence compelling.

    6. Revised Charter Text - Definition of a Benchmark (Chairs)

    Al introduced the topic and covered several slides giving a proposed
    benchmark definition, its attributes, and its exclusions. There was
    discussion and feedback on several points.

    Regarding the required attributes of a benchmark, Bert Wijnen asked
    whether specific coverage limits could be stated such that a reasonable
    scope would be established; without limits, proposals for electrical
    power outlets could appear! Al responded that the focus was on IETF
    protocols, and Kevin added that this was the original intent, but that
    it expanded to systems that use IETF protocols. The most recent
    questions of BMWG scope were connected with benchmarks for 802.11
    technologies. The current charter says "inter-networking technologies
    and services", and Kevin pointed out that this wording is vague. There
    was agreement to tighten the wording to help decide whether to accept
    or fend off future work proposals, but the challenge is to develop the
    new wording.

    Conformance tests were identified as excluded. Bert voiced his opinion
    that the ban was an IETF community position (rather than something the
    IESG decided).
    Kevin added that there has been pressure to include this sort of
    testing in BMWG (if not there, where?), but Brian Carpenter's message
    to the wgchairs list on conformance testing ("...the IETF doesn't
    measure or certify conformance or inter-operability, because that might
    create legal liability...") makes the IETF boundaries clear.

    Acceptance tests are another exclusion. Andras Korn pointed out that it
    is difficult to avoid specifying some minimum criteria for valid
    benchmarking in the resource reservation benchmarks. Al responded that
    it would be appropriate to specify some prerequisites for measuring
    benchmarks, and that it would be up to the testers to determine whether
    the prerequisites are met. A methodology could specify that devices
    must claim conformance with RFCxxxx, for example. Andras pointed out
    that sometimes a protocol specification does not include all the
    specifications needed for testing (although it should), and additional
    specifications may be needed to provide a complete methodology. Tim Van
    Herck asked what organization the IETF looks to for conformance
    testing, since we cannot do it ourselves. Kevin responded that this is
    something of a catch-22, and that there have been some correctness
    criteria specified in earlier BMWG RFCs; we don't want good performance
    from a broken implementation. Bert added that the IETF should be
    writing protocol specifications that are complete and can be verified.
    The IETF does not take a stance on vendors that lie about conformance.
    Although vendors' customers would like a complete certification process
    (like CableLabs'), such a change would require an IETF-plenary-level
    discussion. It would be good if non-compliant implementations produced
    poor benchmark performance, but the results might favor a non-compliant
    implementation as well. Merike Kaeo suggested that the methodologies
    should strive to make it difficult to cheat. Al closed the discussion,
    saying that a more fleshed-out version of the definition (benefiting
    from today's discussion) will be posted to the list.

    2. Milestone Status

    [ed.: This agenda item was discussed out of sequence relative to the
    proposed agenda.]

    The co-chairs summarized the progress toward meeting WG milestones in
    three categories: Past Due, On-Track, and New. There was clear progress
    on most of the seven Past Due milestones, but the FIB and BGP
    methodologies were exceptions. Three milestones that are still on track
    need attention to progress as planned, and we will need to add two new
    milestones to cover the new Accelerated Stress Methodologies on EBGP
    and Operational Security. We need to revise the milestones and obtain
    AD approval.

    7. New Work Proposals

    * draft-poretsky-protection-term-00.txt
    * draft-poretsky-mpls-protection-meth-04.txt

    Sub-IP Protection terms and a method for MPLS have now been harmonized
    into a single proposal. There was good support for this work at our
    Nov 2004 meeting, but at this meeting no one (other than the chairs)
    had read the drafts. David Kessens indicated his concern about taking
    up new work without the support of the WG, although this particular
    meeting was lightly attended by the WG's most active participants.
    David agreed that the work could be proposed to the list, but that he
    was raising the bar for accepting new work. He wants to see more than
    just interest: people must be willing to contribute, and people must be
    willing to say that they need this work done. If the work is done by
    two people, then it's not IETF work and should be published by the
    individuals.
    8. WG Chairs' Summary and Conclusions

    There are several WG Last Calls to launch after the meeting, including
    those for the benchres, IGP-dataplane, and hash-and-stuffing drafts.
    The IPsec authors are looking for people to review the new methodology,
    and hope to start another last call in a few months. We will also float
    the proposal for new work on the list, with a high bar for acceptance.
    If list members are part of the WG, then participation isn't optional!
    We must have participation to move topics forward, and we encourage
    everyone to find a way they can contribute to the work at hand.
    Otherwise, BMWG will work in other areas with clear sponsorship and
    participation.
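
    As context for the "time to first packet" question raised under item 4,
    that figure is simply the interval from the instant tunnel
    establishment is initiated until the first cleartext test frame offered
    at the ingress port is observed, encrypted, at the egress port. Below
    is a minimal sketch in Python, assuming a tester that records those two
    timestamps; the function and variable names are illustrative
    assumptions, not terms from any BMWG draft.

        # Illustrative sketch only; names are assumptions, not from a BMWG draft.
        # "Time to first packet" here means: elapsed time from initiating tunnel
        # establishment until the first cleartext frame offered at the ingress
        # port is seen, encrypted, at the egress port (tunnel setup time plus
        # forwarding/propagation delay).

        def time_to_first_packet(t_setup_initiated: float,
                                 t_first_frame_at_egress: float) -> float:
            """Return the elapsed time, in seconds, between the two timestamps."""
            return t_first_frame_at_egress - t_setup_initiated

        # Example: setup initiated at t = 10.000 s, first encrypted frame seen
        # at t = 12.350 s  ->  time to first packet = 2.350 s.
        print(round(time_to_first_packet(10.000, 12.350), 3))  # 2.35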

    Slides

    Agenda/Status/Charter
    Terms for Resource Res
    IPSec Terms & Method
    Several Drafts