2 title: Multiple Loss Ratio Search for Packet Throughput (MLRsearch)
3 abbrev: Multiple Loss Ratio Search
4 docname: draft-ietf-bmwg-mlrsearch-02
9 wg: Benchmarking Working Group
14 pi: # can use array (if all yes) or hash here
16 sortrefs: # defaults to yes
21 ins: M. Konstantynowicz
22 name: Maciek Konstantynowicz
25 email: mkonstan@cisco.com
30 email: vrpolak@cisco.com
37 target: https://s3-docs.fd.io/csit/rls2110/report/introduction/methodology_data_plane_throughput/methodology_data_plane_throughput.html#mlrsearch-tests
38 title: "FD.io CSIT Test Methodology - MLRsearch"
41 target: https://pypi.org/project/MLRsearch/0.4.0/
42 title: "MLRsearch 0.4.0, Python Package Index"
47 TODO: Update after all sections are ready.
49 This document proposes changes to [RFC2544], specifically to packet
50 throughput search methodology, by defining a new search algorithm
51 referred to as Multiple Loss Ratio search (MLRsearch for short). Instead
of relying on binary search with a pre-set starting offered load, it
proposes a novel approach that discovers the starting point in the initial
phase, and then searches for packet throughput based on defined packet
loss ratio (PLR) input criteria and a defined final trial duration.
One of the key design principles behind MLRsearch is minimizing the
total test duration and searching for multiple packet throughput rates
(each with a corresponding PLR) concurrently, instead of doing it
sequentially.
61 The main motivation behind MLRsearch is the new set of challenges and
62 requirements posed by NFV (Network Function Virtualization),
63 specifically software based implementations of NFV data planes. Using
[RFC2544], in the authors' experience, often yields results that are
neither repeatable nor replicable, due to a large number of factors that
are out of scope for this draft. MLRsearch aims to address this challenge
in a simple way: getting the same result sooner, so that more repetitions
can be done to characterize the repeatability.
73 As we use kramdown to convert from markdown,
74 we use this way of marking comments not to be visible in rendered draft.
75 https://stackoverflow.com/a/42323390
If another engine is used, convert to this way:
77 https://stackoverflow.com/a/20885980
82 TODO: Update after most other sections are updated.
85 The following is probably not needed (or defined elsewhere).
87 * Frame size: size of an Ethernet Layer-2 frame on the wire, including
88 any VLAN tags (dot1q, dot1ad) and Ethernet FCS, but excluding Ethernet
89 preamble and inter-frame gap. Measured in bytes (octets).
90 * Packet size: same as frame size, both terms used interchangeably.
* Device Under Test (DUT): In software networking, "device" denotes a
specific piece of software tasked with packet processing. Such a device
is surrounded by other software components (such as the operating system
kernel). It is not possible to run devices without also running the
other components, and hardware resources are shared between both. For
the purposes of testing, the whole set of hardware and software components
is called the "system under test" (SUT). As the SUT is the part of the whole
test setup whose performance can be measured by [RFC2544] methods,
this document uses SUT instead of the [RFC2544] DUT. The device under test
(DUT) can be re-introduced when analysing test results using whitebox
techniques, but this document sticks to blackbox testing.
102 * System Under Test (SUT): System under test (SUT) is a part of the
103 whole test setup whose performance is to be benchmarked. The complete
104 test setup contains other parts, whose performance is either already
105 established, or not affecting the benchmarking result.
106 * Bi-directional throughput tests: involve packets/frames flowing in
107 both transmit and receive directions over every tested interface of
108 SUT/DUT. Packet flow metrics are measured per direction, and can be
109 reported as aggregate for both directions and/or separately
110 for each measured direction. In most cases bi-directional tests
111 use the same (symmetric) load in both directions.
112 * Uni-directional throughput tests: involve packets/frames flowing in
113 only one direction, i.e. either transmit or receive direction, over
114 every tested interface of SUT/DUT. Packet flow metrics are measured
and reported for the measured direction.
116 * Packet Throughput Rate: maximum packet offered load DUT/SUT forwards
117 within the specified Packet Loss Ratio (PLR). In many cases the rate
118 depends on the frame size processed by DUT/SUT. Hence packet
119 throughput rate MUST be quoted with specific frame size as received by
120 DUT/SUT during the measurement. For bi-directional tests, packet
121 throughput rate should be reported as aggregate for both directions.
122 Measured in packets-per-second (pps) or frames-per-second (fps),
124 * Bandwidth Throughput Rate: a secondary metric calculated from packet
125 throughput rate using formula: bw_rate = pkt_rate * (frame_size +
126 L1_overhead) * 8, where L1_overhead for Ethernet includes preamble (8
127 octets) and inter-frame gap (12 octets). For bi-directional tests,
128 bandwidth throughput rate should be reported as aggregate for both
129 directions. Expressed in bits-per-second (bps).
130 * TODO do we need this as it is identical to RFC2544 Throughput?
131 Non Drop Rate (NDR): maximum packet/bandwidth throughput rate sustained
132 by DUT/SUT at PLR equal zero (zero packet loss) specific to tested
133 frame size(s). MUST be quoted with specific packet size as received by
134 DUT/SUT during the measurement. Packet NDR measured in
135 packets-per-second (or fps), bandwidth NDR expressed in
136 bits-per-second (bps).
137 * TODO if needed, reformulate to make it clear there can be multiple rates
138 for multiple (non-zero) loss ratios.
Partial Drop Rate (PDR): maximum packet/bandwidth throughput rate
140 sustained by DUT/SUT at PLR greater than zero (non-zero packet loss)
141 specific to tested frame size(s). MUST be quoted with specific packet
142 size as received by DUT/SUT during the measurement. Packet PDR
143 measured in packets-per-second (or fps), bandwidth PDR expressed in
144 bits-per-second (bps).
145 * TODO: Refer to FRMOL instead.
146 Maximum Receive Rate (MRR): packet/bandwidth rate regardless of PLR
147 sustained by DUT/SUT under specified Maximum Transmit Rate (MTR)
148 packet load offered by traffic generator. MUST be quoted with both
149 specific packet size and MTR as received by DUT/SUT during the
150 measurement. Packet MRR measured in packets-per-second (or fps),
151 bandwidth MRR expressed in bits-per-second (bps).
152 * TODO just keep using "trial measurement"?
153 Trial: a single measurement step. See [RFC2544] section 23.
154 * TODO already defined in RFC2544:
155 Trial duration: amount of time over which packets are transmitted
156 in a single measurement step.
161 * TODO: The current text uses Throughput for the zero loss ratio load.
162 Is the capital T needed/useful?
163 * DUT and SUT: see the definitions in https://gerrit.fd.io/r/c/csit/+/35545
164 * Traffic Generator (TG) and Traffic Analyzer (TA): see
165 https://datatracker.ietf.org/doc/html/rfc6894#section-4
166 TODO: Maybe there is an earlier RFC?
* Overall search time: the time it takes to find all required loads within
their precision goals, starting from zero trials measured at a given
DUT configuration and traffic profile.
170 * TODO: traffic profile?
171 * Intended load: https://datatracker.ietf.org/doc/html/rfc2285#section-3.5.1
172 * Offered load: https://datatracker.ietf.org/doc/html/rfc2285#section-3.5.2
173 * Maximum offered load (MOL): see
174 https://datatracker.ietf.org/doc/html/rfc2285#section-3.5.3
175 * Forwarding rate at maximum offered load (FRMOL)
176 https://datatracker.ietf.org/doc/html/rfc2285#section-3.6.2
* Trial Loss Count: the number of frames transmitted
minus the number of frames received. A negative count is possible,
e.g. when the SUT duplicates some frames.
* Trial Loss Ratio: the ratio of frames lost relative to frames
transmitted over the trial duration.
182 For bi-directional throughput tests, the aggregate ratio is calculated,
183 based on the aggregate number of frames transmitted and received.
184 If the trial loss count is negative, its absolute value MUST be used
185 to keep compliance with RFC2544.
* Safe load: any load value such that a trial measurement at this (or any
lower) intended load is correctly handled by both TG and TA, regardless of SUT behavior.
Frequently, it is not known what the safe load is.
189 * Max load (TODO rename?): Maximal intended load to be used during search.
190 Benchmarking team decides which value is low enough
191 to guarantee values reported by TG and TA are reliable.
192 It has to be a safe load, but it can be lower than a safe load estimate
194 See the subsection on unreliable test equipment below.
195 This value MUST NOT be higher than MOL, which itself MUST NOT
196 be higher than Maximum Frame Rate
197 https://datatracker.ietf.org/doc/html/rfc2544#section-20
* Min load: Minimal intended load to be used during the search.
Benchmarking team decides which value is high enough
to guarantee the trial measurement results are valid.
E.g. considerable overall search time can be saved by declaring the SUT
faulty if a min load trial shows too high a loss ratio.
Zero frames per second is a valid min load value.
* Effective loss ratio: a corrected value of the trial loss ratio,
chosen to avoid difficulties if the SUT exhibits a decreasing loss ratio
with increasing load. It is the maximum of the trial loss ratios
measured at the same duration on all loads smaller than (and including)
the load in question.
* Target loss ratio: a loss ratio value acting as an input for the search.
The search finds tight enough lower and upper bounds in intended load,
so that the measurement at the lower bound has a smaller or equal
trial loss ratio, and the one at the upper bound has a strictly larger trial loss ratio.
For the tightest upper bound, the effective loss ratio is the same as the
trial loss ratio at that upper bound load.
For the tightest lower bound, the effective loss ratio can be higher
than the trial loss ratio at that lower bound, but still not larger
than the target loss ratio.
218 * TODO: Search algorithm.
219 * TODO: Precision goal.
220 * TODO: Define a "benchmarking group".
221 * TODO: Upper and lower bound.
222 * TODO: Valid and invalid bound?
223 * TODO: Interval and interval width?
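To make the effective loss ratio definition above concrete, here is a minimal
Python sketch (the data structure and function name are illustrative, not taken
from any MLRsearch implementation), assuming one trial per load, all trials at
the same duration:

```python
def effective_loss_ratio(trial_loss_ratios, load):
    """Effective loss ratio at a load: the maximum of the trial loss ratios
    measured (at the same trial duration) on all loads up to and including
    the given load.  trial_loss_ratios maps intended load -> trial loss ratio."""
    return max(
        ratio
        for measured_load, ratio in trial_loss_ratios.items()
        if measured_load <= load
    )

# A SUT whose raw loss ratio decreases between 1 Mfps and 2 Mfps:
ratios = {1e6: 0.002, 2e6: 0.001, 3e6: 0.003}
print(effective_loss_ratio(ratios, 2e6))  # 0.002, not the raw 0.001
```

The correction makes the loss ratio non-decreasing in load, which keeps
the notions of lower and upper bound well defined.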
225 TODO: Mention NIC/PCI bandwidth/pps limits can be lower than bandwidth of medium.
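As an illustration of the bandwidth throughput rate formula from the
terminology above (a sketch; the constant and function names are ours):

```python
# Ethernet L1 overhead per frame: 8-octet preamble + 12-octet inter-frame gap.
L1_OVERHEAD_OCTETS = 8 + 12

def bandwidth_rate_bps(pkt_rate_pps, frame_size_octets):
    """bw_rate = pkt_rate * (frame_size + L1_overhead) * 8, in bits per second."""
    return pkt_rate_pps * (frame_size_octets + L1_OVERHEAD_OCTETS) * 8

# 14,880,952 pps of 64-octet frames is approximately 10 GbE line rate:
print(bandwidth_rate_bps(14_880_952, 64))  # 9999999744, i.e. ~10 Gbps
```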
227 # Intentions of this document
230 Instead of talking about DUTs being non-deterministic
231 and vendors "gaming" in order to get better Throughput results,
232 Maciek and Vratko currently prefer to talk about result repeatability.
235 The intention of this document is to provide recommendations for:
* optimizing the search for multiple target loss ratios at once,
* speeding up the overall search time,
* improving search result repeatability and comparability.
240 No part of RFC2544 is intended to be obsoleted by this document.
243 This document may contain examples which contradict RFC2544 requirements
That is not an encouragement for benchmarking groups
246 to stop being compliant with RFC2544.
253 It is useful to restate the key requirements of RFC2544
254 using the new terminology (see section Terminology).
256 The following sections of RFC2544 are of interest for this document.
258 * https://datatracker.ietf.org/doc/html/rfc2544#section-20
Mentions that the max load SHOULD NOT be larger than the theoretical
maximum rate for the frame size on the media.
262 * https://datatracker.ietf.org/doc/html/rfc2544#section-23
Lists the actions to be performed for each trial measurement;
it also mentions loss rate as an example of trial measurement results.
This document uses loss count instead, as that is the quantity
that is easier for current test equipment to measure,
e.g. it is not affected by the real traffic duration.
268 TODO: Time uncertainty again.
270 * https://datatracker.ietf.org/doc/html/rfc2544#section-24
271 Mentions "full length trials" leading to the Throughput found,
272 as opposed to shorter trial durations, allowed in an attempt
273 to "minimize the length of search procedure".
This document talks about "final trial duration" and aims to
"optimize the overall search time".
277 * https://datatracker.ietf.org/doc/html/rfc2544#section-26.1
278 with https://www.rfc-editor.org/errata/eid422
finally states requirements for the search procedure.
280 It boils down to "increase intended load upon zero trial loss
281 and decrease intended load upon non-zero trial loss".
No additional constraints are placed on the load selection,
and there is no mention of an exit condition, e.g. when there are enough
trial measurements to proclaim the largest load with zero trial loss
(and final trial duration) to be the Throughput found.
289 The following section is probably not useful enough.
291 ## Generalized search
293 Note that the Throughput search can be restated as a "conditional
294 load search" with a specific condition.
296 "increase intended load upon trial result satisfying the condition
297 and decrease intended load upon trial result not satisfying the condition"
298 where the Throughput condition is "trial loss count is zero".
300 This works for any condition that can be evaluated from a single
301 trial measurement result, and is likely to be true at low loads
302 and false at high loads.
MLRsearch can incorporate multiple different conditions,
as long as there is a total logical ordering between them
(e.g. if a condition for a target loss ratio is not satisfied,
it is also not satisfied for any other condition which uses
a smaller target loss ratio).
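For illustration, with the usual condition "trial loss ratio does not exceed
the target loss ratio", the ordering can be sketched in Python (the function
name is ours, not from any implementation):

```python
def satisfies(trial_loss_ratio, target_loss_ratio):
    """The per-target condition: trial loss is within the target."""
    return trial_loss_ratio <= target_loss_ratio

# Monotonicity: satisfying the condition for one target implies satisfying it
# for every larger (looser) target, so the conditions are totally ordered.
targets = [0.0, 0.005, 0.01]  # strictest first
print([satisfies(0.003, t) for t in targets])  # [False, True, True]
```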
310 TODO: How to call a "load associated with this particular condition"?
TODO: Not sure if this subsection is needed and where.
There is one obvious and simple search algorithm which conforms
to the Throughput search requirements: simple bisection.
322 Input: target precision, in frames per second.
1. Choose min load to be zero.
   1. No need to measure, the loss count has to be zero.
   2. Use the zero load as the current lower bound.
2. Choose max load to be the maximum value allowed by the bandwidth of the medium.
   1. Perform a trial measurement (at the full length duration) at max load.
   2. If there is zero trial loss count, return max load as the Throughput.
   3. Use max load as the current upper bound.
3. Repeat until the difference between the lower bound and the upper bound is
smaller than or equal to the precision goal.
   1. If the difference is not larger, return the current lower bound as the Throughput.
   2. Else: choose the new load as the arithmetic average of the lower and upper bound.
   3. Perform a trial measurement (at the full length duration) at this load.
   4. If the trial loss count is zero, consider the load the new lower bound.
   5. Else, consider the load the new upper bound.
   6. Jump back to the start of step 3.
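The steps above can be sketched in Python as follows; `measure` is a
hypothetical stand-in for a full-length trial measurement returning the trial
loss count, not part of any real traffic generator API:

```python
def bisection_throughput(measure, max_load, precision_goal):
    """Simple bisection search for the Throughput, following the steps above.

    measure(load): perform a full-length trial at the intended load [fps]
    and return the trial loss count.
    """
    lower_bound = 0.0  # step 1: zero load, loss count has to be zero
    if measure(max_load) == 0:
        return max_load  # step 2: zero loss at max load
    upper_bound = max_load
    while upper_bound - lower_bound > precision_goal:  # step 3
        load = (lower_bound + upper_bound) / 2.0  # arithmetic average
        if measure(load) == 0:
            lower_bound = load  # zero loss: new lower bound
        else:
            upper_bound = load  # non-zero loss: new upper bound
    return lower_bound

# Example with a deterministic SUT model that starts losing frames above 5.2 Mfps:
found = bisection_throughput(lambda load: max(0, int(load - 5.2e6)), 10e6, 1e4)
print(found)
```

For such a deterministic SUT model, the returned lower bound lands within
the precision goal below the true loss threshold.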
Another possible stop condition is the overall search time so far,
but that is not really a different condition, as the time for the search to reach
the precision goal is just a function of the precision goal, the trial duration,
and the difference between max and min load.
While this algorithm can be adapted to search for multiple
target loss ratios "at the same time" (see below),
it is still missing multiple improvements which give MLRsearch
considerably better overall search time in practice.
356 ## Repeatability and Comparability
RFC2544 does not suggest repeating the Throughput search,
{::comment}probably because the full set of tests already takes long{:/comment}
and from just one Throughput value, it cannot be determined
how repeatable that value is (how likely it is for a repeated Throughput search
to end up with a value less than the precision goal away from the first value).
364 Depending on SUT behavior, different benchmark groups
can report significantly different Throughput values,
366 even when using identical SUT and test equipment,
367 just because of minor differences in their search algorithm
368 (e.g. different max load value).
370 While repeatability can be addressed by repeating the search several times,
371 the differences in the comparability scenario may be systematic,
372 e.g. seeming like a bias in one or both benchmark groups.
374 MLRsearch algorithm does not really help with the repeatability problem.
This document RECOMMENDS repeating a selection of "important" tests
ten times, so users can ascertain the repeatability of the results.
378 TODO: How to report? Average and standard deviation?
Following the MLRsearch algorithm leaves the benchmark groups less freedom
to encounter the comparability problem,
although more research is needed to determine the effect
of MLRsearch's tweakable parameters.
386 Possibly, the old DUTs were quite sharply consistent in their performance,
and/or precision goals were quite large in order to save overall search time.
389 With software DUTs and with time-efficient search algorithms,
nowadays the repeatability of Throughput can be quite low,
as the standard deviation of repeated Throughput results
is considerably higher than the precision goal.
396 TODO: Unify with PLRsearch draft.
397 TODO: No-loss region, random region, lossy region.
398 TODO: Tweaks with respect to non-zero loss ratio goal.
399 TODO: Duration dependence?
Both RFC2544 and MLRsearch return a Throughput somewhere inside the random region,
or at most the precision goal below it.
406 TODO: Make sure this is covered elsewhere, then delete.
408 ## Search repeatability
The goal of RFC1242 and RFC2544 is to limit how vendors benchmark their DUTs,
in order to force them to report values that have a higher chance
of being confirmed by independent benchmarking groups following the same RFCs.
414 This works well for deterministic DUTs.
416 But for non-deterministic DUTs, the RFC2544 Throughput value
417 is only guaranteed to fall somewhere below the lossy region (TODO define).
418 It is possible to arrive at a value positioned likely high in the random region
419 at the cost of increased overall search duration,
420 simply by lowering the load by very small amounts (instead of exact halving)
421 upon lossy trial and increasing by large amounts upon lossless trial.
423 Prescribing an exact search algorithm (bisection or MLRsearch or other)
424 will force vendors to report less "gamey" Throughput values.
The following two sections are probably out of scope,
as they do not affect MLRsearch design choices.
433 ### Direct and inverse measurements
TODO expand: Direct measurement is a single trial measurement,
with prescribed inputs, and outputs turned directly into the quantity of interest
438 Latency https://datatracker.ietf.org/doc/html/rfc2544#section-26.2
439 is a single direct measurement.
440 Frame loss rate https://datatracker.ietf.org/doc/html/rfc2544#section-26.3
441 is a sequence of direct measurements.
TODO expand: Indirect measurement aims to solve an "inverse function problem",
meaning (a part of) the trial measurement output is prescribed, and the quantity
of interest is (derived from) the input parameters of the trial measurement
that achieve the prescribed output.
In general this is a hard problem, but if the unknown input parameter
is just a one-dimensional quantity, algorithms such as bisection
do converge regardless of the outputs seen.
We call any such algorithm examining a one-dimensional input a "search".
Of course, some exit condition is needed for the search to end.
In the case of Throughput, the bisection algorithm tracks both an upper bound
and a lower bound, with the lower bound at the end of the search being the quantity
satisfying the definition of Throughput.
456 ### Metrics other than frames
TODO expand: A small TCP transaction can succeed even if some frames are lost.
TODO expand: It is possible for the loss ratio to use a different metric than the load.
E.g. a pps loss ratio when the traffic profile uses higher level transactions per second.
463 ### TODO: Stateful DUT
465 ### TODO: Stateful traffic
468 ## Non-Zero Target Loss Ratios
470 https://datatracker.ietf.org/doc/html/rfc1242#section-3.17
471 defines Throughput as:
472 The maximum rate at which none of the offered frames
473 are dropped by the device.
476 Since even the loss of one frame in a
477 data stream can cause significant delays while
478 waiting for the higher level protocols to time out,
479 it is useful to know the actual maximum data
480 rate that the device can support.
484 While this may still be true for some protocols,
485 research has been performed...
487 TODO: Add this link properly: https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-Y.1541-201112-I!!PDF-E&type=items
488 TODO: List values from that document, from 10^-3 to 4*10^-6.
...on other protocols and use cases,
resulting in some small but non-zero loss ratios being considered
acceptable. Unfortunately, the acceptable value depends on the use case
and properties such as TCP window size and round trip time,
so no single value of target loss ratio (other than zero)
is considered to be universally applicable.
499 New "software DUTs" (traffic forwarding programs running on
500 commercial-off-the-shelf compute server hardware) frequently exhibit quite
501 low repeatability of Throughput results per above definition.
This is due to, in general, the throughput rates of software DUTs (programs)
being sensitive to server resource allocation by the OS during runtime,
as well as to any interrupts or blocking of the software threads involved
in packet processing.
To deal with this, this document recommends discovering multiple throughput
rates of interest for software DUTs that run on general purpose COTS servers
(with x86, AArch64 Instruction Set Architectures):
* throughput rate with a target of zero packet loss ratio,
* at least one throughput rate with a target of non-zero packet loss ratio.
In our experience, the higher the target loss ratio is,
the better the repeatability of the corresponding load found.
TODO: Define a good name for a load corresponding to a specific non-zero
target loss ratio, while keeping Throughput for the load corresponding
to zero target loss ratio.
This document RECOMMENDS that benchmark groups search for the loads corresponding
to at least one non-zero target loss ratio.
This document does not suggest any particular non-zero target loss ratio value
to search the corresponding load for.
526 What is worse, some benchmark groups (which groups?; citation needed)
527 started reporting loads that achieved only "approximate zero loss",
528 while still calling that a Throughput (and thus becoming non-compliant
This document gives several independent ideas on how to lower the (average)
overall search time, while remaining unconditionally compliant with RFC2544
(and adding some extensions).
538 This document also specifies one particular way to combine all the ideas
539 into a single search algorithm class (single logic with few tweakable parameters).
Little to no research has been done into the question of which combination
of ideas achieves the best compromise with respect to overall search time,
high repeatability and high comparability.
545 TODO: How important it is to discuss particular implementation choices,
546 especially when motivated by non-deterministic SUT behavior?
548 ## Short duration trials
550 https://datatracker.ietf.org/doc/html/rfc2544#section-24
already mentions the possibility of using a shorter duration
for trials that are not part of the "final determination".
Obviously, the upper and lower bound from a shorter duration trial
can be used as the initial upper and lower bound for the final determination.
MLRsearch makes it clear that a re-measurement is always needed
(a new trial measurement with the same load but a longer duration).
It also specifies what to do if the longer trial is no longer a valid bound
(TODO define?), e.g. start an external search.
Additionally, one halving can be saved during the shorter duration search.
563 ## FRMOL as reasonable start
TODO expand: The overall search ends with the "final determination" search,
preceded by the "shorter duration search", preceded by "bound initialization",
where the bounds can be considerably different from min and max load.
For SUTs with high repeatability, the FRMOL is usually a good approximation
of the Throughput. But for less repeatable SUTs, the forwarding rate (TODO define)
is frequently a bad approximation of the Throughput, therefore halving
and other robust-to-worst-case approaches have to be used.
Still, the forwarding rate at the FRMOL load can be a good initial bound.
575 ## Non-zero loss ratios
577 See the "Popularity of non-zero target loss ratios" section above.
579 TODO: Define "trial measurement result classification criteria",
580 or keep reusing long phrases without definitions?
A search for a load corresponding to a non-zero target loss ratio
is very similar to a search for Throughput;
just the criterion for when to increase or decrease the intended load
for the next trial measurement uses a comparison of the trial loss ratio
to the target loss ratio (instead of comparing the loss count to zero).
Any search algorithm that works for Throughput can easily be used also for
non-zero target loss ratios, perhaps with small modifications
in places where the measured forwarding rate is used.
591 Note that it is possible to search for multiple loss ratio goals if needed.
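The modified criterion can be sketched as follows (the function name and
return values are ours, for illustration only):

```python
def next_load_direction(trial_loss_ratio, target_loss_ratio=0.0):
    """Load selection criterion generalized to a non-zero target loss ratio.

    The RFC2544 Throughput search is the special case target_loss_ratio == 0.0."""
    if trial_loss_ratio <= target_loss_ratio:
        return "increase"  # the measured load acts as a lower bound
    return "decrease"      # the measured load acts as an upper bound

print(next_load_direction(0.002))         # decrease: any loss fails the zero-loss target
print(next_load_direction(0.002, 0.005))  # increase: within the 0.5% target
```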
593 ## Concurrent ratio search
595 A single trial measurement result can act as an upper bound for a lower
596 target loss ratio, and as a lower bound for a higher target loss ratio
at the same time. This is an example of how
it can be advantageous to search for all loss ratio goals "at once",
or at least to "reuse" the trial measurement results obtained so far.
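A sketch of this reuse (hypothetical names; targets given as plain loss
ratios, one trial result classified against all of them at once):

```python
def classify_against_targets(trial_loss_ratio, target_loss_ratios):
    """Classify one trial result against several target loss ratios at once.

    For each target, the measured load acts as a lower bound when the trial
    loss ratio is within the target, and as an upper bound otherwise."""
    return {
        target: "lower bound" if trial_loss_ratio <= target else "upper bound"
        for target in target_loss_ratios
    }

# One trial with 0.3% loss bounds two searches at the same time:
print(classify_against_targets(0.003, [0.0, 0.005]))
# {0.0: 'upper bound', 0.005: 'lower bound'}
```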
601 Even when a search algorithm is fully deterministic in load selection
602 while focusing on a single loss ratio and trial duration,
603 the choice of iteration order between target loss ratios and trial durations
604 can affect the obtained results in subtle ways.
605 MLRsearch offers one particular ordering.
608 It is not clear if the current ordering is "best",
609 it is not even clear how to measure how good an ordering is.
610 We would need several models for bad SUT behaviors,
611 bug-free implementations of different orderings,
612 simulator to show the distribution of rates found,
613 distribution of overall durations,
614 and a criterion of which rate distribution is "bad"
615 and whether it is worth the time saved.
620 ## Load selection heuristics and shortcuts
Aside from the two heuristics already mentioned (FRMOL based initial bounds
and saving one halving when increasing trial duration),
there are other tricks that can save some overall search time
at the cost of keeping the difference between the final lower and upper bound
intentionally large (but still within the precision goal).
TODO: Refer to implementation subsections on:
630 * Rounding the interval width up.
631 * Using old invalid bounds for interval width guessing.
633 The impact on overall duration is probably small,
634 and the effect on result distribution maybe even smaller.
635 TODO: Is the two-liner above useful at all?
637 # Non-compliance with RFC2544
639 It is possible to achieve even faster search times by abandoning
640 some requirements and suggestions of RFC2544,
mainly by reducing the wait times at the start and end of each trial.
643 Such results are therefore no longer compliant with RFC2544
644 (or at least not unconditionally),
645 but they may still be useful for internal usage, or for comparing
646 results of different DUTs achieved with an identical non-compliant algorithm.
648 TODO: Refer to the subsection with CSIT customizations.
650 # Additional Requirements
652 RFC2544 can be understood as having a number of implicit requirements.
653 They are made explicit in this section
654 (as requirements for this document, not for RFC2544).
656 Recommendations on how to properly address the implicit requirements
657 are out of scope of this document.
661 Although some (insufficient) ideas are proposed.
665 ## TODO: Search Stop Criteria
667 TODO: Mention the timeout parameter?
671 TODO: highlight importance of results consistency
672 for SUT performance trending and anomaly detection.
676 ## Reliability of Test Equipment
678 Both TG and TA MUST be able to handle correctly
679 every intended load used during the search.
681 On TG side, the difference between Intended Load and Offered Load
684 TODO: How small? Difference of one packet may not be measurable
685 due to time uncertainties.
689 Maciek: 1 packet out of 10M, that's 10**-7 accuracy.
691 Vratko: For example, TRex uses several "worker" threads, each doing its own
692 rounding on how many packets to send, separately per each traffic stream.
693 For high loads and durations, the observed number of frames transmitted
694 can differ from the expected (fractional) value by tens of frames.
698 TODO expand: time uncertainty.
To ensure that, the max load (see Terminology) has to be set to a low enough value.
701 Benchmark groups MAY list the max load value used,
702 especially if the Throughput value is equal (or close) to the max load.
706 The following is probably out of scope of this document,
707 but can be useful when put into a separate document.
TODO expand: If it results in a smaller Throughput being reported,
it is not a big issue. Treat similarly to the bandwidth and PPS limits of NICs.
712 TODO expand: TA dropping packets when loaded only lowers Throughput,
TODO expand: TG sending fewer packets but stopping at the target duration
is also fine, as long as the forwarding rate is used as the Throughput value,
not the higher intended load.
719 TODO expand: Duration stretching is not fine.
720 Neither "check for actual duration" nor "start+sleep+stop"
721 are reliable solutions due to time overheads and uncertainty
722 of TG starting/stopping traffic (and TA stopping counting packets).
726 Solutions (even problem formulations) for the following open problems
727 are outside of the scope of this document:
728 * Detecting when the test equipment operates above its safe load.
729 * Finding a large but safe load value.
730 * Correcting any result affected by max load value not being a safe load.
734 TODO: Mention 90% of self-test as an idea:
735 https://datatracker.ietf.org/doc/html/rfc8219#section-9.2.1
737 This is pointing to DNS testing, nothing to do with throughput,
738 so how is it relevant here?
744 Part of discussion on BMWG mailing list (with small edits):
746 This is a hard issue.
747 The algorithm as described has no way of knowing
748 which part of the whole system is limiting the performance.
750 It could be SUT only (no problem, testing SUT as expected),
751 it could be TG only (can be mitigated by TG self-test
752 and using small enough loads).
754 But it could also be an interaction between DUT and TG.
Imagine a TG whose Traffic Analyzer part is only able
to handle incoming traffic up to some rate,
but which passes the self-test, as the Generator part has a maximal rate
not larger than that. But what if the SUT turns that steady rate
into long-enough bursts of a higher rate (with delays between bursts
large enough, so the average forwarding rate matches the load)?
This way the TA will see some packets as missing (when its buffers
fill up), even though the SUT has processed them correctly.
In CSIT we are aggressive at skipping all wait times around trials,
but few DUTs have large enough buffers.
Or there is another reason why we are seeing negative loss counts.
RFC2544 requires quite conservative time delays
(see https://datatracker.ietf.org/doc/html/rfc2544#section-23)
to prevent frames buffered in one trial measurement
from being counted as received in a subsequent trial measurement.

However, for some SUTs it may still be possible to buffer enough frames
that they are still being sent (perhaps in bursts)
when the next trial measurement starts.
Sometimes this can be detected as a negative trial loss count, i.e. the TA receiving
more frames than the TG has sent during this trial measurement. Frame duplication
is another way of causing a negative trial loss count.
https://datatracker.ietf.org/doc/html/rfc2544#section-10
recommends using sequence numbers in frame payloads,
but generating and verifying them requires test equipment resources,
which may not be plentiful enough to support high loads.
(Using a low enough max load would work, but frequently that would be
smaller than the SUT's actual Throughput.)
RFC2544 does not offer any solution to the negative loss problem,
except implicitly treating negative trial loss counts
the same way as positive trial loss counts.

This document also does not offer any practical solution.

Instead, this document SUGGESTS that the search algorithm take any precautions
necessary to avoid very late frames.

This document also REQUIRES any detected duplicate frames to be counted
as additional lost frames.
This document also REQUIRES any negative trial loss ratio
to be treated as a positive trial loss ratio of the same absolute value.
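The two normalization rules above can be sketched as follows (a minimal
illustration, not a required API; `received_count` is assumed to exclude
detected duplicates):

```python
def trial_loss_ratio(sent_count, received_count, duplicate_count=0):
    """Loss ratio per the rules above: duplicates count as additional
    lost frames, and a negative loss count (e.g. caused by very late
    frames from a previous trial) is treated by its absolute value."""
    loss_count = (sent_count - received_count) + duplicate_count
    return abs(loss_count) / sent_count

# A trial that received 2 late frames from the previous trial:
trial_loss_ratio(100, 102)    # negative loss count -2, ratio 0.02
# A trial with 1 frame lost and 2 duplicates detected:
trial_loss_ratio(100, 99, 2)  # loss count 3, ratio 0.03
```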
!!! Make sure this is covered elsewhere, at least in better comments. !!!
## TODO: Bad behavior of SUT

(Highest load with always-zero loss can be quite far from the lowest load
with always-nonzero loss.)
(Non-determinism: warm-up, periodic "stalls", performance decrease over time, ...)

http://www.hit.bme.hu/~lencse/publications/ECC-2017-B-M-DNS64-revised.pdf
See page 8 and search for the word "gaming".
!!! Nothing below is up-to-date with draft v02. !!!

# MLRsearch Background

TODO: Old section, probably obsoleted by preceding section(s).
Multiple Loss Ratio search (MLRsearch) is a packet throughput search
algorithm suitable for deterministic systems (as opposed to
probabilistic systems). MLRsearch discovers multiple packet throughput
rates in a single search, with each rate associated with a distinct
Packet Loss Ratio (PLR) criterion.
For cases when multiple rates need to be found, this property makes
MLRsearch more efficient in terms of execution time, compared to
traditional throughput search algorithms that discover a single packet
rate per defined search criterion (e.g. the binary search specified by
[RFC2544]). MLRsearch reduces execution time even further by relying on
shorter trial durations for intermediate steps, with only the final
measurements conducted at the specified final trial duration. This
results in a shorter overall search execution time when compared to a
traditional binary search, while guaranteeing the same results for
deterministic systems.
In practice, two rates with distinct PLRs are commonly used for packet
throughput measurements of NFV systems: Non Drop Rate (NDR) with PLR=0
and Partial Drop Rate (PDR) with PLR>0. The rest of this document
describes MLRsearch using the NDR and PDR pair as an example.
Similarly to other throughput search approaches like binary search,
MLRsearch is effective for SUTs/DUTs with a PLR curve that is
non-decreasing with growing offered load. It may not be as
effective for SUTs/DUTs with abnormal PLR curves, although
it will always converge to some value.
MLRsearch relies on the traffic generator to qualify the received packet
stream as error-free, and to invalidate the results if any disqualifying
errors are present, e.g. out-of-sequence frames.
MLRsearch can be applied to both uni-directional and bi-directional
traffic.

For bi-directional tests, MLRsearch rates and ratios are aggregates of
both directions, based on the following assumptions:

* Traffic transmitted by the traffic generator and received by the SUT/DUT
  has the same packet rate in each direction;
  in other words, the offered load is symmetric.
* SUT/DUT packet processing capacity is the same in both directions,
  resulting in the same packet loss under load.
MLRsearch can be applied even without those assumptions,
but in that case the aggregate loss ratio is less useful as a metric.
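For illustration, one plausible way to compute such an aggregate loss
ratio (a sketch under the assumption that the aggregate is total lost
frames over total sent frames; the function and parameter names are
illustrative, not from any implementation):

```python
def aggregate_loss_ratio(sent_east, lost_east, sent_west, lost_west):
    """Aggregate loss ratio over both directions of a bi-directional
    test: total lost frames divided by total sent frames."""
    return (lost_east + lost_west) / (sent_east + sent_west)

# Symmetric load, asymmetric loss: the aggregate hides the asymmetry,
# which is why it is less useful without the assumptions above.
aggregate_loss_ratio(1000, 10, 1000, 30)  # 40 / 2000 = 0.02
```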
MLRsearch can be used for network transactions consisting of more than
just one packet, or for anything else that has intended load as an input
and loss ratio as an output (duration as an input is optional).
This text uses mostly packet-centric language.
The main properties of MLRsearch:

* MLRsearch is a duration aware multi-phase multi-rate search algorithm:
  * Initial Phase determines a promising starting interval for the search.
  * Intermediate Phases progress towards the defined final search criteria.
  * Final Phase executes measurements according to the final search
    criteria.
* Final search criteria are defined by the following inputs:
  * Target PLRs (e.g. 0.0 and 0.005 when searching for NDR and PDR).
  * Final trial duration.
  * Measurement resolution.
* Initial Phase:
  * Measure MRR over the initial trial duration.
  * Measured MRR is used as an input to the first intermediate phase.
* Multiple Intermediate Phases:
  * Start with the initial trial duration in the first intermediate phase.
  * Converge geometrically towards the final trial duration.
  * Track all previous trial measurement results:
    * Duration, offered load and loss ratio are tracked.
    * Effective loss ratios are tracked.
      While in practice real loss ratios can decrease with increasing load,
      effective loss ratios never decrease. This is achieved by sorting
      results by load, and using the effective loss ratio of the previous
      load if the current loss ratio is smaller than that.
    * The algorithm queries the results to find the best lower and upper
      bounds. Effective loss ratios are always used.
  * The phase ends if all target loss ratios have tight enough bounds.
  * Iterate over target loss ratios in increasing order:
    * If both upper and lower bounds are in the measurement results for
      this duration, bisect until the bounds are tight enough,
      and continue with the next loss ratio.
    * If a bound is missing for this duration, but there exists a bound
      from the previous duration (compatible with the other bound
      at this duration), re-measure at the current duration.
    * If a bound in one direction (upper or lower) is missing for this
      duration, and the previous duration does not have a compatible bound,
      compute the current "interval size" from the second tightest bound
      in the other direction (lower or upper respectively)
      for the current duration, and choose the next offered load for
      external search.
  * The logic guarantees that a measurement is never repeated with both
    duration and offered load being the same.
  * The logic guarantees that measurements for higher target loss ratio
    iterations (still within the same phase duration) do not affect the
    validity and tightness of bounds for previous target loss ratio
    iterations (at the same duration).
* Use of internal and external searches:
  * External search:
    * A variant of "exponential search".
    * The "interval size" is multiplied by a configurable constant
      (powers of two work well with the subsequent internal search).
  * Internal search:
    * A variant of binary search that measures at an offered load between
      the previously found bounds.
    * The interval does not need to be split into exact halves,
      if another split can reach the target width goal faster.
      The idea is to avoid returning an interval narrower than the current
      width goal. See the sample implementation details below.
* Final Phase:
  * Executed with the final test trial duration, and the final width
    goal that determines the resolution of the overall search.
* Intermediate Phases together with the Final Phase are called
  Non-Initial Phases.
* The returned bounds stay within the prescribed min_rate and max_rate.
  * When returning min_rate or max_rate, the returned bounds may be
    invalid. E.g. an upper bound at max_rate may come from a measurement
    with loss ratio still not higher than the target loss ratio.
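The effective loss ratio tracking described above can be sketched as a
running maximum over results sorted by load (illustrative code, not the
implementation's API):

```python
def effective_loss_ratios(results):
    """results: iterable of (load, measured_loss_ratio) pairs.
    Returns (load, effective_loss_ratio) pairs sorted by load, where
    effective ratios never decrease with growing load."""
    effective = []
    running_max = 0.0
    for load, ratio in sorted(results):
        # Carry forward the previous load's effective ratio if larger.
        running_max = max(running_max, ratio)
        effective.append((load, running_max))
    return effective

# A real loss ratio that decreases with load (0.01 at load 2, 0.0 at
# load 3) is replaced by the effective ratio of the previous load:
effective_loss_ratios([(3, 0.0), (1, 0.0), (2, 0.01)])
# -> [(1, 0.0), (2, 0.01), (3, 0.01)]
```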
The main benefits of MLRsearch vs. binary search include:

* In general, MLRsearch is likely to execute more trials overall, but
  likely fewer trials at the set final trial duration.
* In well behaving cases, e.g. when results do not depend on trial
  duration, it greatly reduces (>50%) the overall duration compared to a
  single PDR (or NDR) binary search over duration, while finding
  multiple rates concurrently.
* In all cases MLRsearch yields the same or similar results to binary
  search.
  * Note: both binary search and MLRsearch are susceptible to reporting
    non-repeatable results across multiple runs for very badly behaving
    SUTs.
* Worst case, MLRsearch can take longer than a binary search, e.g. in
  case of drastic changes in behaviour for trials at varying durations.
  * A re-measurement at a higher duration can trigger a long external
    search. That never happens in binary search, which uses the final
    duration for all trials.
# Sample Implementation

Following is a brief description of a sample MLRsearch implementation,
which is a simplified version of the existing implementation.

## Input Parameters
1. **max_rate** - Maximum Transmit Rate (MTR) of packets to
   be used by the external traffic generator implementing MLRsearch,
   limited by the actual Ethernet link(s) rate, NIC model, or traffic
   generator capabilities.
2. **min_rate** - minimum packet transmit rate to be used for
   measurements. MLRsearch fails if a lower transmit rate needs to be
   used to meet the search criteria.
3. **final_trial_duration** - required trial duration for final rate
   measurements.
4. **initial_trial_duration** - trial duration for the initial MLRsearch phase.
5. **final_relative_width** - required measurement resolution expressed as
   (lower_bound, upper_bound) interval width relative to upper_bound.
6. **packet_loss_ratios** - list of maximum acceptable PLR search criteria.
7. **number_of_intermediate_phases** - number of phases between the initial
   phase and the final phase. Impacts the overall MLRsearch duration.
   Fewer phases are required for well behaving cases; more phases
   may be needed to reduce the overall search duration for worse behaving cases.
## Initial Phase

1. First trial measures at the configured maximum transmit rate (MTR) and
   discovers the maximum receive rate (MRR).
   * IN: trial_duration = initial_trial_duration.
   * IN: offered_transmit_rate = maximum_transmit_rate.
   * DO: single trial measurement.
   * OUT: measured loss ratio.
   * OUT: MRR = measured receive rate.
   The receive rate is computed as the intended load multiplied by the pass
   ratio (which is one minus the loss ratio). This is useful when the loss
   ratio is computed from a different metric than the intended load. For
   example, the intended load can be in transactions (multiple packets each),
   but the loss ratio is computed at the level of packets, not transactions.
   * Example: If MTR is 10 transactions per second, each transaction has
     10 packets, and the receive rate is 90 packets per second, then the loss
     ratio is 10%, and MRR is computed to be 9 transactions per second.
   If MRR is too close to MTR, MRR is set below MTR so that the interval width
   is equal to the width goal of the first intermediate phase.
   If MRR is less than min_rate, min_rate is used.
2. Second trial measures at MRR and discovers MRR2.
   * IN: trial_duration = initial_trial_duration.
   * IN: offered_transmit_rate = MRR.
   * DO: single trial measurement.
   * OUT: measured loss ratio.
   * OUT: MRR2 = measured receive rate.
   If MRR2 is less than min_rate, min_rate is used.
   If the loss ratio is less than or equal to the smallest target loss ratio,
   MRR2 is set to a value above MRR, so that the interval width is equal
   to the width goal of the first intermediate phase.
   MRR2 could end up being equal to MTR (for example if both measurements so
   far had zero loss), which was already measured; step 3 is skipped in that
   case.
3. Third trial measures at MRR2.
   * IN: trial_duration = initial_trial_duration.
   * IN: offered_transmit_rate = MRR2.
   * DO: single trial measurement.
   * OUT: measured loss ratio.
   * OUT: MRR3 = measured receive rate.
   If MRR3 is less than min_rate, min_rate is used.
   If step 3 is not skipped, the first trial measurement is forgotten.
   This is done because in practice (if MRR2 is above MRR), an external search
   from MRR and MRR2 is likely to lead to a faster intermediate phase
   than a bisect between MRR2 and MTR.
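The receive rate computation used in steps 1-3 can be sketched as follows
(a minimal illustration, matching the transaction example in step 1):

```python
def receive_rate(intended_load, loss_ratio):
    """Receive rate = intended load times the pass ratio (one minus the
    loss ratio). This works even when the loss ratio is computed from a
    different metric (e.g. packets) than the load (e.g. transactions)."""
    return intended_load * (1.0 - loss_ratio)

# MTR of 10 transactions/s, 10 packets each, 90 packets/s received:
# the packet loss ratio is 10/100 = 0.1, so MRR is 9 transactions/s.
receive_rate(10.0, 0.1)  # 9.0
```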
## Non-Initial Phases

1. The steps of one non-initial phase:
   1. IN: trial_duration for the current phase. Set to
      initial_trial_duration for the first intermediate phase; to
      final_trial_duration for the final phase; or to an element of
      the interpolating geometric sequence for other intermediate phases.
      For example, with two intermediate phases, the trial_duration of the
      second intermediate phase is the geometric average of
      initial_trial_duration and final_trial_duration.
   2. IN: relative_width_goal for the current phase. Set to
      final_relative_width for the final phase; doubled for each
      preceding phase. For example, with two intermediate phases, the
      first intermediate phase uses quadruple of final_relative_width
      and the second intermediate phase uses double of
      final_relative_width.
   3. IN: Measurement results from the previous phase (previous duration).
   4. Internal target ratio loop:
      1. IN: Target loss ratio for this iteration of the ratio loop.
      2. IN: Measurement results from all previous ratio loop iterations
         of the current phase (current duration).
      3. DO: According to the procedure described in point 2:
         1. either exit the phase (by jumping to 1.5),
         2. or exit the loop iteration (by continuing with the next target
            loss ratio),
         3. or calculate a new transmit rate to measure with.
      4. DO: Perform the trial measurement at the new transmit rate and
         the current trial duration, and compute its loss ratio.
      5. DO: Add the result and go to the next iteration (1.4.1),
         including the added trial result in 1.4.2.
   5. OUT: Measurement results from this phase.
   6. OUT: In the final phase, bounds for each target loss ratio
      are extracted and returned.
      1. If a valid bound does not exist, use min_rate or max_rate.
2. New transmit rate (or exit) calculation (for point 1.4.3):
   1. If the previous duration has the best upper and lower bounds,
      select the middle point as the new transmit rate.
      1. See 2.5.3 below for the exact splitting logic.
      2. This can be a no-op if the interval is narrow enough already;
         in that case continue with 2.2.
      3. Discussion, assuming the middle point is selected and measured:
         1. Regardless of the loss ratio measured, the result becomes
            either the best upper or the best lower bound at the current
            duration.
         2. So this condition is satisfied at most once per iteration.
         3. This also explains why the previous phase has a double width goal:
            1. We avoid one more bisection at the previous phase.
            2. At most one bound (per iteration) is re-measured
               with the current duration.
            3. Each re-measurement can trigger an external search.
            4. Such surprising external searches are the main hurdle
               in achieving low overall search durations.
            5. Even without 1.1, there is at most one external search
               per phase and target loss ratio.
            6. But without 1.1 there can be two re-measurements,
               each coming with a risk of triggering an external search.
   2. If the previous duration has one best bound, select its transmit rate.
      In the deterministic case this is the last measurement needed in this
      iteration.
   3. If only an upper bound exists in the current duration results:
      1. This can only happen for the smallest target loss ratio.
      2. If the upper bound was measured at min_rate,
         exit the whole phase early (not investigating other target loss
         ratios).
      3. Select a new transmit rate using external search:
         1. For computing the previous interval size, use:
            1. the second tightest bound at the current duration,
            2. or the tightest bound of the previous duration,
               if compatible and giving a narrower interval,
            3. or the target interval width if none of the above is
               available.
            4. In any case, increase to the target interval width if smaller.
         2. Quadruple the interval width.
         3. Use min_rate if the new transmit rate is lower.
   4. If only a lower bound exists in the current duration results:
      1. If the lower bound was measured at max_rate,
         exit this iteration (continue with the next lowest target loss
         ratio).
      2. Select a new transmit rate using external search:
         1. For computing the previous interval size, use:
            1. the second tightest bound at the current duration,
            2. or the tightest bound of the previous duration,
               if compatible and giving a narrower interval,
            3. or the target interval width if none of the above is
               available.
            4. In any case, increase to the target interval width if smaller.
         2. Quadruple the interval width.
         3. Use max_rate if the new transmit rate is higher.
   5. The only remaining option is both bounds being in the current duration
      results.
      1. This can happen in two ways, depending on how the lower bound
         was found:
         1. It could have been selected for the current loss ratio,
            e.g. in re-measurement (2.2) or in initial bisect (2.1).
         2. It could have been found as an upper bound for the previous
            smaller target loss ratio, in which case it might be too low.
         3. The algorithm does not track which one is the case,
            as the decision logic works well regardless.
      2. Compute the "extending down" candidate transmit rate exactly as
         in 2.3.
      3. Compute the "bisecting" candidate transmit rate:
         1. Compute the current interval width from the two bounds.
         2. Express the width as a (float) multiple of the target width goal
            for the current phase.
         3. If the multiple is not higher than one, the width goal is met.
            Exit this iteration and continue with the next higher
            target loss ratio.
         4. If the multiple is two or less, use half of it
            for the new width of the lower subinterval.
         5. Otherwise, round the multiple up to the nearest even integer.
         6. Use half of that for the new width of the lower subinterval.
         7. Example: If the lower bound is 2.0, the upper bound is 5.0, and
            the width goal is 1.0, the new candidate transmit rate will be
            4.0. This can save a measurement when 4.0 has small loss.
            Selecting the average (3.5) would never save a measurement,
            giving narrower bounds instead.
      4. If either candidate computation wants to exit the iteration,
         do as the bisecting candidate computation says.
      5. The remaining case is both candidates wanting to measure at some
         rate. Use the higher rate. This prefers the external search
         downwards when its interval is already narrow enough, over a
         perfectly sized lower bisect subinterval.
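The splitting rules in 2.5.3 can be sketched as follows (illustrative code
using linear widths; as noted in the Additional details section, the CSIT
implementation actually works with logarithmic widths):

```python
import math

def bisect_candidate(lower_bound, upper_bound, width_goal):
    """Return the bisecting candidate transmit rate, or None when the
    width goal is already met (exit the iteration)."""
    width = upper_bound - lower_bound
    multiple = width / width_goal  # width as a float multiple of the goal
    if multiple <= 1.0:
        return None                # width goal met
    if multiple <= 2.0:
        lower_width = width / 2.0  # plain halving
    else:
        # Round the multiple up to the nearest even integer and use
        # half of that for the width of the lower subinterval.
        even_multiple = math.ceil(multiple / 2.0) * 2
        lower_width = (even_multiple / 2.0) * width_goal
    return lower_bound + lower_width

# The example from 2.5.3.7: bounds 2.0 and 5.0, width goal 1.0.
# The multiple is 3, rounded up to 4, so the candidate is 2.0 + 2.0 = 4.0
# (instead of the plain average 3.5).
bisect_candidate(2.0, 5.0, 1.0)  # 4.0
```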
# FD.io CSIT Implementation

The only known working implementation of MLRsearch is in
the open-source code running in the Linux Foundation
FD.io CSIT project [FDio-CSIT-MLRsearch] as part of
a Continuous Integration / Continuous Development (CI/CD) framework.

MLRsearch is also available as a Python package in [PyPI-MLRsearch].
## Additional details

This document has so far described a simplified version of the
MLRsearch algorithm. The full algorithm as implemented in CSIT contains
additional logic, which makes some of the details (but not the general
ideas) above incorrect. Here is a short description of the additional
logic as a list of principles, explaining their main differences from
(or additions to) the simplified description, but without detailing
their mutual interaction.
1. Logarithmic transmit rate.
   * In order to better fit the relative width goal, the interval
     doubling and halving is done differently.
   * For example, the middle of 2 and 8 is 4, not 5.
2. Timeout for bad cases.
   * The worst case for MLRsearch is when each phase converges to
     intervals way different than the results of the previous phase.
   * Rather than suffer a total search time several times larger than a
     pure binary search, the implemented tests fail themselves when the
     search takes too long (given by the argument *timeout*).
3. Intended count.
   * The number of packets to send during the trial should be equal to
     the intended load multiplied by the duration.
     * Also multiplied by a coefficient, if the loss ratio is calculated
       from a different metric.
     * Example: If a successful transaction uses 10 packets,
       the load is given in transactions per second, but the loss ratio is
       calculated from packets, so the coefficient to get the intended
       count of packets is 10.
   * But in practice that does not work,
     * as it could result in a fractional number of packets,
     * so it has to be rounded in a way the traffic generator chooses,
     * which may depend on the number of traffic flows
       and traffic generator worker threads.
4. Attempted count. As the real number of intended packets is not known
   exactly, the computation uses the number of packets the traffic
   generator reports as sent. Unless overridden by the next point.
5. Duration stretching.
   * In some cases, the traffic generator may get overloaded,
     causing it to take significantly longer (than the duration) to send
     all packets.
   * The implementation uses an explicit stop,
     * causing a lower attempted count in those cases.
   * The implementation tolerates some small difference between
     the attempted count and the intended count.
     * 10 microseconds worth of traffic is sufficient for our tests.
   * If the difference is higher, the unsent packets are counted as lost.
     * This forces the search to avoid the regions of high duration
       stretching.
     * The final bounds describe the performance not just of the SUT,
       but of the whole system, including the traffic generator.
6. Retransmissions.
   * In some tests (e.g. using TCP flows) the traffic generator reacts to
     packet loss by retransmission. Usually, such packet loss is already
     affecting the loss ratio. If a test also wants to treat
     retransmissions due to heavily delayed packets as a failure, this is
     once again visible as a mismatch between the intended count and the
     attempted count.
   * The CSIT implementation simply looks at the absolute value of the
     difference, so it offers the same small tolerance before it starts
     marking a "loss".
7. For result processing, we use lower bounds and ignore upper bounds.
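The logarithmic middle from point 1 is the geometric mean of the bounds
(a one-line sketch):

```python
import math

def log_middle(lower_bound, upper_bound):
    """Middle of an interval on a logarithmic scale: the geometric mean."""
    return math.sqrt(lower_bound * upper_bound)

log_middle(2.0, 8.0)  # 4.0, not the arithmetic middle 5.0
```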
### FD.io CSIT Input Parameters

1. **max_rate** - Typical values: 2 * 14.88 Mpps for 64B
   10GE link rate, 2 * 18.75 Mpps for 64B 40GE NIC (specific model).
2. **min_rate** - Value: 2 * 9001 pps (we reserve 9000 pps
   for latency measurements).
3. **final_trial_duration** - Value: 30.0 seconds.
4. **initial_trial_duration** - Value: 1.0 second.
5. **final_relative_width** - Value: 0.005 (0.5%).
6. **packet_loss_ratios** - Value: 0.0, 0.005 (0.0% for NDR, 0.5% for PDR).
7. **number_of_intermediate_phases** - Value: 2.
   The value has been chosen based on limited experimentation to date.
   More experimentation is needed to arrive at clearer guidelines.
8. **timeout** - Limit for the overall search duration (for one search).
   If MLRsearch oversteps this limit, it immediately declares the test
   failed, to avoid wasting even more time on a misbehaving SUT.
   Value: 600.0 (seconds).
9. **expansion_coefficient** - Width multiplier for external search.
   Value: 4.0 (the interval width is quadrupled).
   A value of 2.0 is best for well-behaved SUTs, but a value of 4.0 has
   been found to decrease the overall search time for worse-behaved SUT
   configurations, which contribute more to the overall set of different
   SUT configurations tested.
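With these defaults, the per-phase trial durations and width goals
described in the Non-Initial Phases section work out as follows (an
illustrative sketch; the function and variable names are not from the
implementation):

```python
def phase_parameters(initial_trial_duration, final_trial_duration,
                     final_relative_width, number_of_intermediate_phases):
    """(trial_duration, relative_width_goal) for each non-initial phase:
    durations interpolate geometrically from initial to final, and
    width goals double for each phase preceding the final one."""
    phase_count = number_of_intermediate_phases + 1  # plus the final phase
    duration_ratio = final_trial_duration / initial_trial_duration
    phases = []
    for index in range(phase_count):
        duration = initial_trial_duration * duration_ratio ** (
            index / (phase_count - 1))
        width_goal = final_relative_width * 2 ** (phase_count - 1 - index)
        phases.append((duration, width_goal))
    return phases

# CSIT defaults: 1.0 s initial, 30.0 s final, 0.5% width, 2 intermediate
# phases. The second intermediate phase duration is sqrt(1.0 * 30.0),
# matching the 5.477... s seen in the example run below.
phase_parameters(1.0, 30.0, 0.005, 2)
# -> [(1.0, 0.02), (5.477225575051661, 0.01), (30.0, 0.005)]
```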
## Example MLRsearch Run

The following list describes a search from a real test run in CSIT
(using the default input values as above).

* Initial phase, trial duration 1.0 second.

Measurement 1, intended load 18750000.0 pps (MTR),
measured loss ratio 0.7089514628479618 (valid upper bound for both NDR and PDR).

Measurement 2, intended load 5457160.071600716 pps (MRR),
measured loss ratio 0.018650817320118702 (new tightest upper bounds).

Measurement 3, intended load 5348832.933500009 pps (slightly less than MRR2,
in preparation for the first intermediate phase target interval width),
measured loss ratio 0.00964383362905351 (new tightest upper bounds).

* First intermediate phase starts, trial duration still 1.0 second.

Measurement 4, intended load 4936605.579021453 pps (no lower bound,
performing external search downwards, for NDR),
measured loss ratio 0.0 (valid lower bound for both NDR and PDR).

Measurement 5, intended load 5138587.208637197 pps (bisecting for NDR),
measured loss ratio 0.0 (new tightest lower bounds).

Measurement 6, intended load 5242656.244044665 pps (bisecting),
measured loss ratio 0.013523745379347257 (new tightest upper bounds).

* Both intervals are narrow enough.
* Second intermediate phase starts, trial duration 5.477225575051661 seconds.

Measurement 7, intended load 5190360.904111567 pps (initial bisect for NDR),
measured loss ratio 0.0023533920869969953 (NDR upper bound, PDR lower bound).

Measurement 8, intended load 5138587.208637197 pps (re-measuring NDR lower bound),
measured loss ratio 1.2080222912800403e-06 (new tightest NDR upper bound).

* The two intervals have separate bounds from now on.

Measurement 9, intended load 4936605.381062318 pps (external NDR search down),
measured loss ratio 0.0 (new valid NDR lower bound).

Measurement 10, intended load 5036583.888432355 pps (NDR bisect),
measured loss ratio 0.0 (new tightest NDR lower bound).

Measurement 11, intended load 5087329.903232804 pps (NDR bisect),
measured loss ratio 0.0 (new tightest NDR lower bound).

* NDR interval is narrow enough, PDR interval not ready yet.

Measurement 12, intended load 5242656.244044665 pps (re-measuring PDR upper bound),
measured loss ratio 0.0101174866190136 (still valid PDR upper bound).

* Also the PDR interval is narrow enough, with valid bounds for this duration.
* Final phase starts, trial duration 30.0 seconds.

Measurement 13, intended load 5112894.3238511775 pps (initial bisect for NDR),
measured loss ratio 0.0 (new tightest NDR lower bound).

Measurement 14, intended load 5138587.208637197 pps (re-measuring NDR upper bound),
measured loss ratio 2.030389804256833e-06 (still valid NDR upper bound).

* NDR interval is narrow enough, PDR interval not yet.

Measurement 15, intended load 5216443.04126728 pps (initial bisect for PDR),
measured loss ratio 0.005620871287975237 (new tightest PDR upper bound).

Measurement 16, intended load 5190360.904111567 pps (re-measuring PDR lower bound),
measured loss ratio 0.0027629971184465604 (still valid PDR lower bound).

* PDR interval is also narrow enough.

* NDR_LOWER = 5112894.3238511775 pps; NDR_UPPER = 5138587.208637197 pps;
* PDR_LOWER = 5190360.904111567 pps; PDR_UPPER = 5216443.04126728 pps.
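As a check, the final bounds indeed satisfy the 0.5% relative width goal
(relative width defined as in the input parameters, i.e. interval width
relative to the upper bound):

```python
def relative_width(lower_bound, upper_bound):
    # Interval width relative to the upper bound, per final_relative_width.
    return (upper_bound - lower_bound) / upper_bound

# Final bounds from the example run above, checked against the 0.5% goal.
assert relative_width(5112894.3238511775, 5138587.208637197) <= 0.005  # NDR
assert relative_width(5190360.904111567, 5216443.04126728) <= 0.005    # PDR
```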
# IANA Considerations

No requests of IANA.
# Security Considerations

Benchmarking activities as described in this memo are limited to
technology characterization of a DUT/SUT using controlled stimuli in a
laboratory environment, with dedicated address space and the constraints
specified in the sections above.

The benchmarking network topology will be an independent test setup and
MUST NOT be connected to devices that may forward the test traffic into
a production network or misroute traffic to the test management network.

Further, benchmarking is performed on a "black-box" basis, relying
solely on measurements observable external to the DUT/SUT.

Special capabilities SHOULD NOT exist in the DUT/SUT specifically for
benchmarking purposes. Any implications for network security arising
from the DUT/SUT SHOULD be identical in the lab and in production
networks.
# Acknowledgements

Many thanks to Alec Hothan of the OPNFV NFVbench project for a thorough
review and numerous useful comments and suggestions.