docs/ietf/draft-ietf-bmwg-mlrsearch-02.md

   1 ---
   2 title: Multiple Loss Ratio Search for Packet Throughput (MLRsearch)
   3 abbrev: Multiple Loss Ratio Search
   4 docname: draft-ietf-bmwg-mlrsearch-02
   5 date: 2022-03-07
   6
   7 ipr: trust200902
   8 area: ops
   9 wg: Benchmarking Working Group
  10 kw: Internet-Draft
  11 cat: info
  12
  13 coding: us-ascii
  14 pi:    # can use array (if all yes) or hash here
  15   toc: yes
  16   sortrefs:   # defaults to yes
  17   symrefs: yes
  18
  19 author:
  20       -
  21         ins: M. Konstantynowicz
  22         name: Maciek Konstantynowicz
  23         org: Cisco Systems
  24         role: editor
  25         email: mkonstan@cisco.com
  26       -
  27         ins: V. Polak
  28         name: Vratko Polak
  29         org: Cisco Systems
  30         email: vrpolak@cisco.com
  31
  32 normative:
  33   RFC2544:
  34
  35 informative:
  36   FDio-CSIT-MLRsearch:
  37     target: https://s3-docs.fd.io/csit/rls2110/report/introduction/methodology_data_plane_throughput/methodology_data_plane_throughput.html#mlrsearch-tests
  38     title: "FD.io CSIT Test Methodology - MLRsearch"
  39     date: 2021-11
  40   PyPI-MLRsearch:
  41     target: https://pypi.org/project/MLRsearch/0.4.0/
  42     title: "MLRsearch 0.4.0, Python Package Index"
  43     date: 2021-04
  44
  45 --- abstract
  46
  47 TODO: Update after all sections are ready.
  48
  49 This document proposes changes to [RFC2544], specifically to packet
  50 throughput search methodology, by defining a new search algorithm
  51 referred to as Multiple Loss Ratio search (MLRsearch for short). Instead
  52 of relying on binary search with pre-set starting offered load, it
  53 proposes a novel approach discovering the starting point in the initial
  54 phase, and then searching for packet throughput based on defined packet
  55 loss ratio (PLR) input criteria and defined final trial duration time.
  56 One of the key design principles behind MLRsearch is minimizing the
  57 total test duration and searching for multiple packet throughput rates
  58 (each with a corresponding PLR) concurrently, instead of doing it
  59 sequentially.
  60
  61 The main motivation behind MLRsearch is the new set of challenges and
  62 requirements posed by NFV (Network Function Virtualization),
  63 specifically software based implementations of NFV data planes. Using
  64 [RFC2544] in the experience of the authors yields often not repetitive
  65 and not replicable end results due to a large number of factors that are
  66 out of scope for this draft. MLRsearch aims to address this challenge
  67 in a simple way of getting the same result sooner, so more repetitions
  68 can be done to describe the replicability.
  69
  70 --- middle
  71
  72 {::comment}
  73     As we use kramdown to convert from markdown,
  74     we use this way of marking comments not to be visible in rendered draft.
  75     https://stackoverflow.com/a/42323390
  76     If other engine is used, convert to this way:
  77     https://stackoverflow.com/a/20885980
  78 {:/comment}
  79
  80 # Terminology
  81
  82 TODO: Update after most other sections are updated.
  83
  84 {::comment}
  85     The following is probably not needed (or defined elsewhere).
  86
  87     * Frame size: size of an Ethernet Layer-2 frame on the wire, including
  88       any VLAN tags (dot1q, dot1ad) and Ethernet FCS, but excluding Ethernet
  89       preamble and inter-frame gap. Measured in bytes (octets).
  90     * Packet size: same as frame size, both terms used interchangeably.
  91     * Device Under Test (DUT): In software networking, "device" denotes a
  92       specific piece of software tasked with packet processing. Such device
  93       is surrounded with other software components (such as operating system
  94       kernel). It is not possible to run devices without also running the
  95       other components, and hardware resources are shared between both. For
  96       purposes of testing, the whole set of hardware and software components
  97       is called "system under test" (SUT). As SUT is the part of the whole
  98       test setup performance of which can be measured by [RFC2544] methods,
  99       this document uses SUT instead of [RFC2544] DUT. Device under test
 100       (DUT) can be re-introduced when analysing test results using whitebox
 101       techniques, but this document sticks to blackbox testing.
 102     * System Under Test (SUT): System under test (SUT) is a part of the
 103       whole test setup whose performance is to be benchmarked. The complete
 104       test setup contains other parts, whose performance is either already
 105       established, or not affecting the benchmarking result.
 106     * Bi-directional throughput tests: involve packets/frames flowing in
 107       both transmit and receive directions over every tested interface of
 108       SUT/DUT. Packet flow metrics are measured per direction, and can be
 109       reported as aggregate for both directions and/or separately
 110       for each measured direction. In most cases bi-directional tests
 111       use the same (symmetric) load in both directions.
 112     * Uni-directional throughput tests: involve packets/frames flowing in
 113       only one direction, i.e. either transmit or receive direction, over
 114       every tested interface of SUT/DUT. Packet flow metrics are measured
 115       and are reported for measured direction.
 116     * Packet Throughput Rate: maximum packet offered load DUT/SUT forwards
 117       within the specified Packet Loss Ratio (PLR). In many cases the rate
 118       depends on the frame size processed by DUT/SUT. Hence packet
 119       throughput rate MUST be quoted with specific frame size as received by
 120       DUT/SUT during the measurement. For bi-directional tests, packet
 121       throughput rate should be reported as aggregate for both directions.
 122       Measured in packets-per-second (pps) or frames-per-second (fps),
 123       equivalent metrics.
 124     * Bandwidth Throughput Rate: a secondary metric calculated from packet
 125       throughput rate using formula: bw_rate = pkt_rate * (frame_size +
 126       L1_overhead) * 8, where L1_overhead for Ethernet includes preamble (8
 127       octets) and inter-frame gap (12 octets). For bi-directional tests,
 128       bandwidth throughput rate should be reported as aggregate for both
 129       directions. Expressed in bits-per-second (bps).
 130     * TODO do we need this as it is identical to RFC2544 Throughput?
 131       Non Drop Rate (NDR): maximum packet/bandwidth throughput rate sustained
 132       by DUT/SUT at PLR equal zero (zero packet loss) specific to tested
 133       frame size(s). MUST be quoted with specific packet size as received by
 134       DUT/SUT during the measurement. Packet NDR measured in
 135       packets-per-second (or fps), bandwidth NDR expressed in
 136       bits-per-second (bps).
 137     * TODO if needed, reformulate to make it clear there can be multiple rates
 138       for multiple (non-zero) loss ratios.
 139       : Partial Drop Rate (PDR): maximum packet/bandwidth throughput rate
 140       sustained by DUT/SUT at PLR greater than zero (non-zero packet loss)
 141       specific to tested frame size(s). MUST be quoted with specific packet
 142       size as received by DUT/SUT during the measurement. Packet PDR
 143       measured in packets-per-second (or fps), bandwidth PDR expressed in
 144       bits-per-second (bps).
 145     * TODO: Refer to FRMOL instead.
 146       Maximum Receive Rate (MRR): packet/bandwidth rate regardless of PLR
 147       sustained by DUT/SUT under specified Maximum Transmit Rate (MTR)
 148       packet load offered by traffic generator. MUST be quoted with both
 149       specific packet size and MTR as received by DUT/SUT during the
 150       measurement. Packet MRR measured in packets-per-second (or fps),
 151       bandwidth MRR expressed in bits-per-second (bps).
 152     * TODO just keep using "trial measurement"?
 153       Trial: a single measurement step. See [RFC2544] section 23.
 154     * TODO already defined in RFC2544:
 155       Trial duration: amount of time over which packets are transmitted
 156       in a single measurement step.
 157 {:/comment}
 158 {::comment}
 159 {:/comment}
 160
 161 * TODO: The current text uses Throughput for the zero loss ratio load.
 162   Is the capital T needed/useful?
 163 * DUT and SUT: see the definitions in https://gerrit.fd.io/r/c/csit/+/35545
 164 * Traffic Generator (TG) and Traffic Analyzer (TA): see
 165   https://datatracker.ietf.org/doc/html/rfc6894#section-4
 166   TODO: Maybe there is an earlier RFC?
 167 * Overall search time: the time it takes to find all required loads within
 168   their precision goals, starting from zero trials measured at given
 169   DUT configuration and traffic profile.
 170 * TODO: traffic profile?
 171 * Intended load: https://datatracker.ietf.org/doc/html/rfc2285#section-3.5.1
 172 * Offered load: https://datatracker.ietf.org/doc/html/rfc2285#section-3.5.2
 173 * Maximum offered load (MOL): see
 174   https://datatracker.ietf.org/doc/html/rfc2285#section-3.5.3
 175 * Forwarding rate at maximum offered load (FRMOL)
 176   https://datatracker.ietf.org/doc/html/rfc2285#section-3.6.2
 177 * Trial Loss Count: the number of frames transmitted
 178   minus the number of frames received. Negative count is possible,
 179   e.g. when SUT duplicates some frames.
 180 * Trial Loss Ratio: ratio of frames received relative to frames
 181   transmitted over the trial duration.
 182   For bi-directional throughput tests, the aggregate ratio is calculated,
 183   based on the aggregate number of frames transmitted and received.
 184   If the trial loss count is negative, its absolute value MUST be used
 185   to keep compliance with RFC2544.
 186 * Safe load: any value, such that trial measurement at this (or lower)
 187   intended load is correcrly handled by both TG and TA, regardless of SUT behavior.
 188   Frequently, it is not known what the safe load is.
 189 * Max load (TODO rename?): Maximal intended load to be used during search.
 190   Benchmarking team decides which value is low enough
 191   to guarantee values reported by TG and TA are reliable.
 192   It has to be a safe load, but it can be lower than a safe load estimate
 193   for added safety.
 194   See the subsection on unreliable test equipment below.
 195   This value MUST NOT be higher than MOL, which itself MUST NOT
 196   be higher than Maximum Frame Rate
 197   https://datatracker.ietf.org/doc/html/rfc2544#section-20
 198 * Min load: Minimal intended load to be used during search.
 199   Benchmarking team decides which value is high enough
 200   to guarantee the trial measurement results are valid.
 201   E.g. considerable overall search time can be saved by declaring SUT
 202   faulty if min load trial shows too high loss rate.
 203   Zero frames per second is a valid min load value
 204 * Effective loss ratio: a corrected value of trial loss ratio
 205   chosen to avoid difficulties if SUT exhibits decreasing loss ratio
 206   with increasing load. It is the maximum of trial loss ratios
 207   measured at the same duration on all loads smaller than (and including)
 208   the current one.
 209 * Target loss ratio: a loss ratio value acting as an input for the search.
 210   The search is finding tight enough lower and upper bounds in intended load,
 211   so that the measurement at the lower bound has smaller or equal
 212   trial loss ratio, and upper bound has strictly larger trial loss ratio.
 213   For the tightest upper bound, the effective loss ratio is the same as
 214   trial loss ratio at that upper bound load.
 215   For the tightest lower bound, the effective loss ratio can be higher
 216   than the trial loss ratio at that lower bound, but still not larger
 217   than the target loss ratio.
 218 * TODO: Search algorithm.
 219 * TODO: Precision goal.
 220 * TODO: Define a "benchmarking group".
 221 * TODO: Upper and lower bound.
 222 * TODO: Valid and invalid bound?
 223 * TODO: Interval and interval width?
 224
 225 TODO: Mention NIC/PCI bandwidth/pps limits can be lower than bandwidth of medium.
 226
 227 # Intentions of this document
 228
 229 {::comment}
 230     Instead of talking about DUTs being non-deterministic
 231     and vendors "gaming" in order to get better Throughput results,
 232     Maciek and Vratko currently prefer to talk about result repeatability.
 233 {:/comment}
 234
 235 The intention of this document is to provide recommendations for:
 236 * optimizing search for multiple target loss ratios at once,
 237 * speeding up the overall search time,
 238 * improve search results repeatability and comparability.
 239
 240 No part of RFC2544 is intended to be obsoleted by this document.
 241
 242 {::comment}
 243     This document may contain examples which contradict RFC2544 requirements
 244     and suggestions.
 245     That is not an ecouragement for benchmarking groups
 246     to stop being compliant with RFC2544.
 247 {:/comment}
 248
 249 # RFC2544
 250
 251 ## Throughput search
 252
 253 It is useful to restate the key requirements of RFC2544
 254 using the new terminology (see section Terminology).
 255
 256 The following sections of RFC2544 are of interest for this document.
 257
 258 * https://datatracker.ietf.org/doc/html/rfc2544#section-20
 259   Mentions the max load SHOULD not be larget than the theoretical
 260   maximum rate for the frame size on the media.
 261
 262 * https://datatracker.ietf.org/doc/html/rfc2544#section-23
 263   Lists the actions to be done for each trial measurement,
 264   it also mentions loss rate as an example of trial measurement results.
 265   This document uses loss count instead, as that is the quantity
 266   that is easier for the current test equipment to measure,
 267   e.g. it is not affected by the real traffic duration.
 268   TODO: Time uncertainty again.
 269
 270 * https://datatracker.ietf.org/doc/html/rfc2544#section-24
 271   Mentions "full length trials" leading to the Throughput found,
 272   as opposed to shorter trial durations, allowed in an attempt
 273   to "minimize the length of search procedure".
 274   This document talks about "final trial duration" and aims to
 275   "optimize overal search time".
 276
 277 * https://datatracker.ietf.org/doc/html/rfc2544#section-26.1
 278   with https://www.rfc-editor.org/errata/eid422
 279   finaly states requirements for the search procedure.
 280   It boils down to "increase intended load upon zero trial loss
 281   and decrease intended load upon non-zero trial loss".
 282
 283 No additional constraints are placed on the load selection,
 284 and there is no mention of an exit condition, e.g. when there is enough
 285 trial measurements to proclaim the largest load with zero trial loss
 286 (and final trial duration) to be the Throughput found.
 287
 288 {::comment}
 289     The following section is probably not useful enough.
 290
 291     ## Generalized search
 292
 293     Note that the Throughput search can be restated as a "conditional
 294     load search" with a specific condition.
 295
 296     "increase intended load upon trial result satisfying the condition
 297     and decrease intended load upon trial result not satisfying the condition"
 298     where the Throughput condition is "trial loss count is zero".
 299
 300     This works for any condition that can be evaluated from a single
 301     trial measurement result, and is likely to be true at low loads
 302     and false at high loads.
 303
 304     MLRsearch can incorporate multiple different conditions,
 305     as long as there is total ligical ordering between them
 306     (e.g. if a condition for a target loss ratio is not satisfied,
 307     it is also not satisfied for any other codition which uses
 308     larger target loss ratio).
 309
 310     TODO: How to call a "load associated with this particular condition"?
 311 {:/comment}
 312
 313 {::comment}
 314
 315     TODO: Not sure if this subsection is needed an where.
 316
 317     ## Simple bisection
 318
 319     There is one obvious and simple search algorithm which conforms
 320     to throughput search requirements: simple bijection.
 321
 322     Input: target precision, in frames per second.
 323
 324     Procedure:
 325
 326     1. Chose min load to be zero.
 327        1. No need to measure, loss count has to be zero.
 328        2. Use the zero load as the current lower bound.
 329     2. Chose max load to be the max value allowed by bandwidth of the medium.
 330        1. Perform a trial measurement (at the full length duration) at max load.
 331        2. If there is zero trial loss count, return max load as Throughput.
 332        3. Use max load as the current upper bound.
 333     3. Repeat until the difference between lower bound and upper bound is
 334        smaller or equal to the precision goal.
 335        1. If it is not larget, return the current lower bound as Throughput.
 336        2. Else: Chose new load as the arithmetic average of lower and upper bound.
 337        3. Perform a trial measurement (at the full length duration) at this load.
 338        4. If the trial loss rate is zero, consider the load as new lower bound.
 339        5. Else consider the load as the new upper bound.
 340        6. Jump back to the repeat at 3.
 341
 342     Another possible stop condition is the overal search time so far,
 343     but that is not really a different condition, as the time for search to reach
 344     the precision goal is just a function of precision goal, trial duration
 345     and the difference between max and min load.
 346
 347     While this algorithm can be accomodated to search for multiple
 348     target loss ratios "at the same time (see somewhere below),
 349     it is still missing multiple improvement which give MLRsearch
 350     considerably better overal search time in practice.
 351
 352 {:/comment}
 353
 354 # Problems
 355
 356 ## Repeatability and Comparability
 357
 358 RFC2544 does not suggest to repeat Throughput search,
 359 {::comment}probably because the full set of tests already takes long{:/comment}
 360 and from just one Throughput value, it cannot be determined
 361 how repeatable that value is (how likely it is for a repeated Throughput search
 362 to end up with a value less then the precision goal away from the first value).
 363
 364 Depending on SUT behavior, different benchmark groups
 365 can report significantly different Througput values,
 366 even when using identical SUT and test equipment,
 367 just because of minor differences in their search algorithm
 368 (e.g. different max load value).
 369
 370 While repeatability can be addressed by repeating the search several times,
 371 the differences in the comparability scenario may be systematic,
 372 e.g. seeming like a bias in one or both benchmark groups.
 373
 374 MLRsearch algorithm does not really help with the repeatability problem.
 375 This document RECOMMENDS to repeat a selection of "important" tests
 376 ten times, so users can ascertain the repeatability of the results.
 377
 378 TODO: How to report? Average and standard deviation?
 379
 380 Following MLRsearch algorithm leaves less freedom for the benchmark groups
 381 to encounter the comparability problem,
 382 alghough more research is needed to determine the effect
 383 of MLRsearch's tweakable parameters.
 384
 385 {::comment}
 386     Possibly, the old DUTs were quite sharply consistent in their performance,
 387     and/or precision goals were quite large in order to save overal search time.
 388
 389     With software DUTs and with time-efficient search algorithms,
 390     nowadays the repeatability of Throughput can be quite low,
 391     as in standard deviation of repeated Througput results
 392     is considerably higher than the precision goal.
 393 {:/comment}
 394
 395 {::comment}
 396     TODO: Unify with PLRsearch draft.
 397     TODO: No-loss region, random region, lossy region.
 398     TODO: Tweaks with respect to non-zero loss ratio goal.
 399     TODO: Duration dependence?
 400
 401     Both RFC2544 and MLRsearch return Throughput somewhere inside the random region,
 402     or at most the precision goal below it.
 403 {:/comment}
 404
 405 {::comment}
 406     TODO: Make sure this is covered elsewhere, then delete.
 407
 408     ## Search repeatability
 409
 410     The goal of RFC1242 and RFC2544 is to limit how vendors benchmark their DUTs,
 411     in order to force them to report values that have higher chance
 412     to be confirmed by independent benchmarking groups following the same RFCs.
 413
 414     This works well for deterministic DUTs.
 415
 416     But for non-deterministic DUTs, the RFC2544 Throughput value
 417     is only guaranteed to fall somewhere below the lossy region (TODO define).
 418     It is possible to arrive at a value positioned likely high in the random region
 419     at the cost of increased overall search duration,
 420     simply by lowering the load by very small amounts (instead of exact halving)
 421     upon lossy trial and increasing by large amounts upon lossless trial.
 422
 423     Prescribing an exact search algorithm (bisection or MLRsearch or other)
 424     will force vendors to report less "gamey" Throughput values.
 425 {:/comment}
 426
 427 {::comment}
 428     ## Extensions
 429
 430     The following two sections are probably out of scope,
 431     as they does not affect MLRsearch design choices.
 432
 433     ### Direct and inverse measurements
 434
 435     TODO expand: Direct measurement is single trial measurement,
 436     with predescribed inputs and outputs turned directly into the quality of interest
 437     Examples:
 438     Latency https://datatracker.ietf.org/doc/html/rfc2544#section-26.2
 439     is a single direct measurement.
 440     Frame loss rate https://datatracker.ietf.org/doc/html/rfc2544#section-26.3
 441     is a sequence of direct measurements.
 442
 443     TODO expand: Indirect measurement aims to solve an "inverse function problem",
 444     meaning (a part of) trial measurement output is prescribed, and the quantity
 445     of interest is (derived from) the input parameters of trial measurement
 446     that achieves the prescribed output.
 447     In general this is a hard problem, but if the unknown input parameter
 448     is just one-dimensional quantity, algorithms such as bisection
 449     do converge regardless of outputs seen.
 450     We call any such algorithm examining one-dimensional input as "search".
 451     Of course, some exit condition is needed for the search to end.
 452     In case of Throughput, bisection algorithm tracks both upper bound
 453     and lower bound, with lower bound at the end of search is the quantity
 454     satisfying the definition of Throughput.
 455
 456     ### Metrics other than frames
 457
 458     TODO expand: Small TCP transaction can succeed even if some frames are lost.
 459
 460     TODO expand: It is possible for loss ratio to use different metric than load.
 461     E.g. pps loss ratio when traffic profile uses higher level transactions per second.
 462
 463     ### TODO: Stateful DUT
 464
 465     ### TODO: Stateful traffic
 466 {:/comment}
 467
 468 ## Non-Zero Target Loss Ratios
 469
 470 https://datatracker.ietf.org/doc/html/rfc1242#section-3.17
 471 defines Throughput as:
 472     The maximum rate at which none of the offered frames
 473     are dropped by the device.
 474
 475 and then it says:
 476     Since even the loss of one frame in a
 477     data stream can cause significant delays while
 478     waiting for the higher level protocols to time out,
 479     it is useful to know the actual maximum data
 480     rate that the device can support.
 481
 482 {::comment}
 483
 484     While this may still be true for some protocols,
 485     research has been performed...
 486
 487     TODO: Add this link properly: https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-Y.1541-201112-I!!PDF-E&type=items
 488     TODO: List values from that document, from 10^-3 to 4*10^-6.
 489
 490     ...on other protocols and use cases,
 491     resulting in some small but non-zero loss ratios being considered
 492     as acceptable. Unfortunately, the acceptable value depends on use case
 493     and properties such as TCP window size and round trip time,
 494     so no single value of target loss rate (other than zero)
 495     is considered to be universally applicable.
 496
 497 {:/comment}
 498
 499 New "software DUTs" (traffic forwarding programs running on
 500 commercial-off-the-shelf compute server hardware) frequently exhibit quite
 501 low repeatability of Throughput results per above definition.
 502
 503 This is due to, in general, throughput rates of software DUTs (programs)
 504 being sensitive to server resource allocation by OS during runtime,
 505 as well as any interrupts or blocking of software threads involved
 506 in packet processing.
 507
 508 To deal with this, this document recommends discovery of multiple throughput rates of interest for software DUTs that run on general purpose COTS servers (with x86, AArch64 Instruction Set Architectures):
 509 * throughput rate with target of zero packet loss ratio.
 510 * at least one throughput rate with target of non-zero packet loss ratio.
 511
 512
 513 In our experience, the higher the target loss ratio is,
 514 the better is the repeatability of the corresponding load found.
 515
 516 TODO: Define a good name for a load corresponding to a specific non-zero
 517 target loss ration, while keeping Throughput for the load corresponding
 518 to zero target loss ratio.
 519
 520 This document RECOMMENDS the benchmark groups to search for corresponding loads
 521 to at least one non-zero target loss ratio.
 522 This document does not suggest any particular non-zero target loss ratio value
 523 to search the corresponding load for.
 524
 525 {::comment}
 526     What is worse, some benchmark groups (which groups?; citation needed)
 527     started reporting loads that achieved only "approximate zero loss",
 528     while still calling that a Throughput (and thus becoming non-compliant
 529     with RFC2544).
 530 {:/comment}
 531
 532 # Solution ideas
 533
 534 This document gives several independent ideas on how to lower the (average)
 535 overall search time, while remaining unconditionally compliant with RFC2544
 536 (and adding some of extensions).
 537
 538 This document also specifies one particular way to combine all the ideas
 539 into a single search algorithm class (single logic with few tweakable parameters).
 540
 541 Little to no research has been done into the question of which combination
 542 of ideas achieves the best compromise with respect to overal search time,
 543 high repeatability and high comparability.
 544
 545 TODO: How important it is to discuss particular implementation choices,
 546 especially when motivated by non-deterministic SUT behavior?
 547
 548 ## Short duration trials
 549
 550 https://datatracker.ietf.org/doc/html/rfc2544#section-24
 551 already mentions the possibity of using shorter duration
 552 for trials that are not part of "final determination".
 553
 554 Obviously, the upper and lower bound from a smaller duration trial
 555 can be used as the initial upper and lower bound for the final determination.
 556
 557 MLRsearch makes it clear a re-measurement is always needed
 558 (new trial measurement with the same load but longer duration).
 559 It also specifes what to do if the longer trial is no longer a valid bound
 560 (TODO define?), e.g. start an external search.
 561 Additionaly one halving can be saved during the shorter duration search.
 562
 563 ## FRMOL as reasonable start
 564
 565 TODO expand: Overal search ends with "final determination" search,
 566 preceded by "shorter duration search" preceded by "bound initialization",
 567 where the bounds can be considerably different from min and max load.
 568
 569 For SUTs with high repeatability, the FRMOL is usually a good approximation
 570 of Throughput. But for less repeatable SUTs, forwarding rate (TODO define)
 571 is frequently a bad approximation to Throughput, therefore halving
 572 and other robust-to-worst-case approaches have to be used.
 573 Still, forwarding rate at FRMOL load can be a good initial bound.
 574
 575 ## Non-zero loss ratios
 576
 577 See the "Popularity of non-zero target loss ratios" section above.
 578
 579 TODO: Define "trial measurement result classification criteria",
 580 or keep reusing long phrases without definitions?
 581
 582 A search for a load corresponding to a non-zero target loss rate
 583 is very similar to a search for Throughput,
 584 just the criterion when to increase or decrease the intended load
 585 for the next trial measurement uses the comparison of trial loss ratio
 586 to the target loss ratio (instead of comparing loss count to zero)
 587 Any search algorithm that works for Throughput can be easily used also for
 588 non-zero target loss rates, perhaps with small modifications
 589 in places where the measured forwarding rate is used.
 590
 591 Note that it is possible to search for multiple loss ratio goals if needed.
 592
 593 ## Concurrent ratio search
 594
 595 A single trial measurement result can act as an upper bound for a lower
 596 target loss ratio, and as a lower bound for a higher target loss ratio
 597 at the same time. This is an example of how
 598 it can be advantageous to search for all loss ratio goals "at once",
 599 or at least "reuse" trial measurement result done so far.
 600
 601 Even when a search algorithm is fully deterministic in load selection
 602 while focusing on a single loss ratio and trial duration,
 603 the choice of iteration order between target loss ratios and trial durations
 604 can affect the obtained results in subtle ways.
 605 MLRsearch offers one particular ordering.
 606
 607 {::comment}
 608     It is not clear if the current ordering is "best",
 609     it is not even clear how to measure how good an ordering is.
 610     We would need several models for bad SUT behaviors,
 611     bug-free implementations of different orderings,
 612     simulator to show the distribution of rates found,
 613     distribution of overall durations,
 614     and a criterion of which rate distribution is "bad"
 615     and whether it is worth the time saved.
 616 {:/comment}
 617 {::comment}
 618 {:/comment}
 619
 620 ## Load selection heuristics and shortcuts
 621
 622 Aside of the two heuristics already mentioned (FRMOL based initial bounds
 623 and saving one halving when increasing trial duration),
 624 there are other tricks that can save some overall search time
 625 at the cost of keeping the difference between final lower and upper bound
 626 intentionally large (but still within the precision goal).
 627
 628 TODO: Refer implementation subsections on:
 629 * Uneven splits.
 630 * Rounding the interval width up.
 631 * Using old invalid bounds for interval width guessing.
 632
 633 The impact on overall duration is probably small,
 634 and the effect on result distribution maybe even smaller.
 635 TODO: Is the two-liner above useful at all?
 636
 637 # Non-compliance with RFC2544
 638
 639 It is possible to achieve even faster search times by abandoning
 640 some requirements and suggestions of RFC2544,
 641 mainly by reducing the wait times at start and end of trial.
 642
 643 Such results are therefore no longer compliant with RFC2544
 644 (or at least not unconditionally),
 645 but they may still be useful for internal usage, or for comparing
 646 results of different DUTs achieved with an identical non-compliant algorithm.
 647
 648 TODO: Refer to the subsection with CSIT customizations.
 649
 650 # Additional Requirements
 651
 652 RFC2544 can be understood as having a number of implicit requirements.
 653 They are made explicit in this section
 654 (as requirements for this document, not for RFC2544).
 655
 656 Recommendations on how to properly address the implicit requirements
 657 are out of scope of this document.
 658
 659 {::comment}
 660
 661     Although some (insufficient) ideas are proposed.
 662
 663 {:/comment}
 664
 665 ## TODO: Search Stop Criteria
 666
 667 TODO: Mention the timeout parameter?
 668
 669 {::comment}
 670
 671     TODO: highlight importance of results consistency
 672     for SUT performance trending and anomaly detection.
 673
 674 {:/comment}
 675
 676 ## Reliability of Test Equipment
 677
 678 Both TG and TA MUST be able to handle correctly
 679 every intended load used during the search.
 680
 681 On TG side, the difference between Intended Load and Offered Load
 682 MUST be small.
 683
 684 TODO: How small? Difference of one packet may not be measurable
 685 due to time uncertainties.
 686
 687 {::comment}
 688
 689     Maciek: 1 packet out of 10M, that's 10**-7 accuracy.
 690
 691     Vratko: For example, TRex uses several "worker" threads, each doing its own
 692     rounding on how many packets to send, separately per each traffic stream.
 693     For high loads and durations, the observed number of frames transmitted
 694     can differ from the expected (fractional) value by tens of frames.
 695
 696 {:/comment}
 697
 698 TODO expand: time uncertainty.
 699
 700 To ensure that, max load (see Terminology) has to be set to low enough value.
 701 Benchmark groups MAY list the max load value used,
 702 especially if the Throughput value is equal (or close) to the max load.
 703
 704 {::comment}
 705
 706     The following is probably out of scope of this document,
 707     but can be useful when put into a separate document.
 708
 709     TODO expand: If it results in smaller Throughput reported,
 710     it is not a big issue. Treat similarly to bandwidth and PPS limits of NICs.
 711
 712     TODO expand: TA dropping packets when loaded only lowers Throughput,
 713     so not an issue.
 714
 715     TODO expand: TG sending less packets but stopping at target duration
 716     is also fine, as long as the forwarding rate is used as Throughput value,
 717     not the higher intended load.
 718
 719     TODO expand: Duration stretching is not fine.
 720     Neither "check for actual duration" nor "start+sleep+stop"
 721     are reliable solutions due to time overheads and uncertainty
 722     of TG starting/stopping traffic (and TA stopping counting packets).
 723
 724 {:/comment}
 725
 726 Solutions (even problem formulations) for the following open problems
 727 are outside of the scope of this document:
 728 * Detecting when the test equipment operates above its safe load.
 729 * Finding a large but safe load value.
 730 * Correcting any result affected by max load value not being a safe load.
 731
 732 {::comment}
 733
 734     TODO: Mention 90% of self-test as an idea:
 735     https://datatracker.ietf.org/doc/html/rfc8219#section-9.2.1
 736
 737     This is pointing to DNS testing, nothing to do with throughput,
 738     so how is it relevant here?
 739
 740 {:/comment}
 741
 742 {::comment}
 743
 744     Part of discussion on BMWG mailing list (with small edits):
 745
 746     This is a hard issue.
 747     The algorithm as described has no way of knowing
 748     which part of the whole system is limiting the performance.
 749
 750     It could be SUT only (no problem, testing SUT as expected),
 751     it could be TG only (can be mitigated by TG self-test
 752     and using small enough loads).
 753
 754     But it could also be an interaction between DUT and TG.
 755     Imagine a TG (the Traffic Analyzer part) which is only able
 756     to handle incoming traffic up to some rate,
 757     but passes the self-test as the Generator part has maximal rate
 758     not larger than that. But what if SUT turns that steady rate
 759     into long-enough bursts of a higher rate (with delays between bursts
 760     large enough, so average forwarding rate matches the load).
 761     This way TA will see some packets as missing (when its buffers
 762     fill up), even though SUT has processed them correctly.
 763
 764 {:/comment}
 765
 766 ### Very late frames
 767
 768 {::comment}
 769
 770     In CSIT we are aggressive at skipping all wait times around trial,
 771     but few of DUTs have large enough buffers.
 772     Or there is another reason why we are seeing negative loss counts.
 773
 774 {:/comment}
 775
 776
 777 RFC2544 requires quite conservative time delays
 778 see https://datatracker.ietf.org/doc/html/rfc2544#section-23
 779 to prevent frames buffered in one trial measurement
 780 to be counted as received in a subsequent trial measurement.
 781
 782 However, for some SUTs it may still be possible to buffer enough frames,
 783 so they are still sending them (perhaps in bursts)
 784 when the next trial measurement starts.
 785 Sometimes, this can be detected as a negative trial loss count, e.g. TA receiving
 786 more frames than TG has sent during this trial measurement. Frame duplication
 787 is another way of causing the negative trial loss count.
 788
 789 https://datatracker.ietf.org/doc/html/rfc2544#section-10
 790 recommends to use sequence numbers in frame payloads,
 791 but generating and verifying them requires test equipment resources,
 792 which may be not plenty enough to suport at high loads.
 793 (Using low enough max load would work, but frequently that would be
 794 smaller than SUT's sctual Throughput.)
 795
 796 RFC2544 does not offer any solution to the negative loss problem,
 797 except implicitly treating negative trial loss counts
 798 the same way as positive trial loss counts.
 799
 800 This document also does not offer any practical solution.
 801
 802 Instead, this document SUGGESTS the search algorithm to take any precaution
 803 necessary to avoid very late frames.
 804
 805 This document also REQUIRES any detected duplicate frames to be counted
 806 as additional lost frames.
 807 This document also REQUIRES, any negative trial loss ratio
 808 to be treated as positive trial loss ratio of the same absolute value.
 809
 810 {::comment}
 811
 812     !!! Make sure this is covered elsewere, at least in better comments. !!!
 813
 814     ## TODO: Bad behavior of SUT
 815
 816     (Highest load with always zero loss can be quite far from lowest load
 817     with always nonzero loss.)
 818     (Non-determinism: warm up, periodic "stalls", perf decrease over time, ...)
 819
 820     Big buffers:
 821     http://www.hit.bme.hu/~lencse/publications/ECC-2017-B-M-DNS64-revised.pdf
 822     See page 8 and search for the word "gaming".
 823
 824 {:/comment}
 825
 826 !!! Nothing below is up-to-date with draft v02. !!!
 827
 828 # MLRsearch Background
 829
 830 TODO: Old section, probably obsoleted by preceding section(s).
 831
 832 Multiple Loss Ratio search (MLRsearch) is a packet throughput search
 833 algorithm suitable for deterministic systems (as opposed to
 834 probabilistic systems). MLRsearch discovers multiple packet throughput
 835 rates in a single search, each rate is associated with a distinct
 836 Packet Loss Ratio (PLR) criterion.
 837
 838 For cases when multiple rates need to be found, this property makes
 839 MLRsearch more efficient in terms of time execution, compared to
 840 traditional throughput search algorithms that discover a single packet
 841 rate per defined search criteria (e.g. a binary search specified by
 842 [RFC2544]). MLRsearch reduces execution time even further by relying on
 843 shorter trial durations of intermediate steps, with only the final
 844 measurements conducted at the specified final trial duration. This
 845 results in the shorter overall search execution time when compared to a
 846 traditional binary search, while guaranteeing the same results for
 847 deterministic systems.
 848
 849 In practice, two rates with distinct PLRs are commonly used for packet
 850 throughput measurements of NFV systems: Non Drop Rate (NDR) with PLR=0
 851 and Partial Drop Rate (PDR) with PLR>0. The rest of this document
 852 describes MLRsearch with NDR and PDR pair as an example.
 853
 854 Similarly to other throughput search approaches like binary search,
 855 MLRsearch is effective for SUTs/DUTs with PLR curve that is
 856 non-decreasing with growing offered load. It may not be as
 857 effective for SUTs/DUTs with abnormal PLR curves, although
 858 it will always converge to some value.
 859
 860 MLRsearch relies on traffic generator to qualify the received packet
 861 stream as error-free, and invalidate the results if any disqualifying
 862 errors are present e.g. out-of-sequence frames.
 863
 864 MLRsearch can be applied to both uni-directional and bi-directional
 865 throughput tests.
 866
 867 For bi-directional tests, MLRsearch rates and ratios are aggregates of
 868 both directions, based on the following assumptions:
 869
 870 * Traffic transmitted by traffic generator and received by SUT/DUT
 871   has the same packet rate in each direction,
 872   in other words the offered load is symmetric.
 873 * SUT/DUT packet processing capacity is the same in both directions,
 874   resulting in the same packet loss under load.
 875
 876 MLRsearch can be applied even without those assumptions,
 877 but in that case the aggregate loss ratio is less useful as a metric.
 878
 879 MLRsearch can be used for network transactions consisting of more than
 880 just one packet, or anything else that has intended load as input
 881 and loss ratio as output (duration as input is optional).
 882 This text uses mostly packet-centric language.
 883
 884 # MLRsearch Overview
 885
 886 The main properties of MLRsearch:
 887
 888 * MLRsearch is a duration aware multi-phase multi-rate search algorithm:
 889   * Initial Phase determines promising starting interval for the search.
 890   * Intermediate Phases progress towards defined final search criteria.
 891   * Final Phase executes measurements according to the final search
 892     criteria.
 893   * Final search criteria are defined by following inputs:
 894     * Target PLRs (e.g. 0.0 and 0.005 when searching for NDR and PDR).
 895     * Final trial duration.
 896     * Measurement resolution.
 897 * Initial Phase:
 898   * Measure MRR over initial trial duration.
 899   * Measured MRR is used as an input to the first intermediate phase.
 900 * Multiple Intermediate Phases:
 901   * Trial duration:
 902     * Start with initial trial duration in the first intermediate phase.
 903     * Converge geometrically towards the final trial duration.
 904   * Track all previous trial measurement results:
 905     * Duration, offered load and loss ratio are tracked.
 906     * Effective loss ratios are tracked.
 907       * While in practice, real loss ratios can decrease with increasing load,
 908         effective loss ratios never decrease. This is achieved by sorting
 909         results by load, and using the effective loss ratio of the previous load
 910         if the current loss ratio is smaller than that.
 911     * The algorithm queries the results to find best lower and upper bounds.
 912       * Effective loss ratios are always used.
 913     * The phase ends if all target loss ratios have tight enough bounds.
 914   * Search:
 915     * Iterate over target loss ratios in increasing order.
 916     * If both upper and lower bound are in measurement results for this duration,
 917       apply bisect until the bounds are tight enough,
 918       and continue with next loss ratio.
 919     * If a bound is missing for this duration, but there exists a bound
 920       from the previous duration (compatible with the other bound
 921       at this duration), re-measure at the current duration.
 922     * If a bound in one direction (upper or lower) is missing for this duration,
 923       and the previous duration does not have a compatible bound,
 924       compute the current "interval size" from the second tightest bound
 925       in the other direction (lower or upper respectively)
 926       for the current duration, and choose next offered load for external search.
 927     * The logic guarantees that a measurement is never repeated with both
 928       duration and offered load being the same.
 929     * The logic guarantees that measurements for higher target loss ratio
 930       iterations (still within the same phase duration) do not affect validity
 931       and tightness of bounds for previous target loss ratio iterations
 932       (at the same duration).
 933   * Use of internal and external searches:
 934     * External search:
 935       * It is a variant of "exponential search".
 936       * The "interval size" is multiplied by a configurable constant
 937         (powers of two work well with the subsequent internal search).
 938     * Internal search:
 939       * A variant of binary search that measures at offered load between
 940         the previously found bounds.
 941       * The interval does not need to be split into exact halves,
 942         if other split can get to the target width goal faster.
 943         * The idea is to avoid returning interval narrower than the current
 944           width goal. See sample implementation details, below.
 945 * Final Phase:
 946   * Executed with the final test trial duration, and the final width
 947     goal that determines resolution of the overall search.
 948 * Intermediate Phases together with the Final Phase are called
 949   Non-Initial Phases.
 950 * The returned bounds stay within prescribed min_rate and max_rate.
 951   * When returning min_rate or max_rate, the returned bounds may be invalid.
 952     * E.g. upper bound at max_rate may come from a measurement
 953       with loss ratio still not higher than the target loss ratio.
 954
 955 The main benefits of MLRsearch vs. binary search include:
 956
 957 * In general, MLRsearch is likely to execute more trials overall, but
 958   likely less trials at a set final trial duration.
 959 * In well behaving cases, e.g. when results do not depend on trial
 960   duration, it greatly reduces (>50%) the overall duration compared to a
 961   single PDR (or NDR) binary search over duration, while finding
 962   multiple drop rates.
 963 * In all cases MLRsearch yields the same or similar results to binary
 964   search.
 965 * Note: both binary search and MLRsearch are susceptible to reporting
 966   non-repeatable results across multiple runs for very bad behaving
 967   cases.
 968
 969 Caveats:
 970
 971 * Worst case MLRsearch can take longer than a binary search, e.g. in case of
 972   drastic changes in behaviour for trials at varying durations.
 973   * Re-measurement at higher duration can trigger a long external search.
 974     That never happens in binary search, which uses the final duration
 975     from the start.
 976
 977 # Sample Implementation
 978
 979 Following is a brief description of a sample MLRsearch implementation,
 980 which is a simplified version of the existing implementation.
 981
 982 ## Input Parameters
 983
 984 1. **max_rate** - Maximum Transmit Rate (MTR) of packets to
 985    be used by external traffic generator implementing MLRsearch,
 986    limited by the actual Ethernet link(s) rate, NIC model or traffic
 987    generator capabilities.
 988 2. **min_rate** - minimum packet transmit rate to be used for
 989    measurements. MLRsearch fails if lower transmit rate needs to be
 990    used to meet search criteria.
 991 3. **final_trial_duration** - required trial duration for final rate
 992    measurements.
 993 4. **initial_trial_duration** - trial duration for initial MLRsearch phase.
 994 5. **final_relative_width** - required measurement resolution expressed as
 995    (lower_bound, upper_bound) interval width relative to upper_bound.
 996 6. **packet_loss_ratios** - list of maximum acceptable PLR search criteria.
 997 7. **number_of_intermediate_phases** - number of phases between the initial
 998    phase and the final phase. Impacts the overall MLRsearch duration.
 999    Less phases are required for well behaving cases, more phases
1000    may be needed to reduce the overall search duration for worse behaving cases.
1001
1002 ## Initial Phase
1003
1004 1. First trial measures at configured maximum transmit rate (MTR) and
1005    discovers maximum receive rate (MRR).
1006    * IN: trial_duration = initial_trial_duration.
1007    * IN: offered_transmit_rate = maximum_transmit_rate.
1008    * DO: single trial.
1009    * OUT: measured loss ratio.
1010    * OUT: MRR = measured receive rate.
1011    Received rate is computed as intended load multiplied by pass ratio
1012    (which is one minus loss ratio). This is useful when loss ratio is computed
1013    from a different metric than intended load. For example, intended load
1014    can be in transactions (multiple packets each), but loss ratio is computed
1015    on level of packets, not transactions.
1016
1017    * Example: If MTR is 10 transactions per second, and each transaction has
1018      10 packets, and receive rate is 90 packets per second, then loss rate
1019      is 10%, and MRR is computed to be 9 transactions per second.
1020
1021    If MRR is too close to MTR, MRR is set below MTR so that interval width
1022    is equal to the width goal of the first intermediate phase.
1023    If MRR is less than min_rate, min_rate is used.
1024 2. Second trial measures at MRR and discovers MRR2.
1025    * IN: trial_duration = initial_trial_duration.
1026    * IN: offered_transmit_rate = MRR.
1027    * DO: single trial.
1028    * OUT: measured loss ratio.
1029    * OUT: MRR2 = measured receive rate.
1030    If MRR2 is less than min_rate, min_rate is used.
1031    If loss ratio is less or equal to the smallest target loss ratio,
1032    MRR2 is set to a value above MRR, so that interval width is equal
1033    to the width goal of the first intermediate phase.
1034    MRR2 could end up being equal to MTR (for example if both measurements so far
1035    had zero loss), which was already measured, step 3 is skipped in that case.
1036 3. Third trial measures at MRR2.
1037    * IN: trial_duration = initial_trial_duration.
1038    * IN: offered_transmit_rate = MRR2.
1039    * DO: single trial.
1040    * OUT: measured loss ratio.
1041    * OUT: MRR3 = measured receive rate.
1042    If MRR3 is less than min_rate, min_rate is used.
1043    If step 3 is not skipped, the first trial measurement is forgotten.
1044    This is done because in practice (if MRR2 is above MRR), external search
1045    from MRR and MRR2 is likely to lead to a faster intermediate phase
1046    than a bisect between MRR2 and MTR.
1047
1048 ## Non-Initial Phases
1049
1050 1. Main phase loop:
1051    1. IN: trial_duration for the current phase. Set to
1052       initial_trial_duration for the first intermediate phase; to
1053       final_trial_duration for the final phase; or to the element of
1054       interpolating geometric sequence for other intermediate phases.
1055       For example with two intermediate phases, trial_duration of the
1056       second intermediate phase is the geometric average of
1057       initial_trial_duration and final_trial_duration.
1058    2. IN: relative_width_goal for the current phase. Set to
1059       final_relative_width for the final phase; doubled for each
1060       preceding phase. For example with two intermediate phases, the
1061       first intermediate phase uses quadruple of final_relative_width
1062       and the second intermediate phase uses double of
1063       final_relative_width.
1064    3. IN: Measurement results from the previous phase (previous duration).
1065    4. Internal target ratio loop:
1066       1. IN: Target loss ratio for this iteration of ratio loop.
1067       2. IN: Measurement results from all previous ratio loop iterations
1068          of current phase (current duration).
1069       3. DO: According to the procedure described in point 2:
1070          1. either exit the phase (by jumping to 1.5),
1071          2. or exit loop iteration (by continuing with next target loss ratio,
1072             jumping to 1.4.1),
1073          3. or calculate new transmit rate to measure with.
1074       4. DO: Perform the trial measurement at the new transmit rate and
1075          current trial duration, compute its loss ratio.
1076       5. DO: Add the result and go to next iteration (1.4.1),
1077          including the added trial result in 1.4.2.
1078    5. OUT: Measurement results from this phase.
1079    6. OUT: In the final phase, bounds for each target loss ratio
1080       are extracted and returned.
1081       1. If a valid bound does not exist, use min_rate or max_rate.
1082 2. New transmit rate (or exit) calculation (for point 1.4.3):
1083    1. If the previous duration has the best upper and lower bound,
1084       select the middle point as the new transmit rate.
1085       1. See 2.5.3. below for the exact splitting logic.
1086       2. This can be a no-op if interval is narrow enough already,
1087          in that case continue with 2.2.
1088       3. Discussion, assuming the middle point is selected and measured:
1089          1. Regardless of loss rate measured, the result becomes
1090             either best upper or best lower bound at current duration.
1091          2. So this condition is satisfied at most once per iteration.
1092          3. This also explains why previous phase has double width goal:
1093             1. We avoid one more bisection at previous phase.
1094             2. At most one bound (per iteration) is re-measured
1095                with current duration.
1096             3. Each re-measurement can trigger an external search.
1097             4. Such surprising external searches are the main hurdle
1098                in achieving low overall search durations.
1099             5. Even without 1.1, there is at most one external search
1100                per phase and target loss ratio.
1101             6. But without 1.1 there can be two re-measurements,
1102                each coming with a risk of triggering external search.
1103    2. If the previous duration has one bound best, select its transmit rate.
1104       In deterministic case this is the last measurement needed this iteration.
1105    3. If only upper bound exists in current duration results:
1106       1. This can only happen for the smallest target loss ratio.
1107       2. If the upper bound was measured at min_rate,
1108          exit the whole phase early (not investigating other target loss ratios).
1109       3. Select new transmit rate using external search:
1110          1. For computing previous interval size, use:
1111             1. second tightest bound at current duration,
1112             2. or tightest bound of previous duration,
1113                if compatible and giving a more narrow interval,
1114             3. or target interval width if none of the above is available.
1115             4. In any case increase to target interval width if smaller.
1116          2. Quadruple the interval width.
1117          3. Use min_rate if the new transmit rate is lower.
1118    4. If only lower bound exists in current duration results:
1119       1. If the lower bound was measured at max_rate,
1120          exit this iteration (continue with next lowest target loss ratio).
1121       2. Select new transmit rate using external search:
1122          1. For computing previous interval size, use:
1123             1. second tightest bound at current duration,
1124             2. or tightest bound of previous duration,
1125                if compatible and giving a more narrow interval,
1126             3. or target interval width if none of the above is available.
1127             4. In any case increase to target interval width if smaller.
1128          2. Quadruple the interval width.
1129          3. Use max_rate if the new transmit rate is higher.
1130    5. The only remaining option is both bounds in current duration results.
1131       1. This can happen in two ways, depending on how the lower bound
1132          was chosen.
1133          1. It could have been selected for the current loss ratio,
1134             e.g. in re-measurement (2.2) or in initial bisect (2.1).
1135          2. It could have been found as an upper bound for the previous smaller
1136             target loss ratio, in which case it might be too low.
1137          3. The algorithm does not track which one is the case,
1138             as the decision logic works well regardless.
1139       2. Compute "extending down" candidate transmit rate exactly as in 2.3.
1140       3. Compute "bisecting" candidate transmit rate:
1141          1. Compute the current interval width from the two bounds.
1142          2. Express the width as a (float) multiple of the target width goal
1143             for this phase.
1144          3. If the multiple is not higher than one, it means the width goal
1145             is met. Exit this iteration and continue with next higher
1146             target loss ratio.
1147          4. If the multiple is two or less, use half of that
1148             for new width if the lower subinterval.
1149          5. Round the multiple up to nearest even integer.
1150          6. Use half of that for new width if the lower subinterval.
1151          7. Example: If lower bound is 2.0 and upper bound is 5.0, and width
1152             goal is 1.0, the new candidate transmit rate will be 4.0.
1153             This can save a measurement when 4.0 has small loss.
1154             Selecting the average (3.5) would never save a measurement,
1155             giving more narrow bounds instead.
1156       4. If either candidate computation want to exit the iteration,
1157          do as bisecting candidate computation says.
1158       5. The remaining case is both candidates wanting to measure at some rate.
1159          Use the higher rate. This prefers external search down narrow enough
1160          interval, competing with perfectly sized lower bisect subinterval.
1161
1162 # FD.io CSIT Implementation
1163
1164 The only known working implementation of MLRsearch is in
1165 the open-source code running in Linux Foundation
1166 FD.io CSIT project [FDio-CSIT-MLRsearch] as part of
1167 a Continuous Integration / Continuous Development (CI/CD) framework.
1168
1169 MLRsearch is also available as a Python package in [PyPI-MLRsearch].
1170
1171 ## Additional details
1172
1173 This document so far has been describing a simplified version of
1174 MLRsearch algorithm. The full algorithm as implemented in CSIT contains
1175 additional logic, which makes some of the details (but not general
1176 ideas) above incorrect. Here is a short description of the additional
1177 logic as a list of principles, explaining their main differences from
1178 (or additions to) the simplified description, but without detailing
1179 their mutual interaction.
1180
1181 1. Logarithmic transmit rate.
1182    * In order to better fit the relative width goal, the interval
1183      doubling and halving is done differently.
1184    * For example, the middle of 2 and 8 is 4, not 5.
1185 2. Timeout for bad cases.
1186    * The worst case for MLRsearch is when each phase converges to
1187      intervals way different than the results of the previous phase.
1188    * Rather than suffer total search time several times larger than pure
1189      binary search, the implemented tests fail themselves when the
1190      search takes too long (given by argument *timeout*).
1191 3. Intended count.
1192    * The number of packets to send during the trial should be equal to
1193      the intended load multiplied by the duration.
1194      * Also multiplied by a coefficient, if loss ratio is calculated
1195        from a different metric.
1196        * Example: If a successful transaction uses 10 packets,
1197          load is given in transactions per second, but loss ratio is calculated
1198          from packets, so the coefficient to get intended count of packets
1199          is 10.
1200    * But in practice that does not work.
1201      * It could result in a fractional number of packets,
1202      * so it has to be rounded in a way traffic generator chooses,
1203      * which may depend on the number of traffic flows
1204        and traffic generator worker threads.
1205 4. Attempted count. As the real number of intended packets is not known exactly,
1206    the computation uses the number of packets traffic generator reports as sent.
1207    Unless overridden by the next point.
1208 5. Duration stretching.
1209    * In some cases, traffic generator may get overloaded,
1210      causing it to take significantly longer (than duration) to send all packets.
1211    * The implementation uses an explicit stop,
1212      * causing lower attempted count in those cases.
1213    * The implementation tolerates some small difference between
1214      attempted count and intended count.
1215      * 10 microseconds worth of traffic is sufficient for our tests.
1216    * If the difference is higher, the unsent packets are counted as lost.
1217      * This forces the search to avoid the regions of high duration stretching.
1218      * The final bounds describe the performance of not just SUT,
1219        but of the whole system, including the traffic generator.
1220 6. Excess packets.
1221    * In some test (e.g. using TCP flows) Traffic generator reacts to packet loss
1222      by retransmission. Usually, such packet loss is already affecting loss ratio.
1223      If a test also wants to treat retransmissions due to heavily delayed packets
1224      also as a failure, this is once again visible as a mismatch between
1225      the intended count and the attempted count.
1226    * The CSIT implementation simply looks at absolute value of the difference,
1227      so it offers the same small tolerance before it starts marking a "loss".
1228 7. For result processing, we use lower bounds and ignore upper bounds.
1229
1230 ### FD.io CSIT Input Parameters
1231
1232 1. **max_rate** - Typical values: 2 * 14.88 Mpps for 64B
1233    10GE link rate, 2 * 18.75 Mpps for 64B 40GE NIC (specific model).
1234 2. **min_rate** - Value: 2 * 9001 pps (we reserve 9000 pps
1235    for latency measurements).
1236 3. **final_trial_duration** - Value: 30.0 seconds.
1237 4. **initial_trial_duration** - Value: 1.0 second.
1238 5. **final_relative_width** - Value: 0.005 (0.5%).
1239 6. **packet_loss_ratios** - Value: 0.0, 0.005 (0.0% for NDR, 0.5% for PDR).
1240 7. **number_of_intermediate_phases** - Value: 2.
1241    The value has been chosen based on limited experimentation to date.
1242    More experimentation needed to arrive to clearer guidelines.
1243 8. **timeout** - Limit for the overall search duration (for one search).
1244    If MLRsearch oversteps this limit, it immediately declares the test failed,
1245    to avoid wasting even more time on a misbehaving SUT.
1246    Value: 600.0 (seconds).
1247 9. **expansion_coefficient** - Width multiplier for external search.
1248    Value: 4.0 (interval width is quadroupled).
1249    Value of 2.0 is best for well-behaved SUTs, but value of 4.0 has been found
1250    to decrease overall search time for worse-behaved SUT configurations,
1251    contributing more to the overall set of different SUT configurations tested.
1252
1253
1254 ## Example MLRsearch Run
1255
1256
1257 The following list describes a search from a real test run in CSIT
1258 (using the default input values as above).
1259
1260 * Initial phase, trial duration 1.0 second.
1261
1262 Measurement 1, intended load 18750000.0 pps (MTR),
1263 measured loss ratio 0.7089514628479618 (valid upper bound for both NDR and PDR).
1264
1265 Measurement 2, intended load 5457160.071600716 pps (MRR),
1266 measured loss ratio 0.018650817320118702 (new tightest upper bounds).
1267
1268 Measurement 3, intended load 5348832.933500009 pps (slightly less than MRR2
1269 in preparation for first intermediate phase target interval width),
1270 measured loss ratio 0.00964383362905351 (new tightest upper bounds).
1271
1272 * First intermediate phase starts, trial duration still 1.0 seconds.
1273
1274 Measurement 4, intended load 4936605.579021453 pps (no lower bound,
1275 performing external search downwards, for NDR),
1276 measured loss ratio 0.0 (valid lower bound for both NDR and PDR).
1277
1278 Measurement 5, intended load 5138587.208637197 pps (bisecting for NDR),
1279 measured loss ratio 0.0 (new tightest lower bounds).
1280
1281 Measurement 6, intended load 5242656.244044665 pps (bisecting),
1282 measured loss ratio 0.013523745379347257 (new tightest upper bounds).
1283
1284 * Both intervals are narrow enough.
1285 * Second intermediate phase starts, trial duration 5.477225575051661 seconds.
1286
1287 Measurement 7, intended load 5190360.904111567 pps (initial bisect for NDR),
1288 measured loss ratio 0.0023533920869969953 (NDR upper bound, PDR lower bound).
1289
1290 Measurement 8, intended load 5138587.208637197 pps (re-measuring NDR lower bound),
1291 measured loss ratio 1.2080222912800403e-06 (new tightest NDR upper bound).
1292
1293 * The two intervals have separate bounds from now on.
1294
1295 Measurement 9, intended load 4936605.381062318 pps (external NDR search down),
1296 measured loss ratio 0.0 (new valid NDR lower bound).
1297
1298 Measurement 10, intended load 5036583.888432355 pps (NDR bisect),
1299 measured loss ratio 0.0 (new tightest NDR lower bound).
1300
1301 Measurement 11, intended load 5087329.903232804 pps (NDR bisect),
1302 measured loss ratio 0.0 (new tightest NDR lower bound).
1303
1304 * NDR interval is narrow enough, PDR interval not ready yet.
1305
1306 Measurement 12, intended load 5242656.244044665 pps (re-measuring PDR upper bound),
1307 measured loss ratio 0.0101174866190136 (still valid PDR upper bound).
1308
1309 * Also PDR interval is narrow enough, with valid bounds for this duration.
1310 * Final phase starts, trial duration 30.0 seconds.
1311
1312 Measurement 13, intended load 5112894.3238511775 pps (initial bisect for NDR),
1313 measured loss ratio 0.0 (new tightest NDR lower bound).
1314
1315 Measurement 14, intended load 5138587.208637197 (re-measuring NDR upper bound),
1316 measured loss ratio 2.030389804256833e-06 (still valid PDR upper bound).
1317
1318 * NDR interval is narrow enough, PDR interval not yet.
1319
1320 Measurement 15, intended load 5216443.04126728 pps (initial bisect for PDR),
1321 measured loss ratio 0.005620871287975237 (new tightest PDR upper bound).
1322
1323 Measurement 16, intended load 5190360.904111567 (re-measuring PDR lower bound),
1324 measured loss ratio 0.0027629971184465604 (still valid PDR lower bound).
1325
1326 * PDR interval is also narrow enough.
1327 * Returning bounds:
1328 * NDR_LOWER = 5112894.3238511775 pps; NDR_UPPER = 5138587.208637197 pps;
1329 * PDR_LOWER = 5190360.904111567 pps; PDR_UPPER = 5216443.04126728 pps.
1330
1331 # IANA Considerations
1332
1333 No requests of IANA.
1334
1335 # Security Considerations
1336
1337 Benchmarking activities as described in this memo are limited to
1338 technology characterization of a DUT/SUT using controlled stimuli in a
1339 laboratory environment, with dedicated address space and the constraints
1340 specified in the sections above.
1341
1342 The benchmarking network topology will be an independent test setup and
1343 MUST NOT be connected to devices that may forward the test traffic into
1344 a production network or misroute traffic to the test management network.
1345
1346 Further, benchmarking is performed on a "black-box" basis, relying
1347 solely on measurements observable external to the DUT/SUT.
1348
1349 Special capabilities SHOULD NOT exist in the DUT/SUT specifically for
1350 benchmarking purposes. Any implications for network security arising
1351 from the DUT/SUT SHOULD be identical in the lab and in production
1352 networks.
1353
1354 # Acknowledgements
1355
1356 Many thanks to Alec Hothan of OPNFV NFVbench project for thorough
1357 review and numerous useful comments and suggestions.
1358
1359 --- back