docs/content/methodology/measurements/data_plane_throughput/mlr_search.md

   1 ---
   2 title: "MLR Search"
   3 weight: 2
   4 ---
   5
   6 # MLR Search
   7
   8 ## Overview
   9
  10 Multiple Loss Ratio search (MLRsearch) tests use an optimized search algorithm
  11 implemented in FD.io CSIT project. MLRsearch discovers conditional throughput
  12 corresponding to any number of loss ratio goals, within a single search.
  13
  14 Two loss ratio goals are of interest in FD.io CSIT, leading to Non-Drop Rate
  15 (NDR, loss ratio goal is exact zero) and Partial Drop Rate
  16 (PDR, 0.5% loss ratio goal).
  17 Instead of a single long trial, a sequence of short (1s) trials is done.
  18 Thus, instead of final trial duration, a duration sum (21s) is prescribed.
  19 This allows the algorithm to make a decision sooner,
  20 when the results are quite one-sided.
  21 Also, only one half of the trial results is required to meet
  22 the loss ratio requirement, making the conditional throughput more stable.
  23 The conditional throughput in this case is in principle the median forwarding rate
  24 among all trials at the relevant lower bound intended load.
  25 In practice, the search stops when missing trial results cannot
  26 disprove the load as a lower bound, so conditional throughput
  27 is the worst forwarding rate among the measured good trials.
  28
  29 MLRsearch discovers all the loads in single search, reducing required time
  30 duration compared to separate `binary search`es[^1] for each rate. Overall
  31 search time is reduced even further by relying on shorter trial
  32 duration sums for intermediate targets, with only measurements for
  33 final targets require the full duration sum. This results in the
  34 shorter overall execution time when compared to standard NDR/PDR binary
  35 search, while guaranteeing similar results.
  36
  37     Note: The conditional throughput is *always* reported by Robot code
  38     as a bi-directional aggregate of two (usually symmetric)
  39     uni-directional packet rates received and reported by an
  40     external traffic generator (TRex), unless the test specifically requires
  41     unidirectional traffic. The underlying Python library uses
  42     unidirectional values instead, as min and max load are given for those.
  43
  44 ## Search Implementation
  45
  46 Detailed description of the MLRsearch algorithm is included in the IETF
  47 draft
  48 [draft-ietf-bmwg-mlrsearch](https://datatracker.ietf.org/doc/html/draft-ietf-bmwg-mlrsearch)
  49 that is in the process of being standardized in the IETF Benchmarking
  50 Methodology Working Group (BMWG).
  51
  52 MLRsearch is also available as a
  53 [PyPI (Python Package Index) library](https://pypi.org/project/MLRsearch/).
  54
  55 ## Algorithm highlights
  56
  57 MRR and receive rate at MRR load are used as initial guesses for the search.
  58
  59 All previously measured trials (except the very first one which acts
  60 as a warm-up) are taken into consideration.
  61
  62 For every loss ratio goal, the relevant upper and lower bound
  63 (intended loads, among loads of large enough duration sum) form an interval.
  64 Exit condition is given by that interval reaching low enough relative width.
  65 Small enough width is achieved by bisecting the current interval.
  66 The bisection can be uneven, to save measurements based on information theory.
  67 The width value is 0.5%, the same as PDR goal loss ratio,
  68 as smaller values may report PDR conditional throughput smaller than NDR.
  69
  70 Switching to higher trial duration sum generally requires additional trials
  71 at a load from previous duration sum target.
  72 When this refinement does not confirm previous bound classification
  73 (e.g. a lower bound for preceding target
  74 becomes an upper bound of the new target due to new trail results),
  75 external search is used to find close enough bound of the lost type.
  76 External search is a generalization of the first stage of
  77 `exponential search`[^2].
  78
  79 A preceding target uses double of the next width goal,
  80 because one bisection is always safe before risking external search.
  81
  82 As different search targets are interested at different loads,
  83 lower intended load are measured first,
  84 as that approach saves more time when trial results are not very consistent.
  85 Other heuristics are there, aimed to prevent unneccessarily narrow intervals,
  86 and to handle corner cases around min and max load.
  87
  88 ## Deviations from RFC 2544
  89
  90 RFC 2544 implies long final trial duration (just one long trial is needed
  91 for classification to lower or uper bound, so exceed ratio does not matter).
  92 With 1s trials and 0.5 exceed ratio, NDR values reported by CSIT
  93 are likely higher than RFC 2544 throughput (especially for less stable tests).
  94
  95 CSIT does not have any explicit wait times before and after trial traffic.
  96 (But the TRex-based measurer takes almost half a second between targets.)
  97
  98 Small difference between intended load and offered load is tolerated,
  99 mainly due to various time overheads preventing precise measurement
 100 of the traffic duration (and TRex can sometimes suffer from duration
 101 stretching). Large difference is reported as unsent packets
 102 (measurement is forcibly stopped after given time), counted as
 103 a packet loss, so search focuses on loads actually achievable by TRex.
 104
 105 In some tests, negative loss count is observed (TRex sees more packets
 106 coming back to it than TRex sent this trial). CSIT code treats that
 107 as a packet loss (as if VPP duplicated the packets),
 108 but TRex does not check other packets for duplication
 109 (as many traffic profiles generate non-unique packets).
 110
 111 [^1]: [binary search](https://en.wikipedia.org/wiki/Binary_search)
 112 [^2]: [exponential search](https://en.wikipedia.org/wiki/Exponential_search)