X-Git-Url: https://gerrit.fd.io/r/gitweb?a=blobdiff_plain;f=docs%2Freport%2Fintroduction%2Fmethodology_plrsearch.rst;h=65ac59022277385ab36b4dfc2bd8c401a903328d;hb=a8c3f441fa595311f60bcc634a2720f204ced733;hp=b21fad367719297adc9d00ea2aa366919a87a24f;hpb=971d0585f457e733a82349443ae2ff1cb39b5e11;p=csit.git diff --git a/docs/report/introduction/methodology_plrsearch.rst b/docs/report/introduction/methodology_plrsearch.rst index b21fad3677..65ac590222 100644 --- a/docs/report/introduction/methodology_plrsearch.rst +++ b/docs/report/introduction/methodology_plrsearch.rst @@ -1,4 +1,4 @@ -.. _plrsearch_algorithm: +.. _`PLRsearch algorithm`: PLRsearch ^^^^^^^^^ @@ -13,7 +13,7 @@ Eventually, a better description of the abstract search algorithm will appear at this IETF standard: `plrsearch draft`_. Motivation ----------- +`````````` Network providers are interested in throughput a device can sustain. @@ -27,7 +27,7 @@ actually is. We need another algorithm, which takes this indeterminism into account. Model ------ +````` Each algorithm searches for an answer to a precisely formulated question. When the question involves indeterministic systems, it has to @@ -55,7 +55,7 @@ generator detects duplicate packets on receive detection, and reports this as an error. Poisson distribution --------------------- +```````````````````` For given offered load, number of packets lost during trial measurement is assumed to come from `Poisson distribution`_, @@ -112,7 +112,7 @@ in an attempt to speed up convergence, but such methods tend to be scpecific for a particular system under tests. Fitting function coefficients distribution ------------------------------------------- +`````````````````````````````````````````` To accomodate systems with different behaviours, the fitting function is expected to have few numeric parameters affecting its shape (mainly @@ -136,7 +136,7 @@ stdev should refer to the equilibrium estimates based on the trend, not to immediate posterior values. Integration ------------ +``````````` The posterior distributions for fitting function parameters will not be integrable in general. @@ -146,7 +146,7 @@ time, so this time can be used for numeric integration (using suitable method, such as Monte Carlo) to achieve sufficient precision. Optimizations -------------- +````````````` After enough trials, the posterior distribution will be concentrated in a narrow area of parameter space. The integration method should take @@ -169,7 +169,7 @@ it only cares about overall counts of packets sent and packets lost), but debug output from traffic generator lists unidirectional values. Measurement delay ------------------ +````````````````` In a sample implemenation in FD.io CSIT project, there is roughly 0.5 second delay between trials due to restrictons imposed by packet traffic @@ -191,7 +191,7 @@ trial durations with configurable coefficients each subsequent trial being 0.2 second longer). Rounding errors and underflows ------------------------------- +`````````````````````````````` In order to avoid them, the current implementation tracks natural logarithm (instead of the original quantity) for any quantity which is never negative. @@ -202,7 +202,7 @@ Specific functions for frequent operations are defined to handle None correctly. Fitting functions ------------------ +````````````````` Current implementation uses two fitting functions. In general, their estimates for critical rate differ, @@ -257,7 +257,7 @@ During each measurement, two estimates are computed, even though only one (in alternating order) is used for next offered load. Stretch function -________________ +---------------- The original function (before applying logarithm) is primitive function to `logistic function`_. @@ -265,13 +265,13 @@ The name "stretch" is used for related function in context of neural networks with sigmoid activation function. Erf function -____________ +------------ The original function is double primitive function to `Gaussian function`_. The name "erf" comes from error function, the first primitive to Gaussian. Prior distributions -------------------- +``````````````````` The numeric integrator expects all the parameters to be distributed (independently and) uniformly on an interval (-1, 1). @@ -288,7 +288,7 @@ raised to a random power between zero and one; thus it follows a `reciprocal distribution`_. Integrator ----------- +`````````` After few measurements, the posterior distribution of fitting function arguments gets quite concentrated into a small area. @@ -314,17 +314,17 @@ is concentrated at. The second phase (dominated by whole sample population) is actually relevant for the critical rate estimation. Caveats -------- +``````` Current implementation does not constrict the critical rate (as computed for every sample) to the min_rate, max_rate interval. Earlier implementations were targeting loss rate (as opposed to loss ratio). -The chosen fitting functions do not even allow arbitrarily low loss ratios, -especially if the "spread" value is high enough (relative to "mrr" value). +The chosen fitting functions do allow arbitrarily low loss ratios, +but may suffer from rounding errors in corresponding parameter regions. Internal loss rate target is computed from given loss ratio -using the current trial offered load, which increases search instability -if measurements with surprisingly high loss count appear. +using the current trial offered load, which increases search instability, +especially if measurements with surprisingly high loss count appear. As high loss count measurements add many bits of information, they need a large amount of small loss count measurements to balance them, @@ -338,9 +338,67 @@ as the observed trends have varied characteristics. Probably, using a more realistic fitting functions will give better estimates than trend analysis. -.. TODO: Add a 1901 result section when results are available. +Graphical examples +`````````````````` + +The following pictures show the upper and lower bound (one sigma) +on estimated critical rate, as computed by PLRsearch, after each trial measurement +within the 30 minute duration of a test run. + +Both graphs are focusing on later estimates. Estimates computed from +few initial measurements are wildly off the y-axis range shown. + +L2 patch +-------- + +This test case shows quite narrow critical region. Both fitting functions +give similar estimates, the graph shows the randomness of measurements, +and a trend. Both fitting functions seem to be somewhat overestimating +the critical rate. The final estimated interval is too narrow, +a longer run would report estimates somewhat bellow the current lower bound. + +.. only:: latex + + .. raw:: latex + + \begin{figure}[H] + \centering + \graphicspath{{../_tmp/src/introduction/}} + \includegraphics[width=0.90\textwidth]{PLR_patch} + \label{fig:PLR_patch} + \end{figure} + +.. only:: html + + .. figure:: PLR_patch.svg + :alt: PLR_patch + :align: center + +Vhost +----- + +This test case shows quite broad critical region. Fitting functions give +fairly differing estimates. One overestimates, the other underestimates. +The graph mostly shows later measurements slowly bringing the estimates +towards each other. The final estimated interval is too broad, +a longer run would return a smaller interval within the current one. + +.. only:: latex + + .. raw:: latex + + \begin{figure}[H] + \centering + \graphicspath{{../_tmp/src/introduction/}} + \includegraphics[width=0.90\textwidth]{PLR_vhost} + \label{fig:PLR_vhost} + \end{figure} + +.. only:: html -.. TODO: Add a graph of time evolution when 1901 run is available. + .. figure:: PLR_vhost.svg + :alt: PLR_vhost + :align: center .. _plrsearch draft: https://tools.ietf.org/html/draft-vpolak-bmwg-plrsearch-00 .. _RFC 2544: https://tools.ietf.org/html/rfc2544