sources, listed below.
Git Suites
-----------
+``````````
The suites present in git repository act as templates for generating suites.
One of autogen design principles is that any template suite should also act
the same content, it is one of checks that autogen works correctly.
Regenerate Script
------------------
+`````````````````
Not all suites present in CSIT git repository act as templates for autogen.
The distinction is on per-directory level. Directories with
(protocol "ip4" is the default, leading to 64B frame size).
Constants
----------
+`````````
Values in Constants.py are taken into consideration when generating suites.
The values are mostly related to different NIC models and NIC drivers.
Python Code
------------
+```````````
Python code in resources/libraries/python/autogen contains several other
information sources.
but do not affect the suites generated by autogen.
Testbeds
---------
+````````
Overall, no information visible in topology yaml files is taken into account
by autogen.
Robot tag marks the difference, but the link presence is not explicitly checked.
Job specs
----------
+`````````
Information in job spec files depends on generated suites (not the other way).
Autogen should generate more suites, as job spec is limited by time budget.
so autogen covers that.
Bootstrap Scripts
------------------
+`````````````````
Historically, bootstrap scripts perform some logic,
perhaps adding exclusion options to Robot invocation
Data Plane Throughput
-=====================
+---------------------
.. toctree::
.. _data_plane_throughput:
Data Plane Throughput Tests
----------------------------
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
Network data plane throughput is measured using multiple test methods in
order to obtain representative and repeatable results across the large
shared by all methods.
MLRsearch Tests
-^^^^^^^^^^^^^^^
+~~~~~~~~~~~~~~~
Description
-~~~~~~~~~~~
+```````````
Multiple Loss Ratio search (MLRsearch) tests discover multiple packet
throughput rates in a single search, reducing the overall test execution
:rfc:`2544`.
Usage
-~~~~~
+`````
MLRsearch tests are run to discover NDR and PDR rates for each VPP and
DPDK release covered by CSIT report. Results for small frame sizes
tables.
Details
-~~~~~~~
+```````
See :ref:`mlrsearch_algorithm` section for more detail. MLRsearch is
being standardized in IETF in `draft-ietf-bmwg-mlrsearch
<https://datatracker.ietf.org/doc/html/draft-ietf-bmwg-mlrsearch-01>`_.
MRR Tests
-^^^^^^^^^
+~~~~~~~~~
Description
-~~~~~~~~~~~
+```````````
Maximum Receive Rate (MRR) tests are complementary to MLRsearch tests,
as they provide a maximum “raw” throughput benchmark for development and
specified Ethernet frame size is set to the bi-directional link rate.
Usage
-~~~~~
+`````
MRR tests are much faster than MLRsearch as they rely on a single trial
or a small set of trials with very short duration. It is this property
only (64b/78B, IMIX).
Details
-~~~~~~~
+```````
See :ref:`mrr_throughput` section for more detail about MRR tests
configuration.
<https://s3-docs.fd.io/csit/master/trending/methodology/perpatch_performance_tests.html>`_.
PLRsearch Tests
-^^^^^^^^^^^^^^^
+~~~~~~~~~~~~~~~
Description
-~~~~~~~~~~~
+```````````
Probabilistic Loss Ratio search (PLRsearch) tests discover a packet
throughput rate associated with configured Packet Loss Ratio (PLR)
nature, and not deterministic.
Usage
-~~~~~
+`````
PLRsearch tests are run to discover a sustained throughput for PLR=10^-7
(close to NDR) for VPP release covered by CSIT report. Results for small
compared against NDR and PDR rates discovered with MLRsearch.
Details
-~~~~~~~
+```````
See :ref:`plrsearch` methodology section for more detail. PLRsearch is
being standardized in IETF in `draft-vpolak-bmwg-plrsearch
<https://tools.ietf.org/html/draft-vpolak-bmwg-plrsearch>`_.
Generic Test Properties
-^^^^^^^^^^^^^^^^^^^^^^^
+~~~~~~~~~~~~~~~~~~~~~~~
All data plane throughput test methodologies share the following generic
properties:
.. _mlrsearch_algorithm:
MLRsearch Tests
----------------
+^^^^^^^^^^^^^^^
Overview
~~~~~~~~
.. _mrr_throughput:
MRR Throughput
---------------
+^^^^^^^^^^^^^^
Maximum Receive Rate (MRR) tests are complementary to MLRsearch tests,
as they provide a maximum "raw" throughput benchmark for development and
.. _plrsearch:
PLRsearch
----------
+^^^^^^^^^
Motivation for PLRsearch
~~~~~~~~~~~~~~~~~~~~~~~~
of being standardized in the IETF Benchmarking Methodology Working Group (BMWG).
Terms
------
+`````
The rest of this page assumes the reader is familiar with the following terms
defined in the IETF draft:
reveals the quality is good (considering the measurement results).
L2 patch
---------
+________
Both fitting functions give similar estimates, the graph shows
"stochasticity" of measurements (estimates increase and decrease
:align: center
Vhost
------
+_____
This test case shows what looks like a quite broad estimation interval,
compared to other test cases with similarly looking zero loss frequencies.
:align: center
Summary
--------
+_______
The two graphs show the behavior of PLRsearch algorithm applied to soaking test
when some of PLRsearch assumptions do not hold:
DUT state considerations
-------------------------
+^^^^^^^^^^^^^^^^^^^^^^^^
This page discusses considerations for Device Under Test (DUT) state.
DUTs such as VPP require configuration, to be provided before the application
it wants to test, and manipulate the DUT state to achieve the intended impact.
Ramp-up trial
-_____________
+`````````````
Tests aiming at sustained performance need to make sure DUT state is created.
We achieve this via a ramp-up trial, specific purpose of which
Test fails if the state is not (completely) created.
State Reset
-___________
+```````````
Tests aiming at ramp-up performance do not use ramp-up trial,
and they need to reset the DUT state before each trial measurement.
violating assumptions of search algorithms.
DUT versus protocol ramp-up
-___________________________
+```````````````````````````
There are at least three different causes for bandwidth possibly increasing
within a single measurement trial.
.. _nat44_methodology:
Network Address Translation IPv4 to IPv4
-----------------------------------------
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
NAT44 Prefix Bindings
-^^^^^^^^^^^^^^^^^^^^^
+~~~~~~~~~~~~~~~~~~~~~
NAT44 prefix bindings should be representative of target applications,
where a number of private IPv4 addresses from the range defined by
- Used in tests for up to 1 048 576 inside addresses (inside hosts).
NAT44 Session Scale
-~~~~~~~~~~~~~~~~~~~
+```````````````````
NAT44 session scale tested is governed by the following logic:
+---+---------+------------+
NAT44 Deterministic
-^^^^^^^^^^^^^^^^^^^
+```````````````````
NAT44det performance tests are using TRex STL (Stateless) API and traffic
profiles, similar to all other stateless packet forwarding tests like
TODO: Make traffic profile names resemble suite names more closely.
NAT44 Endpoint-Dependent
-^^^^^^^^^^^^^^^^^^^^^^^^
+````````````````````````
In order to exercise NAT44ed ability to translate based on both
source and destination address and port, the inside-to-outside traffic
- [mrr|ndrpdr|soak], bidirectional stateful tests MRR, NDRPDR, or SOAK.
Stateful traffic profiles
-^^^^^^^^^^^^^^^^^^^^^^^^^
+~~~~~~~~~~~~~~~~~~~~~~~~~
There are several important details which distinguish ASTF profiles
from stateless profiles.
General considerations
-~~~~~~~~~~~~~~~~~~~~~~
+``````````````````````
Protocols
_________
See TCP TPUT profile below.
UDP CPS
-~~~~~~~
+```````
This profile uses a minimalistic transaction to verify NAT44ed session has been
created and it allows outside-to-inside traffic.
Transaction counts as successful when ipackets counter increases on client side.
TCP CPS
-~~~~~~~
+```````
This profile uses a minimalistic transaction to verify NAT44ed session has been
created and it allows outside-to-inside traffic.
on client side.
UDP TPUT
-~~~~~~~~
+````````
This profile uses a small transaction of "request-response" type,
with several packets simulating data payload.
but it leads to more stable results than alternatives.
TCP TPUT
-~~~~~~~~
+````````
This profile uses a small transaction of "request-response" type,
with some data amount to be transferred both ways.
the results are comparable to the old traffic profile.
Ip4base tests
-^^^^^^^^^^^^^
+~~~~~~~~~~~~~
Contrary to stateless traffic profiles, we do not have a simple limit
that would guarantee TRex is able to send traffic at specified load.
destination port) are altered.
Incremental Ordering
---------------------
+~~~~~~~~~~~~~~~~~~~~
This case is simpler to implement and offers greater control.
It is possible to use increments other than 1.
Randomized Ordering
--------------------
+~~~~~~~~~~~~~~~~~~~
This case chooses each field value at random (from the allowed range).
In case of two fields, they are treated independently.
.. _latency_methodology:
Packet Latency
---------------
+^^^^^^^^^^^^^^
TRex Traffic Generator (TG) is used for measuring one-way latency in
2-Node and 3-Node physical testbed topologies. TRex integrates `High
changes which would decrease performance without a good reason.
Existing jobs
-`````````````
+~~~~~~~~~~~~~
VPP is the only project currently using such jobs.
They are not started automatically; they must be triggered on demand.
2n-clx, 2n-dnv, 2n-tx2, 2n-zn2, 3n-dnv, 3n-tsh.
Test selection
---------------
+~~~~~~~~~~~~~~
..
TODO: Majority of this section is also useful for CSIT verify jobs. Move it somewhere.
to help users to select the minimal set of test cases.
Verify cycles
-_____________
+`````````````
When Gerrit schedules multiple jobs to run for the same patch set,
it waits until all runs are complete.
Only when 3n-icx job finishes, the user can trigger 2n-icx.
One comment many jobs
-_____________________
+`````````````````````
In the past, the CSIT code which parses for perftest trigger comments
was buggy, which led to bad behavior (as in selecting all performance tests,
to use just one trigger word per Gerrit comment, just to be safe.
Multiple test cases in run
-__________________________
+``````````````````````````
While Robot supports OR operator, it does not support parentheses,
so the OR operator is not very useful. It is recommended
See below for more concrete examples.
Suite tags
-__________
+``````````
Traditionally, CSIT maintains broad Robot tags that can be used to select tests,
for details on existing tags, see
to a single test case within a suite.
Fully specified tag expressions
-_______________________________
+```````````````````````````````
Here is one template to select a single test case:
{test_type}AND{nic_model}AND{nic_driver}AND{cores}AND{frame_size}AND{suite_tag}
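As a minimal sketch (not part of CSIT code), the template above can be filled
in by joining the six fields with the Robot AND operator. The helper name
``full_tag_expression`` is hypothetical, and the driver tag and suite tag
values used here are illustrative assumptions, so check them against the tag
documentation before using them in a trigger comment.

```python
def full_tag_expression(test_type, nic_model, nic_driver, cores,
                        frame_size, suite_tag):
    """Join the six template fields with Robot's AND operator.

    Hypothetical helper for illustration only; it does not validate
    that the tags actually exist in CSIT.
    """
    fields = [test_type, nic_model, nic_driver, cores, frame_size, suite_tag]
    # An empty field would produce "ANDAND", which is not a valid expression.
    if not all(fields):
        raise ValueError("every template field must be non-empty")
    return "AND".join(fields)


# Example values; the driver tag and suite tag are assumptions.
expression = full_tag_expression(
    test_type="mrr",
    nic_model="nic_intel-x710",
    nic_driver="drv_vfio_pci",
    cores="1c",
    frame_size="64b",
    suite_tag="ethip4-ip4base",
)
print(expression)
```

Keeping each field as a separate argument makes it harder to forget one of
them, which is the usual way a trigger ends up selecting more than one test
case.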
TODO: Explain why "periodic" job spec link lands at report_iterative.
Shortening triggers
-___________________
+```````````````````
Advanced users may use the following tricks to avoid writing long trigger comments.
which does not support RDMA core driver.
Complete example
-________________
+````````````````
A user wants to test a VPP change which may affect load balance with bonding.
Searching tag documentation for "bonding" finds LBOND tag and its variants.
perftest-3n-icx mrrANDnic_intel-x710AND1cAND64bAND?lbvpplacp-dot1q-l2xcbase-eth-2vhostvr1024-1vm*NOTdrv_af_xdp
Basic operation
-```````````````
+~~~~~~~~~~~~~~~
The job builds VPP .deb packages for both the patch under test
(called "current") and its parent patch (called "parent").
or if any test was declared a regression.
Temporary specifics
-```````````````````
+~~~~~~~~~~~~~~~~~~~
The Minimal Description Length analysis is performed by
CSIT code equivalent to jumpavg-0.1.3 library available on PyPI.
probability of false positives.
Console output
-``````````````
+~~~~~~~~~~~~~~
The following information is visible towards the end of Jenkins console output,
repeated for each analyzed test.
.. _reconf_tests:
Reconfiguration Tests
----------------------
+^^^^^^^^^^^^^^^^^^^^^
.. important::
Trending Methodology
-====================
+--------------------
.. toctree::
~~~~~~~~~~~~~~~~~~~~~~
Partitioning into groups
-------------------------
+````````````````````````
While sometimes the samples within a group are far from being distributed
normally, currently we do not have a better tractable model.
The group boundaries are selected based on `Minimum Description Length`_.
Minimum Description Length
---------------------------
+``````````````````````````
`Minimum Description Length`_ (MDL) is a particular formalization
of `Occam's razor`_ principle.
as they motivate our choice of trend compliance metrics.
Sample time and analysis time
------------------------------
+`````````````````````````````
But first we need to distinguish two roles time plays in analysis,
so it is more clear which role we are referring to.
from the later analysis time results shown in dashboard and graphs.
Ordinary regression
--------------------
+```````````````````
The real performance changes from previously stable value
into a new stable value.
Ordinary progressions are detected in the same way.
Small regression
-----------------
+````````````````
The real performance changes from previously stable value
into a new stable value, but the difference is small.
Small progressions have the same behavior.
Reverted regression
--------------------
+```````````````````
This pattern can have two different causes.
We would like to distinguish them, but that is usually
almost never happens.
Summary
--------
+```````
There is a trade-off between detecting small regressions
and not reporting the same old regressions for a long time.
Separate tables are generated for each testbed.
Regressions and progressions
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
These tables list tests which encountered a regression or progression during the
specified time period, which is currently set to the last 21 days.
TRex Traffic Generator
-----------------------
+^^^^^^^^^^^^^^^^^^^^^^
Usage
~~~~~
TRex is primarily used in two (mutually incompatible) modes.
Stateless mode
-______________
+``````````````
Sometimes abbreviated as STL.
A mode with high performance, which is unable to react to incoming traffic.
(opackets, ipackets) for each traffic direction.
Stateful mode
-_____________
+`````````````
A mode capable of reacting to incoming traffic.
Contrary to the stateless mode, only UDP and TCP are supported
Both modes support both continuities in principle.
Continuous traffic
-__________________
+``````````````````
Traffic is started without any data size goal.
Traffic is ended based on time duration, as hinted by search algorithm.
The default for stateless mode.
Limited traffic
-_______________
+```````````````
Traffic has defined data size goal (given as number of transactions),
duration is computed based on this goal.
or asynchronously (test operates during traffic and stops traffic explicitly).
Synchronous traffic
-___________________
+```````````````````
Trial measurement is driven by given (or precomputed) duration,
no activity from test driver during the traffic.
Used for most trials.
Asynchronous traffic
-____________________
+````````````````````
Traffic is started, but then the test driver is free to perform
other actions, before stopping the traffic explicitly.
so CSIT defines some terms to use instead of mode-specific TRex terms.
Transactions
-____________
+````````````
TRex traffic profile defines a small number of behaviors,
in CSIT called transaction templates. Traffic profiles also instruct
bidirectional stateless profiles define two transaction templates.
TPS multiplier
-______________
+``````````````
TRex aims to open transactions specified by the profile at a steady rate.
While TRex allows the transaction template to define its intended "cps" value,
as a unidirectional input value.
Duration stretching
-___________________
+```````````````````
TRex can be IO-bound, CPU-bound, or have any other reason
why it is not able to generate the traffic at the requested TPS.
If the results are very similar, it is probable TRex was the bottleneck.
Startup delay
-_____________
+`````````````
By investigating TRex behavior, it was found that TRex does not start
the traffic in ASTF mode immediately. There is a delay of zero traffic,