Unify characters marking section levels,
(at least for methodology Vratko contributed to):
Level 0: ==== Do not use, or use for index.rst only.
(Because git conflicts also create ====.)
Level 1: ^^^^
Level 2: ~~~~
Level 3: ````
Level 4: ____
Level 5: ---- Do not use.
(Because other documents use this as level 0,
and it also appears in tables.)
Change-Id: I10813f718b2ee34d1e34c58e62e88353000340e9
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
16 files changed:
sources, listed below.
Git Suites
The suites present in the git repository act as templates for generating suites.
One of the autogen design principles is that any template suite should also act
the same content, it is one of checks that autogen works correctly.
Regenerate Script
Not all suites present in the CSIT git repository act as templates for autogen.
The distinction is on a per-directory level. Directories with
(protocol "ip4" is the default, leading to 64B frame size).
Constants
Values in Constants.py are taken into consideration when generating suites.
The values are mostly related to different NIC models and NIC drivers.
Python Code
Python code in resources/libraries/python/autogen contains several other
information sources.
but do not affect the suites generated by autogen.
Testbeds
Overall, no information visible in topology yaml files is taken into account
by autogen.
Robot tag marks the difference, but the link presence is not explicitly checked.
Job specs
Information in job spec files depends on generated suites (not the other way).
Autogen should generate more suites, as job spec is limited by time budget.
so autogen covers that.
Bootstrap Scripts
Historically, bootstrap scripts perform some logic,
perhaps adding exclusion options to Robot invocation
.. _data_plane_throughput:
Data Plane Throughput Tests
----------------------------
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
Network data plane throughput is measured using multiple test methods in
order to obtain representative and repeatable results across the large
shared by all methods.
MLRsearch Tests
Multiple Loss Ratio search (MLRsearch) tests discover multiple packet
throughput rates in a single search, reducing the overall test execution
MLRsearch tests are run to discover NDR and PDR rates for each VPP and
DPDK release covered by CSIT report. Results for small frame sizes
See :ref:`mlrsearch_algorithm` section for more detail. MLRsearch is
being standardized in IETF in `draft-ietf-bmwg-mlrsearch
<https://datatracker.ietf.org/doc/html/draft-ietf-bmwg-mlrsearch-01>`_.
MRR Tests
Maximum Receive Rate (MRR) tests are complementary to MLRsearch tests,
as they provide a maximum “raw” throughput benchmark for development and
specified Ethernet frame size is set to the bi-directional link rate.
Usage
MRR tests are much faster than MLRsearch as they rely on a single trial
or a small set of trials with very short duration. It is this property
only (64b/78B, IMIX).
Details
See :ref:`mrr_throughput` section for more detail about MRR tests
configuration.
<https://s3-docs.fd.io/csit/master/trending/methodology/perpatch_performance_tests.html>`_.
PLRsearch Tests
Probabilistic Loss Ratio search (PLRsearch) tests discover a packet
throughput rate associated with configured Packet Loss Ratio (PLR)
nature, and not deterministic.
Usage
PLRsearch tests are run to discover a sustained throughput for PLR=10^-7
(close to NDR) for VPP release covered by CSIT report. Results for small
compared against NDR and PDR rates discovered with MLRsearch.
Details
See :ref:`plrsearch` methodology section for more detail. PLRsearch is
being standardized in IETF in `draft-vpolak-bmwg-plrsearch
<https://tools.ietf.org/html/draft-vpolak-bmwg-plrsearch>`_.
Generic Test Properties
All data plane throughput test methodologies share the following generic
properties:
.. _mlrsearch_algorithm:
MLRsearch Tests
.. _mrr_throughput:
MRR Throughput
Maximum Receive Rate (MRR) tests are complementary to MLRsearch tests,
as they provide a maximum "raw" throughput benchmark for development and
Motivation for PLRsearch
~~~~~~~~~~~~~~~~~~~~~~~~
of being standardized in the IETF Benchmarking Methodology Working Group (BMWG).
Terms
The rest of this page assumes the reader is familiar with the following terms
defined in the IETF draft:
reveals the quality is good (considering the measurement results).
L2 patch
Both fitting functions give similar estimates, the graph shows
"stochasticity" of measurements (estimates increase and decrease
This test case shows what looks like a quite broad estimation interval,
compared to other test cases with similarly looking zero loss frequencies.
The two graphs show the behavior of PLRsearch algorithm applied to soaking test
when some of PLRsearch assumptions do not hold:
-------------------------
+^^^^^^^^^^^^^^^^^^^^^^^^
This page discusses considerations for Device Under Test (DUT) state.
DUTs such as VPP require configuration, to be provided before the application
it wants to test, and manipulate the DUT state to achieve the intended impact.
Ramp-up trial
Tests aiming at sustained performance need to make sure DUT state is created.
We achieve this via a ramp-up trial, the specific purpose of which
Test fails if the state is not (completely) created.
State Reset
Tests aiming at ramp-up performance do not use ramp-up trial,
and they need to reset the DUT state before each trial measurement.
violating assumptions of search algorithms.
DUT versus protocol ramp-up
-___________________________
+```````````````````````````
There are at least three different causes for bandwidth possibly increasing
within a single measurement trial.
.. _nat44_methodology:
Network Address Translation IPv4 to IPv4
-----------------------------------------
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
NAT44 prefix bindings should be representative of target applications,
where a number of private IPv4 addresses from the range defined by
- Used in tests for up to 1 048 576 inside addresses (inside hosts).
NAT44 Session Scale
NAT44 session scale tested is governed by the following logic:
+---+---------+------------+
NAT44 Deterministic
NAT44det performance tests are using TRex STL (Stateless) API and traffic
profiles, similar to all other stateless packet forwarding tests like
TODO: Make traffic profile names resemble suite names more closely.
NAT44 Endpoint-Dependent
-^^^^^^^^^^^^^^^^^^^^^^^^
+````````````````````````
In order to exercise NAT44ed ability to translate based on both
source and destination address and port, the inside-to-outside traffic
- [mrr|ndrpdr|soak], bidirectional stateful tests MRR, NDRPDR, or SOAK.
Stateful traffic profiles
-^^^^^^^^^^^^^^^^^^^^^^^^^
+~~~~~~~~~~~~~~~~~~~~~~~~~
There are several important details which distinguish ASTF profiles
from stateless profiles.
General considerations
See TCP TPUT profile below.
UDP CPS
This profile uses a minimalistic transaction to verify NAT44ed session has been
created and it allows outside-to-inside traffic.
Transaction counts as successful when ipackets counter increases on client side.
TCP CPS
This profile uses a minimalistic transaction to verify NAT44ed session has been
created and it allows outside-to-inside traffic.
This profile uses a small transaction of "request-response" type,
with several packets simulating data payload.
but it leads to more stable results than alternatives.
TCP TPUT
This profile uses a small transaction of "request-response" type,
with some data amount to be transferred both ways.
the results are comparable to the old traffic profile.
Ip4base tests
Contrary to stateless traffic profiles, we do not have a simple limit
that would guarantee TRex is able to send traffic at specified load.
destination port) are altered.
Incremental Ordering
This case is simpler to implement and offers greater control.
It is possible to use increments other than 1.
Randomized Ordering
This case chooses each field value at random (from the allowed range).
In case of two fields, they are treated independently.
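The two orderings can be illustrated with a minimal Python sketch. This is not the CSIT or TRex implementation: the function names, the inclusive ranges, and the wrap-around back to the start of the range are all assumptions made for illustration.

```python
import random
from itertools import islice


def incremental_values(start, stop, step=1):
    """Yield start, start + step, ... and wrap to start once stop is exceeded.

    The wrap-around behavior is an assumption of this sketch.
    """
    value = start
    while True:
        yield value
        value += step
        if value > stop:
            value = start


def randomized_pairs(range_a, range_b, rng):
    """Yield pairs of field values, each drawn independently and uniformly
    from its own inclusive range (hypothetical helper)."""
    while True:
        yield rng.randint(*range_a), rng.randint(*range_b)


# Incremental ordering with the default increment of 1.
by_one = list(islice(incremental_values(0, 3), 6))
# Incremental ordering with an increment other than 1.
by_two = list(islice(incremental_values(10, 15, step=2), 4))
# Randomized ordering of two independent fields (for example address and port).
pairs = list(islice(randomized_pairs((0, 255), (1024, 65535), random.Random(0)), 5))
print(by_one, by_two, pairs)
```

The incremental generator gives full control over which values appear and in what order, while the randomized one only guarantees that each value stays within its allowed range.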
.. _latency_methodology:
Packet Latency
TRex Traffic Generator (TG) is used for measuring one-way latency in
2-Node and 3-Node physical testbed topologies. TRex integrates `High
changes which would decrease performance without a good reason.
Existing jobs
VPP is the only project currently using such jobs.
They are not started automatically; they must be triggered on demand.
2n-clx, 2n-dnv, 2n-tx2, 2n-zn2, 3n-dnv, 3n-tsh.
Test selection
..
TODO: Majority of this section is also useful for CSIT verify jobs. Move it somewhere.
to help users select the minimal set of test cases.
Verify cycles
When Gerrit schedules multiple jobs to run for the same patch set,
it waits until all runs are complete.
Only when 3n-icx job finishes, the user can trigger 2n-icx.
One comment many jobs
In the past, the CSIT code which parses for perftest trigger comments
was buggy, which led to bad behavior (as in selecting all performance tests,
to use just one trigger word per Gerrit comment, just to be safe.
Multiple test cases in run
-__________________________
+``````````````````````````
While Robot supports OR operator, it does not support parentheses,
so the OR operator is not very useful. It is recommended
See below for more concrete examples.
Suite tags
Traditionally, CSIT maintains broad Robot tags that can be used to select tests,
for details on existing tags, see
to a single test case within a suite.
Fully specified tag expressions
-_______________________________
+```````````````````````````````
Here is one template to select a single test case:
{test_type}AND{nic_model}AND{nic_driver}AND{cores}AND{frame_size}AND{suite_tag}
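As a minimal sketch, the template above can be filled in with Python string formatting. The helper name ``build_tag_expression`` is hypothetical, and the ``drv_vfio_pci`` and ``ethip4-ip4base`` values are illustrative assumptions, not values prescribed by this page.

```python
# Template copied from the text; the field names are its placeholders.
TEMPLATE = (
    "{test_type}AND{nic_model}AND{nic_driver}"
    "AND{cores}AND{frame_size}AND{suite_tag}"
)


def build_tag_expression(
    test_type, nic_model, nic_driver, cores, frame_size, suite_tag
):
    """Return a fully specified tag expression (hypothetical helper)."""
    return TEMPLATE.format(
        test_type=test_type,
        nic_model=nic_model,
        nic_driver=nic_driver,
        cores=cores,
        frame_size=frame_size,
        suite_tag=suite_tag,
    )


expression = build_tag_expression(
    test_type="mrr",
    nic_model="nic_intel-x710",
    nic_driver="drv_vfio_pci",  # assumed driver tag, for illustration only
    cores="1c",
    frame_size="64b",
    suite_tag="ethip4-ip4base",  # assumed suite tag, for illustration only
)
print(expression)
```

Building the expression from named fields makes it harder to forget one of the six parts, which is what would cause the expression to match more than one test case.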
TODO: Explain why "periodic" job spec link lands at report_iterative.
Shortening triggers
Advanced users may use the following tricks to avoid writing long trigger comments.
which does not support RDMA core driver.
Complete example
A user wants to test a VPP change which may affect load balance with bonding.
Searching tag documentation for "bonding" finds LBOND tag and its variants.
perftest-3n-icx mrrANDnic_intel-x710AND1cAND64bAND?lbvpplacp-dot1q-l2xcbase-eth-2vhostvr1024-1vm*NOTdrv_af_xdp
Basic operation
The job builds VPP .deb packages for both the patch under test
(called "current") and its parent patch (called "parent").
or if any test was declared a regression.
Temporary specifics
The Minimum Description Length analysis is performed by
CSIT code equivalent to jumpavg-0.1.3 library available on PyPI.
probability of false positives.
Console output
The following information is visible towards the end of Jenkins console output,
repeated for each analyzed test.
.. _reconf_tests:
Reconfiguration Tests
~~~~~~~~~~~~~~~~~~~~~~
Partitioning into groups
-------------------------
+````````````````````````
While sometimes the samples within a group are far from being distributed
normally, currently we do not have a better tractable model.
The group boundaries are selected based on `Minimum Description Length`_.
Minimum Description Length
---------------------------
+``````````````````````````
`Minimum Description Length`_ (MDL) is a particular formalization
of `Occam's razor`_ principle.
as they motivate our choice of trend compliance metrics.
Sample time and analysis time
------------------------------
+`````````````````````````````
But first we need to distinguish two roles time plays in analysis,
so it is more clear which role we are referring to.
from the later analysis time results shown in dashboard and graphs.
Ordinary regression
The real performance changes from previously stable value
into a new stable value.
Ordinary progressions are detected in the same way.
Small regression
The real performance changes from previously stable value
into a new stable value, but the difference is small.
Small progressions have the same behavior.
Reverted regression
This pattern can have two different causes.
We would like to distinguish them, but that is usually
almost never happens.
Summary
There is a trade-off between detecting small regressions
and not reporting the same old regressions for a long time.
Separate tables are generated for each testbed.
Regressions and progressions
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
These tables list tests which encountered a regression or progression during the
specified time period, which is currently set to the last 21 days.
TRex is primarily used in two (mutually incompatible) modes.
Stateless mode
Sometimes abbreviated as STL.
A mode with high performance, which is unable to react to incoming traffic.
(opackets, ipackets) for each traffic direction.
Stateful mode
A mode capable of reacting to incoming traffic.
Contrary to the stateless mode, only UDP and TCP are supported
Both modes support both continuities in principle.
Continuous traffic
Traffic is started without any data size goal.
Traffic is ended based on time duration, as hinted by search algorithm.
The default for stateless mode.
Limited traffic
Traffic has defined data size goal (given as number of transactions),
duration is computed based on this goal.
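A rough sketch of how a duration could follow from a transaction goal is shown below. It assumes the simplest possible relation, duration equals the number of transactions divided by the target transactions per second; the function name is hypothetical and the actual CSIT computation may include further terms that this sketch ignores.

```python
def limited_traffic_duration(transaction_count, tps):
    """Return the trial duration in seconds needed to attempt
    transaction_count transactions at a steady rate of tps.

    Simplifying assumption: no startup delay and no stretching.
    """
    if tps <= 0:
        raise ValueError("tps must be positive")
    return transaction_count / tps


# One million transactions at 250 000 transactions per second.
duration = limited_traffic_duration(1_000_000, 250_000.0)
print(duration)
```

With these illustrative inputs the computed duration is 4.0 seconds.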
or asynchronously (test operates during traffic and stops traffic explicitly).
Synchronous traffic
Trial measurement is driven by given (or precomputed) duration,
no activity from test driver during the traffic.
Used for most trials.
Asynchronous traffic
Traffic is started, but then the test driver is free to perform
other actions, before stopping the traffic explicitly.
so CSIT defines some terms to use instead of mode-specific TRex terms.
Transactions
TRex traffic profile defines a small number of behaviors,
in CSIT called transaction templates. Traffic profiles also instruct
bidirectional stateless profiles define two transaction templates.
TPS multiplier
TRex aims to open transactions specified by the profile at a steady rate.
While TRex allows the transaction template to define its intended "cps" value,
as a unidirectional input value.
Duration stretching
TRex can be IO-bound, CPU-bound, or have any other reason
why it is not able to generate the traffic at the requested TPS.
If the results are very similar, it is probable TRex was the bottleneck.
Startup delay
By investigating TRex behavior, it was found that TRex does not start
the traffic in ASTF mode immediately. There is a delay of zero traffic,