From cddac498bafc7a6092dade5e183e5c7a95cff64d Mon Sep 17 00:00:00 2001
From: Maciek Konstantynowicz
Date: Mon, 19 Feb 2018 14:16:30 +0000
Subject: [PATCH 1/1] rls1801 report: edits to static content for vpp and dpdk
 perf sections.

Change-Id: I22a38d2704b3a414798823c1846ff12f8f69d7b7
Signed-off-by: Maciek Konstantynowicz
---
 .../dpdk_performance_tests/csit_release_notes.rst  |   2 +-
 .../vpp_performance_tests/csit_release_notes.rst   |  72 +++++----
 docs/report/vpp_performance_tests/overview.rst     | 164 ++++++++++++---------
 3 files changed, 143 insertions(+), 95 deletions(-)

diff --git a/docs/report/dpdk_performance_tests/csit_release_notes.rst b/docs/report/dpdk_performance_tests/csit_release_notes.rst
index 9c5c3d7ed7..413c7c3cea 100644
--- a/docs/report/dpdk_performance_tests/csit_release_notes.rst
+++ b/docs/report/dpdk_performance_tests/csit_release_notes.rst
@@ -14,7 +14,7 @@ Here is the list of known issues in CSIT |release| for Testpmd performance tests
 +---+---------------------------------------------------+------------+-----------------------------------------------------------------+
 | # | Issue                                             | Jira ID    | Description                                                     |
 +---+---------------------------------------------------+------------+-----------------------------------------------------------------+
-| 1 | Testpmd in 1t1c and 2t2c setups - large variation | CSIT-568   | Suspected NIC firmware or DPDK driver issue affecting NDR       |
+| 1 | Testpmd in 1t1c and 2t2c setups - large variation | CSIT-569   | Suspected NIC firmware or DPDK driver issue affecting NDR       |
 |   | of discovered NDR throughput values across        |            | throughput. Applies to XL710 and X710 NICs, no issues observed  |
 |   | multiple test runs with xl710 and x710 NICs.      |            | on x520 NICs.                                                   |
 +---+---------------------------------------------------+------------+-----------------------------------------------------------------+
diff --git a/docs/report/vpp_performance_tests/csit_release_notes.rst b/docs/report/vpp_performance_tests/csit_release_notes.rst
index 754abc0d13..17003bc85a 100644
--- a/docs/report/vpp_performance_tests/csit_release_notes.rst
+++ b/docs/report/vpp_performance_tests/csit_release_notes.rst
@@ -6,32 +6,32 @@ Changes in CSIT |release|

 #. Added VPP performance tests

-   - **Container Topologies Orchestrated by K8s with VPP memif tests**
-
-     - Added tests with VPP in L2 Cross-Connect and Bridge-Domain
-       configurations containers, with service chain topologies orchestrated by
-       Kubernetes. Added following forwarding topologies: i) "Parallel" with
-       packets flowing from NIC via VPP to container and back to VPP and NIC;
-       ii) "Chained" a.k.a. "Snake" with packets flowing via VPP to container,
-       back to VPP, to next container, back to VPP and so on until the last
-       container in chain, then back to VPP and NIC; iii) "Horizontal" with
-       packets flowing via VPP to container, then via "horizontal" memif to
-       next container, and so on until the last container, then back to VPP and
-       NIC;.
+   - **Container Service Chain Topologies Orchestrated by K8s with VPP Memif**
+
+     - Added tests with VPP vswitch in container connecting a number of VPP-
+       in-container service chain topologies with L2 Cross-Connect and L2
+       Bridge-Domain configurations, orchestrated by Kubernetes. Added the
+       following forwarding topologies: i) "Parallel" with packets flowing from
+       NIC via VPP to container and back to VPP and NIC; ii) "Chained" (a.k.a.
+       "Snake") with packets flowing via VPP to container, back to VPP, to next
+       container, back to VPP and so on until the last container in a chain,
+       then back to VPP and NIC; iii) "Horizontal" with packets flowing via VPP
+       to container, then via "horizontal" memif to next container, and so on
+       until the last container, then back to VPP and NIC;

   - **VPP TCP/IP stack**

     - Added tests for VPP TCP/IP stack using VPP built-in HTTP server. WRK
       traffic generator is used as a client-side;

-   - **SRv6 tests**
+   - **SRv6**

     - Initial SRv6 (Segment Routing IPv6) tests verifying performance of
       IPv6 and SRH (Segment Routing Header) encapsulation, decapsulation,
       lookups and rewrites based on configured End and End.DX6 SRv6 egress
       functions;

-   - **IPSecSW tests**
+   - **IPSecSW**

     - SW computed IPSec encryption with AES-GCM, CBC-SHA1 ciphers, in
       combination with IPv4 routed-forwarding;

@@ -42,7 +42,7 @@ Changes in CSIT |release|
       VPP tests into Presentation and Analytics Layer (PAL) for automated
       CSIT test results analysis;

-#. Other improvements
+#. Other changes

   - **Framework optimizations**

@@ -50,15 +50,30 @@ Changes in CSIT |release|

     - Overall stability improvements;

+   - **NDR and PDR throughput binary search change**
+
+     - Increased binary search resolution by reducing the final step from
+       100kpps to 50kpps;
+
+   - **VPP plugins loaded as needed by tests**
+
+     - From this release, only plugins required by tests are loaded at
+       VPP initialization time; previously, all plugins were loaded for
+       all tests;
+
Performance Changes
-------------------

Relative performance changes in measured packet throughput in CSIT
|release| are calculated against the results from CSIT |release-1|
report. Listed mean and standard deviation values are computed based on
a series of the same tests executed against respective VPP releases to
verify test results repeatability, with the percentage change calculated
for mean values. Note that the standard deviation is quite high for a
small number of packet throughput tests, which indicates poor test
results repeatability and makes the relative change of the mean
throughput value not fully representative for these tests. The root
causes behind poor results repeatability vary between the test cases.
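For illustration, the comparison described above can be sketched in a
few lines of Python. This is an editorial sketch only, not part of the
CSIT framework, and the sample values are hypothetical::

    from statistics import mean, stdev

    # Hypothetical NDR throughput samples [Mpps] from repeated runs of
    # one test case against the previous and the current VPP release.
    previous = [9.9, 10.1, 10.0]
    current = [10.4, 10.6, 10.5]

    # Mean and standard deviation characterize result repeatability;
    # a stdev that is high relative to the mean signals poor
    # repeatability of the test results.
    cur_mean, cur_stdev = mean(current), stdev(current)

    # The relative change is calculated between the mean values only.
    change_pct = (cur_mean - mean(previous)) / mean(previous) * 100

    print(f"mean {cur_mean:.2f} Mpps, stdev {cur_stdev:.2f} Mpps, "
          f"change {change_pct:+.1f}%")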
NDR Throughput Changes
~~~~~~~~~~~~~~~~~~~~~~

@@ -97,13 +112,14 @@ Here is the list of known issues in CSIT |release| for VPP performance tests:
| 1 | Vic1385 and Vic1227 low performance.            | VPP-664    | Low NDR performance.                                            |
|   |                                                 |            |                                                                 |
+---+-------------------------------------------------+------------+-----------------------------------------------------------------+
-| 2 | Sporadic NDR discovery test failures on x520.   | CSIT-750   | Suspected issue with HW combination of X710-X520 in LF          |
-|   |                                                 |            | infrastructure. Issue can't be replicated outside LF.           |
+| 2 | Sporadic (1 in 200) NDR discovery test failures | CSIT-570   | DPDK reporting rx-errors, indicating L1 issue. Suspected issue  |
+|   | on x520.                                        |            | with HW combination of X710-X520 in LF testbeds. Not observed   |
+|   |                                                 |            | outside of LF testbeds.                                         |
+---+-------------------------------------------------+------------+-----------------------------------------------------------------+
-| 3 | VPP in 2t2c setups - large variation            | CSIT-568   | Suspected NIC firmware or DPDK driver issue affecting NDR       |
-|   | of discovered NDR throughput values across      |            | throughput. Applies to XL710 and X710 NICs, x520 NICs are fine. |
-|   | multiple test runs with xl710 and x710 NICs.    |            |                                                                 |
-+---+-------------------------------------------------+------------+-----------------------------------------------------------------+
-| 4 | Lower than expected NDR throughput with         | CSIT-569   | Suspected NIC firmware or DPDK driver issue affecting NDR and   |
+| 3 | Lower than expected NDR throughput with         | CSIT-571   | Suspected NIC firmware or DPDK driver issue affecting NDR and   |
|   | xl710 and x710 NICs, compared to x520 NICs.     |            | PDR throughput. Applies to XL710 and X710 NICs.                 |
+---+-------------------------------------------------+------------+-----------------------------------------------------------------+
+| 4 | QAT IPSec scale with 1000 tunnels (interfaces)  | VPP-1121   | VPP crashes during configuration of 1000 IPsec tunnels.         |
+|   | in 2t2c config, all tests are failing.          |            | 1t1c tests are not affected.                                    |
++---+-------------------------------------------------+------------+-----------------------------------------------------------------+

diff --git a/docs/report/vpp_performance_tests/overview.rst b/docs/report/vpp_performance_tests/overview.rst
index f243637a6f..86bea87c0b 100644
--- a/docs/report/vpp_performance_tests/overview.rst
+++ b/docs/report/vpp_performance_tests/overview.rst
@@ -10,23 +10,23 @@ CSIT VPP performance tests are executed on physical baremetal servers hosted by
:abbr:`LF (Linux Foundation)` FD.io project. Testbed physical topology is
shown in the figure below.::

    +------------------------+          +------------------------+
    |                        |          |                        |
    |  +------------------+  |          |  +------------------+  |
    |  |                  |  |          |  |                  |  |
    |  |                  <---------------->                  |  |
    |  |       DUT1       |  |          |  |       DUT2       |  |
    |  +--^---------------+  |          |  +---------------^--+  |
    |     |                  |          |                  |     |
    |     |            SUT1  |          |  SUT2            |     |
    +------------------------+          +------------------^-----+
          |                                                |
          |                                                |
          |                  +-----------+                 |
          |                  |           |                 |
          +------------------>    TG     <-----------------+
                             |           |
                             +-----------+

SUT1 and SUT2 are two System Under Test servers (Cisco UCS C240, each with two
Intel XEON CPUs), TG is a Traffic Generator (TG, another Cisco UCS C240, with

Going forward CSIT project will be looking to add more hardware into FD.io
performance labs to address larger scale multi-interface and multi-NIC
performance testing scenarios.

-For test cases that require DUT (VPP) to communicate with
-VirtualMachines (VMs) / Linux or Docker Containers (Ctrs) over
+For service chain topology test cases that require DUT (VPP) to communicate with
+VirtualMachines (VMs) or with Linux/Docker Containers (Ctrs) over
 vhost-user/memif interfaces, N of VM/Ctr instances are created on SUT1
-and SUT2. For N=1 DUT forwards packets between vhost/memif and physical
-interfaces. For N>1 DUT a logical service chain forwarding topology is
-created on DUT by applying L2 or IPv4/IPv6 configuration depending on
-the test suite. DUT test topology with N VM/Ctr instances is shown in
-the figure below including applicable packet flow thru the DUTs and
-VMs/Ctrs (marked in the figure with ``***``).::

    +-------------------------+           +-------------------------+
    | +---------+ +---------+ |           | +---------+ +---------+ |
    | |VM/Ctr[1]| |VM/Ctr[N]| |           | |VM/Ctr[1]| |VM/Ctr[N]| |
    | |  *****  | |  *****  | |           | |  *****  | |  *****  | |
    | +--^---^--+ +--^---^--+ |           | +--^---^--+ +--^---^--+ |
    |   *|   |*     *|   |*   |           |   *|   |*     *|   |*   |
    | +--v---v-------v---v--+ |           | +--v---v-------v---v--+ |
    | |  *   *       *   *  |*|***********|*|  *   *       *   *  | |
    | |  *   *********   ***<-|-----------|->***   *********   *  | |
    | |  *    DUT1          | |           | |          DUT2    *  | |
    | +--^------------------+ |           | +------------------^--+ |
    |   *|                    |           |                    |*   |
    |   *|       SUT1         |           |         SUT2       |*   |
    +-------------------------+           +-------------------------+
        *|                                                     |*
        *|                                                     |*
        *|                    +-----------+                    |*
        *|                    |           |                    |*
        *+-------------------->    TG     <--------------------+*
        **********************|           |**********************
                              +-----------+

-For VM/Ctr tests, packets are switched by DUT multiple times: twice for
-a single VM/Ctr, three times for two VMs/Ctrs, N+1 times for N VMs/Ctrs.
-Hence the external throughput rates measured by TG and listed in this
-report must be multiplied by (N+1) to represent the actual DUT aggregate
-packet forwarding rate.
+and SUT2. Three types of service chain topologies are tested in CSIT |release|:
+
+#. "Parallel" topology with packets flowing from NIC via DUT (VPP) to
+   VM/Container and back to DUT and NIC;
+
+#. "Chained" topology (a.k.a. "Snake") with packets flowing via DUT (VPP) to
+   VM/Container, back to DUT, then to the next VM/Container, back to DUT and
+   so on until the last VM/Container in a chain, then back to DUT and NIC;
+
+#. "Horizontal" topology with packets flowing via DUT (VPP) to Container,
+   then via "horizontal" memif to the next Container, and so on until the
+   last Container, then back to DUT and NIC. "Horizontal" topology is not
+   supported for VMs.
+
+For each of the above topologies, DUT (VPP) is tested in a range of L2
+or IPv4/IPv6 configurations depending on the test suite. A sample DUT
+"Chained" service topology with N of VM/Ctr instances is shown in the
+figure below. Packet flow thru the DUTs and VMs/Ctrs is marked with
+``***``::
+
    +-------------------------+           +-------------------------+
    | +---------+ +---------+ |           | +---------+ +---------+ |
    | |VM/Ctr[1]| |VM/Ctr[N]| |           | |VM/Ctr[1]| |VM/Ctr[N]| |
    | |  *****  | |  *****  | |           | |  *****  | |  *****  | |
    | +--^---^--+ +--^---^--+ |           | +--^---^--+ +--^---^--+ |
    |   *|   |*     *|   |*   |           |   *|   |*     *|   |*   |
    | +--v---v-------v---v--+ |           | +--v---v-------v---v--+ |
    | |  *   *       *   *  |*|***********|*|  *   *       *   *  | |
    | |  *   *********   ***<-|-----------|->***   *********   *  | |
    | |  *    DUT1          | |           | |          DUT2    *  | |
    | +--^------------------+ |           | +------------------^--+ |
    |   *|                    |           |                    |*   |
    |   *|       SUT1         |           |         SUT2       |*   |
    +-------------------------+           +-------------------------+
        *|                                                     |*
        *|                                                     |*
        *|                    +-----------+                    |*
        *|                    |           |                    |*
        *+-------------------->    TG     <--------------------+*
        **********************|           |**********************
                              +-----------+
+
+In the above "Chained" topology, packets are switched by DUT multiple
+times: twice for a single VM/Ctr, three times for two VMs/Ctrs, N+1
+times for N VMs/Ctrs. Hence the external throughput rates measured by TG
+and listed in this report must be multiplied by (N+1) to represent the
+actual DUT aggregate packet forwarding rate.
+
+For the "Parallel" and "Horizontal" service topologies, packets are
+always switched by DUT twice per service chain.
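The rate multipliers implied by these topologies can be captured in a
small sketch. This is illustrative Python only, not part of the CSIT
test framework::

    def dut_aggregate_rate(tg_rate_pps, topology, n):
        """Scale the TG-measured rate to the DUT aggregate rate.

        In "Chained" topology the DUT switches each packet N+1 times
        for N VMs/Ctrs; in "Parallel" and "Horizontal" topologies the
        DUT switches each packet twice, regardless of chain length.
        """
        if topology == "chained":
            return tg_rate_pps * (n + 1)
        if topology in ("parallel", "horizontal"):
            return tg_rate_pps * 2
        raise ValueError("unknown topology: " + topology)

    # E.g. 10 Mpps measured by TG with a chain of 3 VMs/Ctrs means the
    # DUT itself forwarded an aggregate of 40 Mpps.
    assert dut_aggregate_rate(10e6, "chained", 3) == 40e6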
Note that reported DUT (VPP) performance results are specific to the SUTs
tested. Current :abbr:`LF (Linux Foundation)` FD.io SUTs are based on Intel

@@ -162,8 +178,8 @@ CSIT |release| includes following performance test suites, listed per NIC type:
    number of users and ports per user.
  - **Container memif connections** - VPP memif virtual interface tests to
    interconnect VPP instances with L2XC and L2BD.
-  - **Container K8s Orchestrated Topologies** - Container topologies connected over
-    the memif virtual interface.
+  - **Container K8s Orchestrated Topologies** - Container topologies connected
+    over the memif virtual interface.
  - **SRv6** - Segment Routing IPv6 tests.

- 2port40GE XL710 Intel

@@ -236,11 +252,17 @@ following VPP thread and core configurations:

#. 1t1c - 1 VPP worker thread on 1 CPU physical core.
#. 2t2c - 2 VPP worker threads on 2 CPU physical cores.
+#. 4t4c - 4 VPP worker threads on 4 CPU physical cores.

-VPP worker threads are the data plane threads. VPP control thread is running on
-a separate non-isolated core together with other Linux processes. Note that in
-quite a few test cases running VPP workers on 2 physical cores hits the tested
-NIC I/O bandwidth or packets-per-second limit.
+VPP worker threads are the data plane threads. VPP control thread is
+running on a separate non-isolated core together with other Linux
+processes. Note that in quite a few test cases running VPP workers on 2
+or 4 physical cores hits the I/O bandwidth or packets-per-second limit
+of the tested NIC.
+
+Section :ref:`throughput_speedup_multi_core` includes a set of graphs
+illustrating packet throughput speedup when running VPP on multiple
+cores.

Methodology: Packet Throughput
------------------------------

Following values are measured and reported for packet throughput tests:

- NDR binary search per :rfc:`2544`:

  - Packet rate: "RATE: <aggregate packet rate in packets-per-second> pps
-    (2x <per direction packets-per-second>)"
+    (2x <per direction packets-per-second>)";
  - Aggregate bandwidth: "BANDWIDTH: <aggregate bandwidth in Gigabits per
-    second> Gbps (untagged)"
+    second> Gbps (untagged)";

- PDR binary search per :rfc:`2544`:

  - Packet rate: "RATE: <aggregate packet rate in packets-per-second> pps (2x
-    <per direction packets-per-second>)"
+    <per direction packets-per-second>)";
  - Aggregate bandwidth: "BANDWIDTH: <aggregate bandwidth in Gigabits per
-    second> Gbps (untagged)"
+    second> Gbps (untagged)";
  - Packet loss tolerance: "LOSS_ACCEPTANCE <accepted percentage of packets
-    lost at PDR rate>"
+    lost at PDR rate>";

- NDR and PDR are measured for the following L2 frame sizes:

-  - IPv4: 64B, IMIX_v4_1 (28x64B,16x570B,4x1518B), 1518B, 9000B.
-  - IPv6: 78B, 1518B, 9000B.
+  - IPv4: 64B, IMIX_v4_1 (28x64B,16x570B,4x1518B), 1518B, 9000B;
+  - IPv6: 78B, 1518B, 9000B;
+
+- NDR and PDR binary search resolution is determined by the final value of the
+  rate change, referred to as the final step:
+
+  - The final step is set to 50kpps for all NIC to NIC tests and all L2
+    frame sizes except 9000B (changed from 100kpps used in previous
+    releases).
+
+  - The final step is set to 10kpps for all remaining tests, including 9000B
+    and all vhost VM and memif Container tests.
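To make the search procedure above concrete, it can be sketched as
follows. This is illustrative Python only; ``measure`` stands in for a
traffic generator trial and is not a real CSIT API::

    def binary_search(measure, line_rate_pps, final_step_pps,
                      loss_tolerance=0.0):
        """Find the highest rate whose loss ratio is within tolerance.

        A zero loss_tolerance yields NDR; a non-zero tolerance (the
        LOSS_ACCEPTANCE value) yields PDR.
        """
        lo, hi = 0.0, float(line_rate_pps)
        # Halve the search interval until it is narrower than the
        # final step, which sets the resolution of the result.
        while hi - lo > final_step_pps:
            rate = (lo + hi) / 2
            if measure(rate) <= loss_tolerance:
                lo = rate  # trial passed, search higher rates
            else:
                hi = rate  # trial failed, search lower rates
        return lo

    # measure(rate) is assumed to run one trial at the offered rate and
    # return the observed loss ratio, e.g. for a 64B 10GE NIC-to-NIC
    # test with the 50kpps final step:
    # ndr_pps = binary_search(measure, 14.88e6, 50e3)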
All rates are reported from external Traffic Generator perspective.

--
2.16.6