Overview
========
-Tested Physical Topologies
---------------------------
-
-CSIT DPDK performance tests are executed on physical baremetal servers hosted
-by LF FD.io project. Testbed physical topology is shown in the figure below.
-
-::
-
- +------------------------+ +------------------------+
- | | | |
- | +------------------+ | | +------------------+ |
- | | | | | | | |
- | | <-----------------> | |
- | | DUT1 | | | | DUT2 | |
- | +--^---------------+ | | +---------------^--+ |
- | | | | | |
- | | SUT1 | | SUT2 | |
- +------------------------+ +------------------^-----+
- | |
- | |
- | +-----------+ |
- | | | |
- +------------------> TG <------------------+
- | |
- +-----------+
-
-SUT1 and SUT2 are two System Under Test servers (currently Cisco UCS C240,
-each with two Intel XEON CPUs), TG is a Traffic Generator (TG, currently
-another Cisco UCS C240, with two Intel XEON CPUs). SUTs run Testpmd/L3FWD SW
-application in Linux user-mode as a Device Under Test (DUT). TG runs TRex SW
-application as a packet Traffic Generator. Physical connectivity between SUTs
-and to TG is provided using direct links (no L2 switches) connecting different
-NIC models that need to be tested for performance. Currently installed and
-tested NIC models include:
-
-#. 2port10GE X520-DA2 Intel.
-#. 2port10GE X710 Intel.
-#. 2port10GE VIC1227 Cisco.
-#. 2port40GE VIC1385 Cisco.
-#. 2port40GE XL710 Intel.
-
-For detailed LF FD.io test bed specification and physical topology please refer
-to `LF FDio CSIT testbed wiki page <https://wiki.fd.io/view/CSIT/CSIT_LF_testbed>`_.
-
-Performance Tests Coverage
---------------------------
-
-Performance tests are split into the two main categories:
-
-- Throughput discovery - discovery of packet forwarding rate using binary search
- in accordance with RFC2544.
-
- - NDR - discovery of Non Drop Rate packet throughput, at zero packet loss;
- followed by packet one-way latency measurements at 10%, 50% and 100% of
- discovered NDR throughput.
- - PDR - discovery of Partial Drop Rate, with specified non-zero packet loss
- currently set to 0.5%; followed by packet one-way latency measurements at
- 100% of discovered PDR throughput.
+DPDK performance test results are reported for all three physical
+testbed types present in FD.io labs, 3-Node Xeon Haswell (3n-hsw),
+3-Node Xeon Skylake (3n-skx) and 2-Node Xeon Skylake (2n-skx), and for
+the installed NIC models. For a description of the physical testbeds used
+for DPDK performance tests, please refer to :ref:`tested_physical_topologies`.
-- Throughput verification - verification of packet forwarding rate against
- previously discovered NDR throughput. These tests are currently done against
- 0.9 of reference NDR, with reference rates updated periodically.
+Logical Topologies
+------------------
-CSIT |release| includes following performance test suites, listed per NIC type:
+CSIT DPDK performance tests are executed on physical testbeds described
+in :ref:`tested_physical_topologies`. Based on the packet path through
+server SUTs, one distinct logical topology type is used for DPDK DUT
+data plane testing: NIC-to-NIC switching topology.
-- 2port10GE X520-DA2 Intel
+NIC-to-NIC Switching
+~~~~~~~~~~~~~~~~~~~~
- - **L2IntLoop** - L2 Interface Loop forwarding any Ethernet frames between
- two Interfaces.
+The simplest logical topology for a software data plane application like
+DPDK is NIC-to-NIC switching. The tested topologies for 2-Node and 3-Node
+testbeds are shown in the figures below.
-- 2port40GE XL710 Intel
+.. only:: latex
- - **L2IntLoop** - L2 Interface Loop forwarding any Ethernet frames between
- two Interfaces.
+ .. raw:: latex
-- 2port10GE X520-DA2 Intel
+ \begin{figure}[H]
+ \centering
+ \graphicspath{{../_tmp/src/vpp_performance_tests/}}
+ \includegraphics[width=0.90\textwidth]{logical-2n-nic2nic}
+ \label{fig:logical-2n-nic2nic}
+ \end{figure}
- - **IPv4 Routed Forwarding** - L3 IP forwarding of Ethernet frames between
- two Interfaces.
+.. only:: html
-Execution of performance tests takes time, especially the throughput discovery
-tests. Due to limited HW testbed resources available within FD.io labs hosted
-by Linux Foundation, the number of tests for NICs other than X520 (a.k.a.
-Niantic) has been limited to few baseline tests. Over time we expect the HW
-testbed resources to grow, and will be adding complete set of performance
-tests for all models of hardware to be executed regularly and(or)
-continuously.
+ .. figure:: ../vpp_performance_tests/logical-2n-nic2nic.svg
+ :alt: logical-2n-nic2nic
+ :align: center
-Methodology: Multi-Thread and Multi-Core
-----------------------------------------
-**HyperThreading** - CSIT |release| performance tests are executed with SUT
-servers' Intel XEON CPUs configured in HyperThreading Disabled mode (BIOS
-settings). This is the simplest configuration used to establish baseline
-single-thread single-core SW packet processing and forwarding performance.
-Subsequent releases of CSIT will add performance tests with Intel
-HyperThreading Enabled (requires BIOS settings change and hard reboot).
+.. only:: latex
-**Multi-core Test** - CSIT |release| multi-core tests are executed in the
-following thread and core configurations:
+ .. raw:: latex
-#. 1t1c - 1 pmd thread on 1 CPU physical core.
-#. 2t2c - 2 pmd threads on 2 CPU physical cores.
+ \begin{figure}[H]
+ \centering
+ \graphicspath{{../_tmp/src/vpp_performance_tests/}}
+ \includegraphics[width=0.90\textwidth]{logical-3n-nic2nic}
+ \label{fig:logical-3n-nic2nic}
+ \end{figure}
-Note that in many tests running Testpmd/L3FWD reaches tested NIC I/O bandwidth
-or packets-per-second limit.
+.. only:: html
-Methodology: Packet Throughput
-------------------------------
+ .. figure:: ../vpp_performance_tests/logical-3n-nic2nic.svg
+ :alt: logical-3n-nic2nic
+ :align: center
-Following values are measured and reported for packet throughput tests:
+Server Systems Under Test (SUT) run the DPDK Testpmd or L3fwd application
+in Linux user-mode as a Device Under Test (DUT). The server Traffic
+Generator (TG) runs the T-Rex application. Physical connectivity between
+SUTs and TG is provided using different drivers and NIC models that need
+to be tested for performance (packet/bandwidth throughput and latency).
-- NDR binary search per RFC2544:
+From SUT and DUT perspectives, all performance tests involve forwarding
+packets between two physical Ethernet ports (10GE, 25GE, 40GE, 100GE).
+In most cases both physical ports on the SUT are located on the same
+NIC. The only exceptions are link bonding and 100GE tests. In the latter
+case only one port per NIC can be driven at line rate due to PCIe Gen3
+x16 slot bandwidth limitations. 100GE NICs are not supported in PCIe Gen3
+x8 slots.
- - Packet rate: "RATE: <aggregate packet rate in packets-per-second> pps
- (2x <per direction packets-per-second>)"
- - Aggregate bandwidth: "BANDWIDTH: <aggregate bandwidth in Gigabits per
- second> Gbps (untagged)"
+Note that reported DPDK DUT performance results are specific to the SUTs
+tested. SUTs with processors other than the ones used in the FD.io lab are
+likely to yield different results. A good rule of thumb for estimating
+DPDK packet throughput for the NIC-to-NIC switching topology is to
+expect the forwarding performance to be proportional to processor core
+frequency for the same processor architecture, assuming the processor is
+the only limiting factor and all other SUT parameters are equivalent to
+the FD.io CSIT environment.
-- PDR binary search per RFC2544:
-
- - Packet rate: "RATE: <aggregate packet rate in packets-per-second> pps (2x
- <per direction packets-per-second>)"
- - Aggregate bandwidth: "BANDWIDTH: <aggregate bandwidth in Gigabits per
- second> Gbps (untagged)"
- - Packet loss tolerance: "LOSS_ACCEPTANCE <accepted percentage of packets
- lost at PDR rate>""
-
-- NDR and PDR are measured for the following L2 frame sizes:
-
- - IPv4: 64B, 1518B, 9000B.
-
-
-Methodology: Packet Latency
----------------------------
-
-TRex Traffic Generator (TG) is used for measuring latency of Testpmd DUTs.
-Reported latency values are measured using following methodology:
+Performance Tests Coverage
+--------------------------
-- Latency tests are performed at 10%, 50% of discovered NDR rate (non drop rate)
- for each NDR throughput test and packet size (except IMIX).
-- TG sends dedicated latency streams, one per direction, each at the rate of
- 10kpps at the prescribed packet size; these are sent in addition to the main
- load streams.
-- TG reports min/avg/max latency values per stream direction, hence two sets
- of latency values are reported per test case; future release of TRex is
- expected to report latency percentiles.
-- Reported latency values are aggregate across two SUTs due to three node
- topology used for all performance tests; for per SUT latency, reported value
- should be divided by two.
-- 1usec is the measurement accuracy advertised by TRex TG for the setup used in
- FD.io labs used by CSIT project.
-- TRex setup introduces an always-on error of about 2*2usec per latency flow -
- additonal Tx/Rx interface latency induced by TRex SW writing and reading
- packet timestamps on CPU cores without HW acceleration on NICs closer to the
- interface line.
+Performance tests measure the following metrics for tested DPDK DUT
+topologies and configurations:
+
+- Packet Throughput: measured in accordance with :rfc:`2544`, using
+ FD.io CSIT Multiple Loss Ratio search (MLRsearch), an optimized binary
+ search algorithm, producing throughput at different Packet Loss Ratio
+ (PLR) values:
+
+ - Non Drop Rate (NDR): packet throughput at PLR=0%.
+ - Partial Drop Rate (PDR): packet throughput at PLR=0.5%.
+
+- One-Way Packet Latency: measured at different offered packet loads:
+
+ - 100% of discovered NDR throughput.
+ - 100% of discovered PDR throughput.
+
+- Maximum Receive Rate (MRR): measured packet forwarding rate under the
+ maximum load offered by traffic generator over a set trial duration,
+ regardless of packet loss. Maximum load for specified Ethernet frame
+ size is set to the bi-directional link rate.
+
+|csit-release| includes the following DPDK Testpmd and L3fwd data plane
+functionality, performance tested across a range of NIC drivers and NIC
+models:
+
++-----------------------+----------------------------------------------+
+| Functionality | Description |
++=======================+==============================================+
+| L2IntLoop | L2 Interface Loop forwarding all Ethernet |
+| | frames between two Interfaces. |
++-----------------------+----------------------------------------------+
+| IPv4 Routed | Longest Prefix Match (LPM) L3 IPv4 |
+| Forwarding | forwarding of Ethernet frames between two |
+| | Interfaces, with two /8 prefixes in lookup |
+| | table. |
++-----------------------+----------------------------------------------+
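
The MRR description added above sets the maximum load to the bi-directional link rate for a given Ethernet frame size. As a rough illustration of what that means in packets per second, here is a minimal Python sketch. It is not CSIT code: the function names are hypothetical, and it assumes the frame size includes the FCS and that each frame carries 20 bytes of additional on-wire overhead (preamble, start-of-frame delimiter and inter-frame gap), as in standard Ethernet.

```python
# Illustrative sketch only; names are hypothetical and not part of CSIT.
# Assumption: frame size includes the 4 B FCS, and every frame costs an
# extra 20 B on the wire (7 B preamble + 1 B SFD + 12 B inter-frame gap).
ETHERNET_WIRE_OVERHEAD_BYTES = 20


def linerate_pps(link_gbps, frame_size_bytes):
    """Return the maximum frame rate (pps) of one link direction."""
    bits_per_frame = (frame_size_bytes + ETHERNET_WIRE_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_per_frame


def bidirectional_linerate_pps(link_gbps, frame_size_bytes):
    """Return the aggregate frame rate (pps) when both directions run at line rate."""
    return 2 * linerate_pps(link_gbps, frame_size_bytes)


if __name__ == "__main__":
    for link_gbps, frame_size in ((10, 64), (10, 1518), (40, 64)):
        rate = bidirectional_linerate_pps(link_gbps, frame_size)
        print(f"{link_gbps}GE, {frame_size}B: {rate / 1e6:.2f} Mpps bi-directional")
```

Under these assumptions a 10GE link carries about 14.88 Mpps of 64B frames per direction, so the corresponding bi-directional maximum load is about 29.76 Mpps.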