docs/report/vpp_performance_tests/overview.rst

Overview
========

For a description of the physical testbeds used for VPP performance
tests, please refer to :ref:`physical_testbeds`.

Logical Topologies
------------------

CSIT VPP performance tests are executed on physical testbeds described
in :ref:`physical_testbeds`. Based on the packet path through server SUTs,
three distinct logical topology types are used for VPP DUT data plane
testing:

#. NIC-to-NIC switching topologies.
#. VM service switching topologies.
#. Container service switching topologies.

NIC-to-NIC Switching
~~~~~~~~~~~~~~~~~~~~

The simplest logical topology for a software data plane application like
VPP is NIC-to-NIC switching. Tested topologies for 2-Node and 3-Node
testbeds are shown in the figures below.

.. only:: latex

    .. raw:: latex

        \begin{figure}[H]
        \centering
            \includesvg[width=0.90\textwidth]{../_tmp/src/vpp_performance_tests/logical-2n-nic2nic}
            \label{fig:logical-2n-nic2nic}
        \end{figure}

.. only:: html

    .. figure:: logical-2n-nic2nic.svg
        :alt: logical-2n-nic2nic
        :align: center


.. only:: latex

    .. raw:: latex

        \begin{figure}[H]
        \centering
            \includesvg[width=0.90\textwidth]{../_tmp/src/vpp_performance_tests/logical-3n-nic2nic}
            \label{fig:logical-3n-nic2nic}
        \end{figure}

.. only:: html

    .. figure:: logical-3n-nic2nic.svg
        :alt: logical-3n-nic2nic
        :align: center

Server Systems Under Test (SUTs) run the VPP application in Linux
user-mode as a Device Under Test (DUT). The server Traffic Generator (TG)
runs the T-Rex application. Physical connectivity between SUTs and TG is
provided using different drivers and NIC models that need to be tested
for performance (packet/bandwidth throughput and latency).

From SUT and DUT perspectives, all performance tests involve forwarding
packets between two (or more) physical Ethernet ports (10GE, 25GE, 40GE,
100GE). In most cases both physical ports on the SUT are located on the
same NIC. The only exceptions are link bonding and 100GE tests. In the
latter case only one port per NIC can be driven at line rate due to PCIe
Gen3 x16 slot bandwidth limitations. 100GE NICs are not supported in PCIe
Gen3 x8 slots.
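The slot bandwidth limitation can be checked with simple arithmetic. The
sketch below is illustrative only: it uses the nominal PCIe Gen3 figures
(8 GT/s per lane, 128b/130b encoding) and ignores PCIe protocol overhead,
so real usable bandwidth is somewhat lower. The function names are
hypothetical and not part of CSIT.

.. code-block:: python

    # Nominal PCIe Gen3 parameters (protocol overhead is ignored here).
    PCIE_GEN3_GT_PER_LANE = 8.0         # giga-transfers per second per lane
    PCIE_GEN3_ENCODING = 128.0 / 130.0  # 128b/130b line encoding efficiency


    def pcie_gen3_gbps(lanes):
        """Return raw per-direction slot bandwidth in Gbit/s."""
        return PCIE_GEN3_GT_PER_LANE * PCIE_GEN3_ENCODING * lanes


    def ports_at_line_rate(lanes, port_gbps):
        """Return how many ports of a given speed the slot can fully feed."""
        return int(pcie_gen3_gbps(lanes) // port_gbps)


    x16_ports = ports_at_line_rate(16, 100)  # 100GE ports on a Gen3 x16 slot
    x8_ports = ports_at_line_rate(8, 100)    # 100GE ports on a Gen3 x8 slot

With these nominal figures an x16 slot offers roughly 126 Gbit/s per
direction, enough for one 100GE port but not two, while an x8 slot
(roughly 63 Gbit/s) cannot sustain even one.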

Note that reported VPP DUT performance results are specific to the SUTs
tested. SUTs with processors other than the ones used in the FD.io lab
are likely to yield different results. A good rule of thumb for
estimating VPP packet throughput in the NIC-to-NIC switching topology is
to expect forwarding performance to be proportional to processor core
frequency for the same processor architecture, assuming the processor is
the only limiting factor and all other SUT parameters are equivalent to
the FD.io CSIT environment.
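As a minimal sketch of this rule of thumb, the helper below scales a
measured rate linearly with core frequency. The function name and the
sample numbers are hypothetical and are not CSIT results; the estimate
holds only under the assumptions stated above.

.. code-block:: python

    def estimate_throughput_mpps(measured_mpps, measured_ghz, target_ghz):
        """Scale a measured packet rate linearly with core frequency.

        Assumes the same processor architecture and that the processor is
        the only limiting factor.
        """
        if measured_ghz <= 0 or target_ghz <= 0:
            raise ValueError("core frequencies must be positive")
        return measured_mpps * target_ghz / measured_ghz


    # Hypothetical example: 10 Mpps measured on a 2.5 GHz core,
    # estimated for a 3.0 GHz core of the same architecture.
    estimate = estimate_throughput_mpps(10.0, 2.5, 3.0)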

VM Service Switching
~~~~~~~~~~~~~~~~~~~~

VM service switching topology test cases require VPP DUT to communicate
with Virtual Machines (VMs) over vhost-user virtual interfaces.

Two types of VM service topologies are tested in CSIT |release|:

#. "Parallel" topology with packets flowing within SUT from NIC(s) via
   VPP DUT to VM, back to VPP DUT, then out through NIC(s).

#. "Chained" topology (a.k.a. "Snake") with packets flowing within SUT
   from NIC(s) via VPP DUT to VM, back to VPP DUT, then to the next VM,
   back to VPP DUT and so on until the last VM in the chain, then back
   to VPP DUT and out through NIC(s).

For each of the above topologies, VPP DUT is tested in a range of L2
or IPv4/IPv6 configurations depending on the test suite. Sample VPP DUT
"Chained" VM service topologies for 2-Node and 3-Node testbeds, with
each SUT running N VM instances, are shown in the figures below.

.. only:: latex

    .. raw:: latex

        \begin{figure}[H]
        \centering
            \includesvg[width=0.90\textwidth]{../_tmp/src/vpp_performance_tests/logical-2n-vm-vhost}
            \label{fig:logical-2n-vm-vhost}
        \end{figure}

.. only:: html

    .. figure:: logical-2n-vm-vhost.svg
        :alt: logical-2n-vm-vhost
        :align: center


.. only:: latex

    .. raw:: latex

        \begin{figure}[H]
        \centering
            \includesvg[width=0.90\textwidth]{../_tmp/src/vpp_performance_tests/logical-3n-vm-vhost}
            \label{fig:logical-3n-vm-vhost}
        \end{figure}

.. only:: html

    .. figure:: logical-3n-vm-vhost.svg
        :alt: logical-3n-vm-vhost
        :align: center

In "Chained" VM topologies, packets are switched by VPP DUT multiple
times: twice for a single VM, three times for two VMs, N+1 times for N
VMs. Hence the external throughput rates measured by TG and listed in
this report must be multiplied by N+1 to represent the actual VPP DUT
aggregate packet forwarding rate.

For the "Parallel" service topology, packets are always switched twice
by VPP DUT per service chain.
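The conversion from the externally measured rate to the VPP DUT aggregate
forwarding rate can be sketched as follows. The function and parameter
names are hypothetical, and the sample rates are not CSIT results. For
the "Parallel" case the sketch assumes that each packet traverses exactly
one service chain, so the multiplier is 2.

.. code-block:: python

    def dut_aggregate_rate(external_rate, vm_count, topology="chained"):
        """Return the aggregate VPP DUT forwarding rate.

        external_rate: throughput measured by TG (any consistent unit).
        vm_count: number of VMs (N) in the service topology.
        topology: "chained" (N+1 switching passes) or "parallel"
                  (2 passes, assuming one chain per packet).
        """
        if vm_count < 1:
            raise ValueError("at least one VM is required")
        if topology == "chained":
            switch_count = vm_count + 1
        elif topology == "parallel":
            switch_count = 2
        else:
            raise ValueError("unknown topology: %s" % topology)
        return external_rate * switch_count


    # Hypothetical example: 2.0 Mpps measured by TG.
    chained_two_vms = dut_aggregate_rate(2.0, 2)               # 3 passes
    parallel_four_vms = dut_aggregate_rate(2.0, 4, "parallel")  # 2 passes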

Note that reported VPP DUT performance results are specific to the SUTs
tested. SUTs with processors other than the ones used in the FD.io lab
are likely to yield different results. As with the NIC-to-NIC switching
topology, one can expect the forwarding performance to be proportional
to processor core frequency for the same processor architecture,
assuming the processor is the only limiting factor. However, due to the
much higher dependency on intensive memory operations in VM service
chained topologies and the sensitivity to Linux scheduler settings and
behaviour, this estimate may not always be accurate enough.

Container Service Switching
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Container service switching topology test cases require VPP DUT to
communicate with Containers (Ctrs) over memif virtual interfaces.

Three types of Container service topologies are tested in CSIT |release|:

#. "Parallel" topology with packets flowing within SUT from NIC(s) via
   VPP DUT to Container, back to VPP DUT, then out through NIC(s).

#. "Chained" topology (a.k.a. "Snake") with packets flowing within SUT
   from NIC(s) via VPP DUT to Container, back to VPP DUT, then to the
   next Container, back to VPP DUT and so on until the last Container in
   the chain, then back to VPP DUT and out through NIC(s).

#. "Horizontal" topology with packets flowing within SUT from NIC(s) via
   VPP DUT to Container, then via "horizontal" memif to the next
   Container, and so on until the last Container, then back to VPP DUT
   and out through NIC(s).

For each of the above topologies, VPP DUT is tested in a range of L2
or IPv4/IPv6 configurations depending on the test suite. Sample VPP DUT
"Chained" Container service topologies for 2-Node and 3-Node testbeds,
with each SUT running N Container instances, are shown in the figures
below.

.. only:: latex

    .. raw:: latex

        \begin{figure}[H]
        \centering
            \includesvg[width=0.90\textwidth]{../_tmp/src/vpp_performance_tests/logical-2n-container-memif}
            \label{fig:logical-2n-container-memif}
        \end{figure}

.. only:: html

    .. figure:: logical-2n-container-memif.svg
        :alt: logical-2n-container-memif
        :align: center


.. only:: latex

    .. raw:: latex

        \begin{figure}[H]
        \centering
            \includesvg[width=0.90\textwidth]{../_tmp/src/vpp_performance_tests/logical-3n-container-memif}
            \label{fig:logical-3n-container-memif}
        \end{figure}

.. only:: html

    .. figure:: logical-3n-container-memif.svg
        :alt: logical-3n-container-memif
        :align: center

In "Chained" Container topologies, packets are switched by VPP DUT
multiple times: twice for a single Container, three times for two
Containers, N+1 times for N Containers. Hence the external throughput
rates measured by TG and listed in this report must be multiplied by N+1
to represent the actual VPP DUT aggregate packet forwarding rate.

For "Parallel" and "Horizontal" service topologies, packets are always
switched by VPP DUT twice per service chain.
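The same relationship can be written compactly. The symbols are
introduced here for illustration only: :math:`R_{TG}` is the external
rate measured by TG, :math:`R_{DUT}` is the VPP DUT aggregate forwarding
rate and :math:`N` is the number of Containers in the chain. The second
line assumes that each packet traverses exactly one service chain.

.. math::

    R_{DUT} = (N + 1) \cdot R_{TG} \quad \text{(Chained)}

    R_{DUT} = 2 \cdot R_{TG} \quad \text{(Parallel, Horizontal)}

For example, a "Chained" topology with :math:`N = 3` Containers gives
:math:`R_{DUT} = 4 \cdot R_{TG}`.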

Note that reported VPP DUT performance results are specific to the SUTs
tested. SUTs with processors other than the ones used in the FD.io lab
are likely to yield different results. As with the NIC-to-NIC switching
topology, one can expect the forwarding performance to be proportional
to processor core frequency for the same processor architecture,
assuming the processor is the only limiting factor. However, due to the
much higher dependency on intensive memory operations in Container
service chained topologies and the sensitivity to Linux scheduler
settings and behaviour, this estimate may not always be accurate enough.

Performance Tests Coverage
--------------------------

Performance tests measure the following metrics for tested VPP DUT
topologies and configurations:

- Packet Throughput: measured in accordance with :rfc:`2544`, using
  FD.io CSIT Multiple Loss Ratio search (MLRsearch), an optimized binary
  search algorithm, producing throughput at different Packet Loss Ratio
  (PLR) values:

  - Non Drop Rate (NDR): packet throughput at PLR=0%.
  - Partial Drop Rate (PDR): packet throughput at PLR=0.5%.

- One-Way Packet Latency: measured at different offered packet loads:

  - 100% of discovered NDR throughput.
  - 100% of discovered PDR throughput.

- Maximum Receive Rate (MRR): measures the packet forwarding rate under
  the maximum load offered by the traffic generator over a set trial
  duration, regardless of packet loss. The maximum load for a specified
  Ethernet frame size is set to the bi-directional link rate.
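To illustrate how NDR and PDR relate to a loss-ratio target, the sketch
below runs a plain binary search over the offered load. It is not the
MLRsearch algorithm, which is an optimized variant; it only shows the
underlying idea. The ``measure_loss_ratio`` callback and the toy DUT
model are hypothetical stand-ins for a real TG trial.

.. code-block:: python

    def search_throughput(measure_loss_ratio, max_rate, target_plr, precision):
        """Find the highest offered rate whose loss ratio is <= target_plr.

        measure_loss_ratio(rate) runs one trial and returns lost/sent as a
        fraction (hypothetical callback). max_rate must be positive.
        """
        if measure_loss_ratio(max_rate) <= target_plr:
            return max_rate
        lower, upper = 0.0, max_rate  # lower passes, upper fails
        while upper - lower > precision:
            middle = (lower + upper) / 2.0
            if measure_loss_ratio(middle) <= target_plr:
                lower = middle
            else:
                upper = middle
        return lower


    def simulated_dut(capacity):
        """Toy DUT model: forwards up to `capacity`, drops the excess."""
        def measure(rate):
            return max(0.0, rate - capacity) / rate
        return measure


    measure = simulated_dut(10.0)  # hypothetical 10 Mpps DUT capacity
    max_load = 29.76               # e.g. 2 x 14.88 Mpps, 64B frames on 10GE
    ndr = search_throughput(measure, max_load, 0.0, 0.01)    # PLR = 0%
    pdr = search_throughput(measure, max_load, 0.005, 0.01)  # PLR = 0.5%

Because PDR tolerates a small loss ratio, the PDR result is never lower
than the NDR result for the same DUT.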

CSIT |release| includes the following performance test areas, covered
across a range of NIC drivers and NIC models:

+-----------------------+----------------------------------------------+
| Test Area             |  Description                                 |
+=======================+==============================================+
| ACL                   | L2 Bridge-Domain switching and               |
|                       | IPv4 and IPv6 routing with iACL and oACL IP  |
|                       | address, MAC address and L4 port security.   |
+-----------------------+----------------------------------------------+
| COP                   | IPv4 and IPv6 routing with COP address       |
|                       | security.                                    |
+-----------------------+----------------------------------------------+
| IPv4                  | IPv4 routing.                                |
+-----------------------+----------------------------------------------+
| IPv6                  | IPv6 routing.                                |
+-----------------------+----------------------------------------------+
| IPv4 Scale            | IPv4 routing with 20k, 200k and 2M FIB       |
|                       | entries.                                     |
+-----------------------+----------------------------------------------+
| IPv6 Scale            | IPv6 routing with 20k, 200k and 2M FIB       |
|                       | entries.                                     |
+-----------------------+----------------------------------------------+
| IPSecHW               | IPSec encryption with AES-GCM, CBC-SHA1      |
|                       | ciphers, in combination with IPv4 routing.   |
|                       | Intel QAT HW acceleration.                   |
+-----------------------+----------------------------------------------+
| IPSec+LISP            | IPSec encryption with CBC-SHA1 ciphers, in   |
|                       | combination with LISP-GPE overlay tunneling  |
|                       | for IPv4-over-IPv4.                          |
+-----------------------+----------------------------------------------+
| IPSecSW               | IPSec encryption with AES-GCM, CBC-SHA1      |
|                       | ciphers, in combination with IPv4 routing.   |
+-----------------------+----------------------------------------------+
| K8s Containers Memif  | K8s orchestrated container VPP service chain |
|                       | topologies connected over the memif virtual  |
|                       | interface.                                   |
+-----------------------+----------------------------------------------+
| KVM VMs vhost-user    | Virtual topologies with service              |
|                       | chains of 1 and 2 VMs using vhost-user       |
|                       | interfaces, with different VPP forwarding    |
|                       | modes incl. L2XC, L2BD, VXLAN with L2BD,     |
|                       | IPv4 routing.                                |
+-----------------------+----------------------------------------------+
| L2BD                  | L2 Bridge-Domain switching of untagged       |
|                       | Ethernet frames with MAC learning; disabled  |
|                       | MAC learning i.e. static MAC tests to be     |
|                       | added.                                       |
+-----------------------+----------------------------------------------+
| L2BD Scale            | L2 Bridge-Domain switching of untagged       |
|                       | Ethernet frames with MAC learning; disabled  |
|                       | MAC learning i.e. static MAC tests to be     |
|                       | added with 20k, 200k and 2M FIB entries.     |
+-----------------------+----------------------------------------------+
| L2XC                  | L2 Cross-Connect switching of untagged,      |
|                       | dot1q, dot1ad VLAN tagged Ethernet frames.   |
+-----------------------+----------------------------------------------+
| LISP                  | LISP overlay tunneling for IPv4-over-IPv4,   |
|                       | IPv6-over-IPv4, IPv6-over-IPv6,              |
|                       | IPv4-over-IPv6 in IPv4 and IPv6 routing      |
|                       | modes.                                       |
+-----------------------+----------------------------------------------+
| LXC/DRC Containers    | Container VPP memif virtual interface tests  |
| Memif                 | with different VPP forwarding modes incl.    |
|                       | L2XC, L2BD.                                  |
+-----------------------+----------------------------------------------+
| NAT                   | (Source) Network Address Translation tests   |
|                       | with varying number of users and ports per   |
|                       | user.                                        |
+-----------------------+----------------------------------------------+
| QoS Policer           | Ingress packet rate measuring, marking and   |
|                       | limiting (IPv4).                             |
+-----------------------+----------------------------------------------+
| SRv6 Routing          | Segment Routing IPv6 tests.                  |
+-----------------------+----------------------------------------------+
| VPP TCP/IP stack      | Tests of VPP TCP/IP stack used with VPP      |
|                       | built-in HTTP server.                        |
+-----------------------+----------------------------------------------+
| VXLAN                 | VXLAN overlay tunnelling integration with    |
|                       | L2XC and L2BD.                               |
+-----------------------+----------------------------------------------+

Execution of performance tests takes time, especially the throughput
tests. Due to limited HW testbed resources available within FD.io labs
hosted by :abbr:`LF (Linux Foundation)`, the number of tests for some
NIC models has been limited to a few baseline tests.

Performance Tests Naming
------------------------

FD.io CSIT |release| follows a common structured naming convention for
all performance and system functional tests, introduced in CSIT rls1701.

The naming should be intuitive for the majority of the tests. A complete
description of the FD.io CSIT test naming convention is provided in
:ref:`csit_test_naming`.