docs/report/vpp_performance_tests/overview.rst

   1 Overview
   2 ========
   3
   4 For a description of the physical testbeds used for VPP performance
   5 tests, please refer to :ref:`tested_physical_topologies`.
   6
   7 .. _tested_logical_topologies:
   8
   9 Logical Topologies
  10 ------------------
  11
  12 CSIT VPP performance tests are executed on physical testbeds described
  13 in :ref:`tested_physical_topologies`. Based on the packet path through
  14 server SUTs, three distinct logical topology types are used for VPP DUT
  15 data plane testing:
  16
  17 #. NIC-to-NIC switching topologies.
  18 #. VM service switching topologies.
  19 #. Container service switching topologies.
  20
  21 NIC-to-NIC Switching
  22 ~~~~~~~~~~~~~~~~~~~~
  23
  24 The simplest logical topology for a software data plane application like
  25 VPP is NIC-to-NIC switching. Tested topologies for 2-Node and 3-Node
  26 testbeds are shown in the figures below.
  27
  28 .. only:: latex
  29
  30     .. raw:: latex
  31
  32         \begin{figure}[H]
  33             \centering
  34                 \graphicspath{{../_tmp/src/vpp_performance_tests/}}
  35                 \includegraphics[width=0.90\textwidth]{logical-2n-nic2nic}
  36                 \label{fig:logical-2n-nic2nic}
  37         \end{figure}
  38
  39 .. only:: html
  40
  41     .. figure:: logical-2n-nic2nic.svg
  42         :alt: logical-2n-nic2nic
  43         :align: center
  44
  45
  46 .. only:: latex
  47
  48     .. raw:: latex
  49
  50         \begin{figure}[H]
  51             \centering
  52                 \graphicspath{{../_tmp/src/vpp_performance_tests/}}
  53                 \includegraphics[width=0.90\textwidth]{logical-3n-nic2nic}
  54                 \label{fig:logical-3n-nic2nic}
  55         \end{figure}
  56
  57 .. only:: html
  58
  59     .. figure:: logical-3n-nic2nic.svg
  60         :alt: logical-3n-nic2nic
  61         :align: center
  62
  63 Server Systems Under Test (SUTs) run the VPP application in Linux
  64 user mode as a Device Under Test (DUT). The server Traffic Generator
  65 (TG) runs the T-Rex application. Physical connectivity between SUTs and
  66 TG is provided using different drivers and NIC models that need to be
  67 tested for performance (packet/bandwidth throughput and latency).
  68
  69 From SUT and DUT perspectives, all performance tests involve forwarding
  70 packets between two (or more) physical Ethernet ports (10GE, 25GE, 40GE,
  71 100GE). In most cases both physical ports on the SUT are located on the
  72 same NIC. The only exceptions are link bonding and 100GE tests. In the
  73 latter case only one port per NIC can be driven at line rate due to PCIe
  74 Gen3 x16 slot bandwidth limitations. 100GE NICs are not supported in
  75 PCIe Gen3 x8 slots.
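
The PCIe constraint above can be illustrated with a rough calculation. The sketch below is not part of CSIT. It assumes the commonly quoted PCIe Gen3 figures of 8 GT/s per lane with 128b/130b encoding, ignores TLP/DLLP protocol overhead, and uses hypothetical helper names.

```python
# Raw PCIe Gen3 link bandwidth per direction, before protocol overhead.
GEN3_GT_PER_LANE = 8.0      # GT/s per lane (assumed Gen3 signalling rate)
GEN3_ENCODING = 128 / 130   # 128b/130b line encoding efficiency


def pcie_gen3_gbps(lanes: int) -> float:
    """Return the raw per-direction bandwidth of a Gen3 link in Gbit/s."""
    return lanes * GEN3_GT_PER_LANE * GEN3_ENCODING


def ports_at_line_rate(lanes: int, port_gbps: float) -> int:
    """Return how many ports of the given speed the slot can feed fully."""
    return int(pcie_gen3_gbps(lanes) // port_gbps)


for lanes in (8, 16):
    print(f"x{lanes}: {pcie_gen3_gbps(lanes):.1f} Gbit/s raw, "
          f"{ports_at_line_rate(lanes, 100.0)} x 100GE at line rate")
```

Under these assumptions an x16 slot offers roughly 126 Gbit/s per direction, enough for one 100GE port at line rate but not two, while an x8 slot (about 63 Gbit/s) cannot sustain even one.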
  76
  77 Note that reported VPP DUT performance results are specific to the SUTs
  78 tested. SUTs with processors other than the ones used in the FD.io lab
  79 are likely to yield different results. A good rule of thumb for
  80 estimating VPP packet throughput in the NIC-to-NIC switching topology
  81 is to expect the forwarding performance to be proportional to processor
  82 core frequency for the same processor architecture, assuming the
  83 processor is the only limiting factor and all other SUT parameters are
  84 equivalent to the FD.io CSIT environment.
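
The rule of thumb above amounts to a simple linear scaling. A minimal sketch, assuming the same processor architecture and no other bottleneck; the function name and the sample figures are hypothetical and are not CSIT results.

```python
def estimate_throughput_mpps(measured_mpps: float,
                             measured_ghz: float,
                             target_ghz: float) -> float:
    """Scale a measured packet rate linearly with processor core frequency.

    Only meaningful when both processors share the same architecture and
    the processor is the sole limiting factor.
    """
    if measured_ghz <= 0 or target_ghz <= 0:
        raise ValueError("core frequencies must be positive")
    return measured_mpps * target_ghz / measured_ghz


# Hypothetical example: 10 Mpps measured at 2.5 GHz, target core at 2.0 GHz.
print(estimate_throughput_mpps(10.0, 2.5, 2.0))
```

The result is an estimate only. Differences in memory, NIC, or firmware configuration can outweigh the frequency ratio.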
  85
  86 VM Service Switching
  87 ~~~~~~~~~~~~~~~~~~~~
  88
  89 VM service switching topology test cases require VPP DUT to communicate
  90 with Virtual Machines (VMs) over vhost-user virtual interfaces.
  91
  92 Two types of VM service topologies are tested in |csit-release|:
  93
  94 #. "Parallel" topology with packets flowing within SUT from NIC(s) via
  95    VPP DUT to VM, back to VPP DUT, then out through NIC(s).
  96
  97 #. "Chained" topology (a.k.a. "Snake") with packets flowing within SUT
  98    from NIC(s) via VPP DUT to VM, back to VPP DUT, then to the next VM,
  99    back to VPP DUT, and so on until the last VM in the chain, then
 100    back to VPP DUT and out through NIC(s).
 101
 102 For each of the above topologies, VPP DUT is tested in a range of L2
 103 or IPv4/IPv6 configurations depending on the test suite. Sample VPP DUT
 104 "Chained" VM service topologies for 2-Node and 3-Node testbeds with each
 105 SUT running N VM instances are shown in the figures below.
 106
 107 .. only:: latex
 108
 109     .. raw:: latex
 110
 111         \begin{figure}[H]
 112             \centering
 113                 \graphicspath{{../_tmp/src/vpp_performance_tests/}}
 114                 \includegraphics[width=0.90\textwidth]{logical-2n-vm-vhost}
 115                 \label{fig:logical-2n-vm-vhost}
 116         \end{figure}
 117
 118 .. only:: html
 119
 120     .. figure:: logical-2n-vm-vhost.svg
 121         :alt: logical-2n-vm-vhost
 122         :align: center
 123
 124
 125 .. only:: latex
 126
 127     .. raw:: latex
 128
 129         \begin{figure}[H]
 130             \centering
 131                 \graphicspath{{../_tmp/src/vpp_performance_tests/}}
 132                 \includegraphics[width=0.90\textwidth]{logical-3n-vm-vhost}
 133                 \label{fig:logical-3n-vm-vhost}
 134         \end{figure}
 135
 136 .. only:: html
 137
 138     .. figure:: logical-3n-vm-vhost.svg
 139         :alt: logical-3n-vm-vhost
 140         :align: center
 141
 142 In "Chained" VM topologies, packets are switched by VPP DUT multiple
 143 times: twice for a single VM, three times for two VMs, N+1 times for N
 144 VMs. Hence the external throughput rates measured by TG and listed in
 145 this report must be multiplied by N+1 to represent the actual VPP DUT
 146 aggregate packet forwarding rate.
 147
 148 For the "Parallel" service topology, packets are always switched twice
 149 by VPP DUT per service chain.
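
The multipliers described above can be written down as a small helper. This is an illustrative sketch only: the function names and the topology strings are hypothetical, and the sample rate is not a measured CSIT result.

```python
def vpp_switch_count(topology: str, n_instances: int) -> int:
    """Return how many times VPP DUT switches each packet.

    "chained":  N+1 times for N service instances in the chain.
    "parallel": always twice per service chain.
    """
    if n_instances < 1:
        raise ValueError("at least one service instance is required")
    if topology == "chained":
        return n_instances + 1
    if topology == "parallel":
        return 2
    raise ValueError(f"unknown topology: {topology!r}")


def aggregate_rate_mpps(external_mpps: float, topology: str,
                        n_instances: int) -> float:
    """Convert a TG-measured external rate to the VPP aggregate rate."""
    return external_mpps * vpp_switch_count(topology, n_instances)


# Hypothetical example: 5 Mpps measured by TG across a chain of two VMs.
print(aggregate_rate_mpps(5.0, "chained", 2))
```

For a chain of two VMs each packet crosses VPP three times, so an external rate of 5 Mpps corresponds to an aggregate VPP forwarding rate of 15 Mpps.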
 150
 151 Note that reported VPP DUT performance results are specific to the SUTs
 152 tested. SUTs with processors other than the ones used in the FD.io lab
 153 are likely to yield different results. As with the NIC-to-NIC switching
 154 topology, one can expect the forwarding performance to be proportional
 155 to processor core frequency for the same processor architecture,
 156 assuming the processor is the only limiting factor. However, due to the
 157 much higher dependency on intensive memory operations in VM
 158 service chained topologies and the sensitivity to Linux scheduler
 159 settings and behaviour, this estimate may not always be sufficiently
 160 accurate.
 161
 162 Container Service Switching
 163 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
 164
 165 Container service switching topology test cases require VPP DUT to
 166 communicate with Containers (Ctrs) over memif virtual interfaces.
 167
 168 Three types of Container service topologies are tested in |csit-release|:
 169
 170 #. "Parallel" topology with packets flowing within SUT from NIC(s) via
 171    VPP DUT to Container, back to VPP DUT, then out through NIC(s).
 172
 173 #. "Chained" topology (a.k.a. "Snake") with packets flowing within SUT
 174    from NIC(s) via VPP DUT to Container, back to VPP DUT, then to the
 175    next Container, back to VPP DUT, and so on until the last
 176    Container in the chain, then back to VPP DUT and out through NIC(s).
 177
 178 #. "Horizontal" topology with packets flowing within SUT from NIC(s) via
 179    VPP DUT to Container, then via "horizontal" memif to the next
 180    Container, and so on until the last Container, then back to
 181    VPP DUT and out through NIC(s).
 182
 183 For each of the above topologies, VPP DUT is tested in a range of L2
 184 or IPv4/IPv6 configurations depending on the test suite. Sample VPP DUT
 185 "Chained" Container service topologies for 2-Node and 3-Node testbeds
 186 with each SUT running N Container instances are shown in the figures
 187 below.
 188
 189 .. only:: latex
 190
 191     .. raw:: latex
 192
 193         \begin{figure}[H]
 194             \centering
 195                 \graphicspath{{../_tmp/src/vpp_performance_tests/}}
 196                 \includegraphics[width=0.90\textwidth]{logical-2n-container-memif}
 197                 \label{fig:logical-2n-container-memif}
 198         \end{figure}
 199
 200 .. only:: html
 201
 202     .. figure:: logical-2n-container-memif.svg
 203         :alt: logical-2n-container-memif
 204         :align: center
 205
 206
 207 .. only:: latex
 208
 209     .. raw:: latex
 210
 211         \begin{figure}[H]
 212             \centering
 213                 \graphicspath{{../_tmp/src/vpp_performance_tests/}}
 214                 \includegraphics[width=0.90\textwidth]{logical-3n-container-memif}
 215                 \label{fig:logical-3n-container-memif}
 216         \end{figure}
 217
 218 .. only:: html
 219
 220     .. figure:: logical-3n-container-memif.svg
 221         :alt: logical-3n-container-memif
 222         :align: center
 223
 224 In "Chained" Container topologies, packets are switched by VPP DUT
 225 multiple times: twice for a single Container, three times for two
 226 Containers, N+1 times for N Containers. Hence the external throughput
 227 rates measured by TG and listed in this report must be multiplied by N+1
 228 to represent the actual VPP DUT aggregate packet forwarding rate.
 229
 230 For "Parallel" and "Horizontal" service topologies, packets are always
 231 switched by VPP DUT twice per service chain.
 232
 233 Note that reported VPP DUT performance results are specific to the SUTs
 234 tested. SUTs with processors other than the ones used in the FD.io lab
 235 are likely to yield different results. As with the NIC-to-NIC switching
 236 topology, one can expect the forwarding performance to be proportional
 237 to processor core frequency for the same processor architecture,
 238 assuming the processor is the only limiting factor. However, due to the
 239 much higher dependency on intensive memory operations in Container
 240 service chained topologies and the sensitivity to Linux scheduler
 241 settings and behaviour, this estimate may not always be sufficiently
 242 accurate.
 243
 244 Performance Tests Coverage
 245 --------------------------
 246
 247 Performance tests measure the following metrics for tested VPP DUT
 248 topologies and configurations:
 249
 250 - Packet Throughput: measured in accordance with :rfc:`2544`, using
 251   FD.io CSIT Multiple Loss Ratio search (MLRsearch), an optimized binary
 252   search algorithm, producing throughput at different Packet Loss Ratio
 253   (PLR) values:
 254
 255   - Non Drop Rate (NDR): packet throughput at PLR=0%.
 256   - Partial Drop Rate (PDR): packet throughput at PLR=0.5%.
 257
 258 - One-Way Packet Latency: measured at different offered packet loads:
 259
 260   - 100% of discovered NDR throughput.
 261   - 100% of discovered PDR throughput.
 262
 263 - Maximum Receive Rate (MRR): packet forwarding rate measured under the
 264   maximum load offered by the traffic generator over a set trial
 265   duration, regardless of packet loss. The maximum load for a specified
 266   Ethernet frame size is set to the bi-directional link rate.
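
To make the MRR offered load concrete, the bi-directional link rate for a given frame size can be computed as follows. This is a sketch under stated assumptions, not CSIT code: it assumes the frame size includes the FCS and that each frame carries 20 bytes of additional on-wire overhead (preamble, start-of-frame delimiter and inter-frame gap). The function names are hypothetical.

```python
# Per-frame on-wire overhead: preamble (7) + SFD (1) + inter-frame gap (12).
ETHERNET_OVERHEAD_BYTES = 20


def line_rate_pps(link_bps: float, frame_size: int) -> float:
    """Return the uni-directional line rate in packets per second."""
    if frame_size <= 0:
        raise ValueError("frame size must be positive")
    return link_bps / ((frame_size + ETHERNET_OVERHEAD_BYTES) * 8)


def mrr_max_load_pps(link_bps: float, frame_size: int) -> float:
    """Return the bi-directional link rate used as the MRR offered load."""
    return 2 * line_rate_pps(link_bps, frame_size)


# Example: 64B frames on a 10GE link.
print(f"{line_rate_pps(10e9, 64) / 1e6:.2f} Mpps per direction")
print(f"{mrr_max_load_pps(10e9, 64) / 1e6:.2f} Mpps bi-directional")
```

For 64B frames on 10GE this gives about 14.88 Mpps per direction, or about 29.76 Mpps bi-directional.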
 267
 268 |csit-release| includes the following performance test areas, covered
 269 across a range of NIC drivers and NIC models:
 270
 271 +-----------------------+----------------------------------------------+
 272 | Test Area             |  Description                                 |
 273 +=======================+==============================================+
 274 | ACL                   | L2 Bridge-Domain switching and               |
 275 |                       | IPv4 and IPv6 routing with iACL and oACL IP  |
 276 |                       | address, MAC address and L4 port security.   |
 277 +-----------------------+----------------------------------------------+
 278 | COP                   | IPv4 and IPv6 routing with COP address       |
 279 |                       | security.                                    |
 280 +-----------------------+----------------------------------------------+
 281 | IPv4                  | IPv4 routing.                                |
 282 +-----------------------+----------------------------------------------+
 283 | IPv6                  | IPv6 routing.                                |
 284 +-----------------------+----------------------------------------------+
 285 | IPv4 Scale            | IPv4 routing with 20k, 200k and 2M FIB       |
 286 |                       | entries.                                     |
 287 +-----------------------+----------------------------------------------+
 288 | IPv6 Scale            | IPv6 routing with 20k, 200k and 2M FIB       |
 289 |                       | entries.                                     |
 290 +-----------------------+----------------------------------------------+
 291 | IPSecHW               | IPSec encryption with AES-GCM, CBC-SHA1      |
 292 |                       | ciphers, in combination with IPv4 routing.   |
 293 |                       | Intel QAT HW acceleration.                   |
 294 +-----------------------+----------------------------------------------+
 295 | IPSec+LISP            | IPSec encryption with CBC-SHA1 ciphers, in   |
 296 |                       | combination with LISP-GPE overlay tunneling  |
 297 |                       | for IPv4-over-IPv4.                          |
 298 +-----------------------+----------------------------------------------+
 299 | IPSecSW               | IPSec encryption with AES-GCM, CBC-SHA1      |
 300 |                       | ciphers, in combination with IPv4 routing.   |
 301 +-----------------------+----------------------------------------------+
 302 | K8s Containers Memif  | K8s orchestrated container VPP service chain |
 303 |                       | topologies connected over the memif virtual  |
 304 |                       | interface.                                   |
 305 +-----------------------+----------------------------------------------+
 306 | KVM VMs vhost-user    | Virtual topologies with service              |
 307 |                       | chains of 1 and 2 VMs using vhost-user       |
 308 |                       | interfaces, with different VPP forwarding    |
 309 |                       | modes incl. L2XC, L2BD, VXLAN with L2BD,     |
 310 |                       | IPv4 routing.                                |
 311 +-----------------------+----------------------------------------------+
 312 | L2BD                  | L2 Bridge-Domain switching of untagged       |
 313 |                       | Ethernet frames with MAC learning; disabled  |
 314 |                       | MAC learning i.e. static MAC tests to be     |
 315 |                       | added.                                       |
 316 +-----------------------+----------------------------------------------+
 317 | L2BD Scale            | L2 Bridge-Domain switching of untagged       |
 318 |                       | Ethernet frames with MAC learning; disabled  |
 319 |                       | MAC learning i.e. static MAC tests to be     |
 320 |                       | added with 20k, 200k and 2M FIB entries.     |
 321 +-----------------------+----------------------------------------------+
 322 | L2XC                  | L2 Cross-Connect switching of untagged,      |
 323 |                       | dot1q, dot1ad VLAN tagged Ethernet frames.   |
 324 +-----------------------+----------------------------------------------+
 325 | LISP                  | LISP overlay tunneling for IPv4-over-IPv4,   |
 326 |                       | IPv6-over-IPv4, IPv6-over-IPv6,              |
 327 |                       | IPv4-over-IPv6 in IPv4 and IPv6 routing      |
 328 |                       | modes.                                       |
 329 +-----------------------+----------------------------------------------+
 330 | LXC/DRC Containers    | Container VPP memif virtual interface tests  |
 331 | Memif                 | with different VPP forwarding modes incl.    |
 332 |                       | L2XC, L2BD.                                  |
 333 +-----------------------+----------------------------------------------+
 334 | NAT                   | (Source) Network Address Translation tests   |
 335 |                       | with varying number of users and ports per   |
 336 |                       | user.                                        |
 337 +-----------------------+----------------------------------------------+
 338 | QoS Policer           | Ingress packet rate measuring, marking and   |
 339 |                       | limiting (IPv4).                             |
 340 +-----------------------+----------------------------------------------+
 341 | SRv6 Routing          | Segment Routing IPv6 tests.                  |
 342 +-----------------------+----------------------------------------------+
 343 | VPP TCP/IP stack      | Tests of VPP TCP/IP stack used with VPP      |
 344 |                       | built-in HTTP server.                        |
 345 +-----------------------+----------------------------------------------+
 346 | VXLAN                 | VXLAN overlay tunnelling integration with    |
 347 |                       | L2XC and L2BD.                               |
 348 +-----------------------+----------------------------------------------+
 349
 350 Execution of performance tests takes time, especially the throughput
 351 tests. Due to limited HW testbed resources available within FD.io labs
 352 hosted by :abbr:`LF (Linux Foundation)`, the number of tests for some
 353 NIC models has been limited to a few baseline tests.
 354
 355 Performance Tests Naming
 356 ------------------------
 357
 358 FD.io |csit-release| follows a common structured naming convention for
 359 all performance and system functional tests, introduced in CSIT-17.01.
 360
 361 The naming should be intuitive for the majority of tests. A complete
 362 description of the FD.io CSIT test naming convention is provided in
 363 :ref:`csit_test_naming`.