docs/report/vpp_performance_tests/overview.rst

Overview
========

VPP performance test results are reported for a range of processors.
For a description of the physical testbeds used for VPP performance
tests, please refer to :ref:`tested_physical_topologies`.

.. _tested_logical_topologies:

Logical Topologies
------------------

CSIT VPP performance tests are executed on physical testbeds described
in :ref:`tested_physical_topologies`. Based on the packet path through
server SUTs, three distinct logical topology types are used for VPP DUT
data plane testing:

#. NIC-to-NIC switching topologies.
#. VM service switching topologies.
#. Container service switching topologies.

NIC-to-NIC Switching
~~~~~~~~~~~~~~~~~~~~

The simplest logical topology for a software data plane application like
VPP is NIC-to-NIC switching. Tested topologies for 2-Node and 3-Node
testbeds are shown in the figures below.

.. only:: latex

    .. raw:: latex

        \begin{figure}[H]
            \centering
                \graphicspath{{../_tmp/src/vpp_performance_tests/}}
                \includegraphics[width=0.90\textwidth]{logical-2n-nic2nic}
                \label{fig:logical-2n-nic2nic}
        \end{figure}

.. only:: html

    .. figure:: logical-2n-nic2nic.svg
        :alt: logical-2n-nic2nic
        :align: center


.. only:: latex

    .. raw:: latex

        \begin{figure}[H]
            \centering
                \graphicspath{{../_tmp/src/vpp_performance_tests/}}
                \includegraphics[width=0.90\textwidth]{logical-3n-nic2nic}
                \label{fig:logical-3n-nic2nic}
        \end{figure}

.. only:: html

    .. figure:: logical-3n-nic2nic.svg
        :alt: logical-3n-nic2nic
        :align: center

Server Systems Under Test (SUT) run the VPP application in Linux user
mode as a Device Under Test (DUT). The server Traffic Generator (TG) runs
the TRex application. Physical connectivity between SUTs and TG is
provided using different drivers and NIC models that need to be tested
for performance (packet/bandwidth throughput and latency).

From SUT and DUT perspectives, all performance tests involve forwarding
packets between two (or more) physical Ethernet ports (10GE, 25GE, 40GE,
100GE). In most cases both physical ports on SUT are located on the same
NIC. The only exceptions are link bonding and 100GE tests. In the latter
case only one port per NIC can be driven at line rate due to PCIe Gen3
x16 slot bandwidth limitations. 100GE NICs are not supported in PCIe Gen3
x8 slots.
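
The PCIe constraint described above can be checked with
back-of-the-envelope arithmetic. The sketch below assumes the nominal
PCIe Gen3 signalling rate of 8 GT/s per lane with 128b/130b encoding and
ignores transaction-layer protocol overhead, so usable bandwidth is
somewhat lower in practice. The function and constant names are
illustrative only and are not part of CSIT.

```python
# Rough PCIe Gen3 slot bandwidth check (per direction).
# Assumptions: 8 GT/s per lane, 128b/130b line encoding, no TLP/DLLP overhead.
PCIE_GEN3_GTPS_PER_LANE = 8.0
ENCODING_EFFICIENCY = 128 / 130


def pcie_gen3_gbps(lanes: int) -> float:
    """Return the approximate raw bandwidth of a Gen3 slot in Gbit/s."""
    return lanes * PCIE_GEN3_GTPS_PER_LANE * ENCODING_EFFICIENCY


def ports_at_line_rate(lanes: int, port_gbps: float) -> int:
    """Return how many ports of the given speed fit into the slot bandwidth."""
    return int(pcie_gen3_gbps(lanes) // port_gbps)


for lanes in (16, 8):
    print(
        f"x{lanes}: {pcie_gen3_gbps(lanes):.1f} Gbit/s, "
        f"100GE ports at line rate: {ports_at_line_rate(lanes, 100)}"
    )
```

Under these assumptions an x16 slot offers roughly 126 Gbit/s, enough
for one 100GE port but not two, while an x8 slot offers roughly
63 Gbit/s, which is below a single 100GE port.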

Note that reported VPP DUT performance results are specific to the SUTs
tested. SUTs with processors other than the ones used in the FD.io lab
are likely to yield different results. A good rule of thumb for
estimating VPP packet throughput in the NIC-to-NIC switching topology is
to expect the forwarding performance to be proportional to processor
core frequency for the same processor architecture, assuming the
processor is the only limiting factor and all other SUT parameters are
equivalent to the FD.io CSIT environment.
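
The rule of thumb above amounts to a linear scaling by the ratio of core
frequencies. A minimal sketch follows; the function name and the sample
numbers are hypothetical and are not CSIT results, and the estimate only
holds under the stated assumptions (same processor architecture,
processor as the only limiting factor).

```python
def estimate_throughput_mpps(ref_mpps: float, ref_ghz: float, target_ghz: float) -> float:
    """Scale a reference throughput linearly with processor core frequency."""
    if ref_ghz <= 0 or target_ghz <= 0:
        raise ValueError("core frequencies must be positive")
    return ref_mpps * target_ghz / ref_ghz


# Hypothetical values for illustration only: a 10 Mpps result measured on a
# 2.5 GHz core, projected onto a 2.0 GHz core of the same architecture.
estimate = estimate_throughput_mpps(ref_mpps=10.0, ref_ghz=2.5, target_ghz=2.0)
print(f"estimated throughput: {estimate:.1f} Mpps")
```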

VM Service Switching
~~~~~~~~~~~~~~~~~~~~

VM service switching topology test cases require VPP DUT to communicate
with Virtual Machines (VMs) over vhost-user virtual interfaces.

Two types of VM service topologies are tested in |csit-release|:

#. "Parallel" topology with packets flowing within SUT from NIC(s) via
   VPP DUT to VM, back to VPP DUT, then out through NIC(s).

#. "Chained" topology (a.k.a. "Snake") with packets flowing within SUT
   from NIC(s) via VPP DUT to VM, back to VPP DUT, then to the next VM,
   back to VPP DUT and so on until the last VM in a chain, then back to
   VPP DUT and out through NIC(s).

For each of the above topologies, VPP DUT is tested in a range of L2
or IPv4/IPv6 configurations depending on the test suite. Sample VPP DUT
"Chained" VM service topologies for 2-Node and 3-Node testbeds, with
each SUT running N VM instances, are shown in the figures below.

.. only:: latex

    .. raw:: latex

        \begin{figure}[H]
            \centering
                \graphicspath{{../_tmp/src/vpp_performance_tests/}}
                \includegraphics[width=0.90\textwidth]{logical-2n-vm-vhost}
                \label{fig:logical-2n-vm-vhost}
        \end{figure}

.. only:: html

    .. figure:: logical-2n-vm-vhost.svg
        :alt: logical-2n-vm-vhost
        :align: center


.. only:: latex

    .. raw:: latex

        \begin{figure}[H]
            \centering
                \graphicspath{{../_tmp/src/vpp_performance_tests/}}
                \includegraphics[width=0.90\textwidth]{logical-3n-vm-vhost}
                \label{fig:logical-3n-vm-vhost}
        \end{figure}

.. only:: html

    .. figure:: logical-3n-vm-vhost.svg
        :alt: logical-3n-vm-vhost
        :align: center

In "Chained" VM topologies, packets are switched by VPP DUT multiple
times: twice for a single VM, three times for two VMs, N+1 times for N
VMs. Hence the external throughput rates measured by TG and listed in
this report must be multiplied by N+1 to represent the actual VPP DUT
aggregate packet forwarding rate.

In the "Parallel" service topology, packets are always switched twice by
VPP DUT per service chain.
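
The conversion from the externally measured rate to the VPP DUT
aggregate forwarding rate can be sketched as follows. The function names
and the sample rate are hypothetical. Applying the same multiplication
to the "Parallel" topology (a factor of two) is an assumption made here
by analogy with the "Chained" case.

```python
def vpp_switch_count(topology: str, vms: int) -> int:
    """Return how many times VPP DUT switches each packet in one service chain."""
    if vms < 1:
        raise ValueError("at least one VM is required")
    if topology == "chained":
        return vms + 1  # NIC -> VM1 -> ... -> VMn -> NIC, via VPP DUT each time
    if topology == "parallel":
        return 2  # NIC -> VM -> NIC, via VPP DUT each time
    raise ValueError(f"unknown topology: {topology}")


def aggregate_rate_mpps(external_mpps: float, topology: str, vms: int) -> float:
    """Convert a TG-measured rate into the VPP DUT aggregate forwarding rate."""
    return external_mpps * vpp_switch_count(topology, vms)


# Hypothetical external rate of 3.0 Mpps, for illustration only.
for vms in (1, 2, 4):
    rate = aggregate_rate_mpps(3.0, "chained", vms)
    print(f"chained, {vms} VM(s): {rate:.1f} Mpps aggregate")
```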

Note that reported VPP DUT performance results are specific to the SUTs
tested. SUTs with processors other than the ones used in the FD.io lab
are likely to yield different results. As with the NIC-to-NIC switching
topology, one can expect the forwarding performance to be proportional
to processor core frequency for the same processor architecture,
assuming the processor is the only limiting factor. However, due to the
much higher dependency on intensive memory operations in VM service
chained topologies and the sensitivity to Linux scheduler settings and
behaviour, this estimation may not always be accurate enough.

Container Service Switching
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Container service switching topology test cases require VPP DUT to
communicate with Containers (Ctrs) over memif virtual interfaces.

Three types of Container service topologies are tested in |csit-release|:

#. "Parallel" topology with packets flowing within SUT from NIC(s) via
   VPP DUT to Container, back to VPP DUT, then out through NIC(s).

#. "Chained" topology (a.k.a. "Snake") with packets flowing within SUT
   from NIC(s) via VPP DUT to Container, back to VPP DUT, then to the
   next Container, back to VPP DUT and so on until the last Container
   in a chain, then back to VPP DUT and out through NIC(s).

#. "Horizontal" topology with packets flowing within SUT from NIC(s) via
   VPP DUT to Container, then via "horizontal" memif to the next
   Container, and so on until the last Container, then back to VPP DUT
   and out through NIC(s).

For each of the above topologies, VPP DUT is tested in a range of L2
or IPv4/IPv6 configurations depending on the test suite. Sample VPP DUT
"Chained" Container service topologies for 2-Node and 3-Node testbeds,
with each SUT running N Container instances, are shown in the figures
below.

.. only:: latex

    .. raw:: latex

        \begin{figure}[H]
            \centering
                \graphicspath{{../_tmp/src/vpp_performance_tests/}}
                \includegraphics[width=0.90\textwidth]{logical-2n-container-memif}
                \label{fig:logical-2n-container-memif}
        \end{figure}

.. only:: html

    .. figure:: logical-2n-container-memif.svg
        :alt: logical-2n-container-memif
        :align: center


.. only:: latex

    .. raw:: latex

        \begin{figure}[H]
            \centering
                \graphicspath{{../_tmp/src/vpp_performance_tests/}}
                \includegraphics[width=0.90\textwidth]{logical-3n-container-memif}
                \label{fig:logical-3n-container-memif}
        \end{figure}

.. only:: html

    .. figure:: logical-3n-container-memif.svg
        :alt: logical-3n-container-memif
        :align: center

In "Chained" Container topologies, packets are switched by VPP DUT
multiple times: twice for a single Container, three times for two
Containers, N+1 times for N Containers. Hence the external throughput
rates measured by TG and listed in this report must be multiplied by N+1
to represent the actual VPP DUT aggregate packet forwarding rate.

In "Parallel" and "Horizontal" service topologies, packets are always
switched by VPP DUT twice per service chain.
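
To compare the three Container topologies side by side, the per-chain
switch counts stated above can be tabulated. This is an illustrative
sketch only; the ``Topology`` enum and ``switch_count`` function are
hypothetical names, not CSIT code.

```python
from enum import Enum


class Topology(Enum):
    PARALLEL = "parallel"
    CHAINED = "chained"
    HORIZONTAL = "horizontal"


def switch_count(topology: Topology, containers: int) -> int:
    """Number of times VPP DUT switches a packet in one service chain."""
    if containers < 1:
        raise ValueError("at least one Container is required")
    # Only the chained topology returns to VPP DUT between Containers.
    return containers + 1 if topology is Topology.CHAINED else 2


for n in (1, 2, 4):
    counts = {t.value: switch_count(t, n) for t in Topology}
    print(f"N={n}: {counts}")
```

Only the "Chained" count grows with N, which is why its external rates
need the N+1 correction while the other two topologies keep a constant
factor.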

Note that reported VPP DUT performance results are specific to the SUTs
tested. SUTs with processors other than the ones used in the FD.io lab
are likely to yield different results. As with the NIC-to-NIC switching
topology, one can expect the forwarding performance to be proportional
to processor core frequency for the same processor architecture,
assuming the processor is the only limiting factor. However, due to the
much higher dependency on intensive memory operations in Container
service chained topologies and the sensitivity to Linux scheduler
settings and behaviour, this estimation may not always be accurate
enough.

Performance Tests Coverage
--------------------------

Performance tests measure the following metrics for tested VPP DUT
topologies and configurations:

- Packet Throughput: measured in accordance with :rfc:`2544`, using
  FD.io CSIT Multiple Loss Ratio search (MLRsearch), an optimized binary
  search algorithm, producing throughput at different Packet Loss Ratio
  (PLR) values:

  - Non Drop Rate (NDR): packet throughput at PLR=0%.
  - Partial Drop Rate (PDR): packet throughput at PLR=0.5%.

- One-Way Packet Latency: measured at different offered packet loads:

  - 90% of discovered PDR throughput.
  - 50% of discovered PDR throughput.
  - 10% of discovered PDR throughput.
  - Minimal offered load.

- Maximum Receive Rate (MRR): packet forwarding rate measured under the
  maximum load offered by the traffic generator over a set trial
  duration, regardless of packet loss. Maximum load for the specified
  Ethernet frame size is set to the bi-directional link rate, unless
  there is a known limitation preventing the Traffic Generator from
  achieving the line rate.
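
The relationship between these metrics can be illustrated with a small
sketch. It is not MLRsearch: it uses a plain binary search against a
hypothetical, idealised DUT model (``fake_dut``), and all names and
numbers are assumptions made for illustration. The only external fact
relied on is the 20-byte per-frame Ethernet overhead (preamble, start
frame delimiter and inter-frame gap) used to derive the link rate in
packets per second.

```python
from typing import Callable

ETHERNET_OVERHEAD_BYTES = 20  # preamble (7) + SFD (1) + inter-frame gap (12)


def line_rate_pps(link_bps: float, frame_bytes: int) -> float:
    """Maximum frames per second in one direction for a given frame size."""
    return link_bps / ((frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8)


def search_throughput(
    measure_loss_ratio: Callable[[float], float],
    max_rate_pps: float,
    target_plr: float,
    precision_pps: float,
) -> float:
    """Plain binary search for the highest rate with loss ratio <= target_plr."""
    low, high = 0.0, max_rate_pps
    if measure_loss_ratio(high) <= target_plr:
        return high
    while high - low > precision_pps:
        mid = (low + high) / 2
        if measure_loss_ratio(mid) <= target_plr:
            low = mid
        else:
            high = mid
    return low


def fake_dut(offered_pps: float, capacity_pps: float = 10e6) -> float:
    """Hypothetical DUT: forwards up to capacity_pps and drops the excess."""
    if offered_pps <= capacity_pps:
        return 0.0
    return (offered_pps - capacity_pps) / offered_pps


max_load = 2 * line_rate_pps(10e9, 64)  # bi-directional 10GE, 64 B frames
ndr = search_throughput(fake_dut, max_load, target_plr=0.0, precision_pps=1e3)
pdr = search_throughput(fake_dut, max_load, target_plr=0.005, precision_pps=1e3)
latency_loads = [fraction * pdr for fraction in (0.9, 0.5, 0.1)]
print(f"max load {max_load / 1e6:.2f} Mpps")
print(f"NDR {ndr / 1e6:.2f} Mpps, PDR {pdr / 1e6:.2f} Mpps")
```

In this model ``max_load`` corresponds to the MRR offered load, ``ndr``
and ``pdr`` to the two throughput values, and ``latency_loads`` to the
offered loads at which latency would be measured.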

.. todo::

   - Connections per second (CPS): TODO

|csit-release| includes the following VPP data plane functionality,
performance tested across a range of NIC drivers and NIC models:

+-----------------------+----------------------------------------------+
| Functionality         |  Description                                 |
+=======================+==============================================+
| ACL                   | L2 Bridge-Domain switching and               |
|                       | IPv4 and IPv6 routing with iACL and oACL IP  |
|                       | address, MAC address and L4 port security.   |
+-----------------------+----------------------------------------------+
| ADL                   | IPv4 and IPv6 routing with ADL address       |
|                       | security.                                    |
+-----------------------+----------------------------------------------+
| GENEVE                | GENEVE tunnels for IPv4 routing.             |
+-----------------------+----------------------------------------------+
| IPv4                  | IPv4 routing.                                |
+-----------------------+----------------------------------------------+
| IPv6                  | IPv6 routing.                                |
+-----------------------+----------------------------------------------+
| IPv4 Scale            | IPv4 routing with 20k, 200k and 2M FIB       |
|                       | entries.                                     |
+-----------------------+----------------------------------------------+
| IPv6 Scale            | IPv6 routing with 20k, 200k and 2M FIB       |
|                       | entries.                                     |
+-----------------------+----------------------------------------------+
| IPSecAsyncHW          | IPSec encryption with AES-GCM, CBC-SHA-256   |
|                       | ciphers in async mode, in combination with   |
|                       | IPv4 routing. Intel QAT HW acceleration.     |
+-----------------------+----------------------------------------------+
| IPSecHW               | IPSec encryption with AES-GCM, CBC-SHA-256   |
|                       | ciphers, in combination with IPv4 routing.   |
|                       | Intel QAT HW acceleration.                   |
+-----------------------+----------------------------------------------+
| IPSec+LISP            | IPSec encryption with CBC-SHA1 ciphers, in   |
|                       | combination with LISP-GPE overlay tunneling  |
|                       | for IPv4-over-IPv4.                          |
+-----------------------+----------------------------------------------+
| IPSecSW               | IPSec encryption with AES-GCM, CBC-SHA-256   |
|                       | ciphers, in combination with IPv4 routing.   |
+-----------------------+----------------------------------------------+
| KVM VMs vhost-user    | Virtual topologies with service              |
|                       | chains of 1 VM using vhost-user              |
|                       | interfaces, with different VPP forwarding    |
|                       | modes incl. L2XC, L2BD, VXLAN with L2BD,     |
|                       | IPv4 routing.                                |
+-----------------------+----------------------------------------------+
| L2BD                  | L2 Bridge-Domain switching of untagged       |
|                       | Ethernet frames with MAC learning; static    |
|                       | MAC tests (MAC learning disabled) to be      |
|                       | added.                                       |
+-----------------------+----------------------------------------------+
| L2BD Scale            | L2 Bridge-Domain switching of untagged       |
|                       | Ethernet frames with MAC learning and 20k,   |
|                       | 200k and 2M FIB entries; static MAC tests    |
|                       | (MAC learning disabled) to be added.         |
+-----------------------+----------------------------------------------+
| L2XC                  | L2 Cross-Connect switching of untagged,      |
|                       | dot1q, dot1ad VLAN tagged Ethernet frames.   |
+-----------------------+----------------------------------------------+
| LISP                  | LISP overlay tunneling for IPv4-over-IPv4,   |
|                       | IPv6-over-IPv4, IPv6-over-IPv6,              |
|                       | IPv4-over-IPv6 in IPv4 and IPv6 routing      |
|                       | modes.                                       |
+-----------------------+----------------------------------------------+
| LXC/DRC Containers    | Container VPP memif virtual interface tests  |
| Memif                 | with different VPP forwarding modes incl.    |
|                       | L2XC, L2BD.                                  |
+-----------------------+----------------------------------------------+
| NAT44                 | (Source) Network Address Translation         |
|                       | deterministic mode and endpoint-dependent    |
|                       | mode tests with varying number of users and  |
|                       | ports per user for IPv4.                     |
+-----------------------+----------------------------------------------+
| QoS Policer           | Ingress packet rate measuring, marking and   |
|                       | limiting (IPv4).                             |
+-----------------------+----------------------------------------------+
| SRv6 Routing          | Segment Routing IPv6 tests.                  |
+-----------------------+----------------------------------------------+
| VPP TCP/IP stack      | Tests of VPP TCP/IP stack used with VPP      |
|                       | built-in HTTP server.                        |
+-----------------------+----------------------------------------------+
| VTS                   | Virtual Topology System use case tests       |
|                       | combining VXLAN overlay tunneling with L2BD, |
|                       | ACL and KVM VM vhost-user features.          |
+-----------------------+----------------------------------------------+
| VXLAN                 | VXLAN overlay tunnelling integration with    |
|                       | L2XC and L2BD.                               |
+-----------------------+----------------------------------------------+

Execution of performance tests takes time, especially the throughput
tests. Due to limited HW testbed resources available within FD.io labs
hosted by :abbr:`LF (Linux Foundation)`, the number of tests for some
NIC models has been limited to a few baseline tests.

Performance Tests Naming
------------------------

FD.io |csit-release| follows a common structured naming convention for
all performance and system functional tests, introduced in CSIT-17.01.

The naming should be intuitive for the majority of the tests. A complete
description of the FD.io CSIT test naming convention is provided in
:ref:`csit_test_naming`.