X-Git-Url: https://gerrit.fd.io/r/gitweb?p=csit.git;a=blobdiff_plain;f=docs%2Freport%2Fintroduction%2Fmethodology.rst;h=16d3edacdb2bce01c459a9a326a9d660ddb50f76;hp=1a1c349ba293adee139982faa236d67b0e913735;hb=d90c2c87f2738cab2e9a4eca4058b379dd8d4dc8;hpb=5a1f4570778a7511415d94e58cc2299d02b871cd

diff --git a/docs/report/introduction/methodology.rst b/docs/report/introduction/methodology.rst
index 1a1c349ba2..16d3edacdb 100644
--- a/docs/report/introduction/methodology.rst
+++ b/docs/report/introduction/methodology.rst
@@ -1,16 +1,228 @@
-.. _performance_test_methodology:
+.. _test_methodology:
 
-Performance Test Methodology
-============================
+Test Methodology
+================
 
-Throughput
-----------
+VPP Forwarding Modes
+--------------------
+
+VPP is tested in a number of L2 and IP packet lookup and forwarding
+modes. Within each mode baseline and scale tests are executed, the
+latter with varying number of lookup entries.
+
+L2 Ethernet Switching
+~~~~~~~~~~~~~~~~~~~~~
+
+VPP is tested in three L2 forwarding modes:
+
+- *l2patch*: L2 patch, the fastest point-to-point L2 path that loops
+  packets between two interfaces without any Ethernet frame checks or
+  lookups.
+- *l2xc*: L2 cross-connect, point-to-point L2 path with all Ethernet
+  frame checks, but no MAC learning and no MAC lookup.
+- *l2bd*: L2 bridge-domain, multipoint-to-multipoint L2 path with all
+  Ethernet frame checks, with MAC learning (unless static MACs are used)
+  and MAC lookup.
+
+l2bd tests are executed in baseline and scale configurations:
+
+- *l2bdbase*: low number of L2 flows (253 per direction) is switched by
+  VPP. They drive the content of MAC FIB size (506 total MAC entries).
+  Both source and destination MAC addresses are incremented on a packet
+  by packet basis.
+
+- *l2bdscale*: high number of L2 flows is switched by VPP. Tested MAC
+  FIB sizes include: i) 10k (5k unique flows per direction), ii) 100k
+  (2x 50k flows) and iii) 1M (2x 500k). Both source and destination MAC
+  addresses are incremented on a packet by packet basis, ensuring new
+  entries are learn refreshed and looked up at every packet, making it
+  the worst case scenario.
+
+Ethernet wire encapsulations tested include: untagged, dot1q, dot1ad.
+
+IPv4 Routing
+~~~~~~~~~~~~
+
+IPv4 routing tests are executed in baseline and scale configurations:
+
+- *ip4base*: low number of IPv4 flows (253 per direction) is routed by
+  VPP. They drive the content of IPv4 FIB size (506 total /32 prefixes).
+  Destination IPv4 addresses are incremented on a packet by packet
+  basis.
+
+- *ip4scale*: high number of IPv4 flows is routed by VPP. Tested IPv4
+  FIB sizes of /32 prefixes include: i) 20k (10k unique flows per
+  direction), ii) 200k (2x 100k flows) and iii) 2M (2x 1M). Destination
+  IPv4 addresses are incremented on a packet by packet basis, ensuring
+  new FIB entries are looked up at every packet, making it the worst
+  case scenario.
+
+IPv6 Routing
+~~~~~~~~~~~~
+
+IPv6 routing tests are executed in baseline and scale configurations:
+
+- *ip6base*: low number of IPv6 flows (253 per direction) is routed by
+  VPP. They drive the content of IPv6 FIB size (506 total /128 prefixes).
+  Destination IPv6 addresses are incremented on a packet by packet
+  basis.
+
+- *ip6scale*: high number of IPv6 flows is routed by VPP. Tested IPv6
+  FIB sizes of /128 prefixes include: i) 20k (10k unique flows per
+  direction), ii) 200k (2x 100k flows) and iii) 2M (2x 1M). Destination
+  IPv6 addresses are incremented on a packet by packet basis, ensuring
+  new FIB entries are looked up at every packet, making it the worst
+  case scenario.
+
+SRv6 Routing
+~~~~~~~~~~~~
+
+SRv6 routing tests are executed in a number of baseline configurations,
+in each case SR policy and steering policy are configured for one
+direction and one (or two) SR behaviours (functions) in the other
+directions:
+
+- *srv6enc1sid*: One SID (no SRH present), one SR function - End.
+- *srv6enc2sids*: Two SIDs (SRH present), two SR functions - End and
+  End.DX6.
+- *srv6enc2sids-nodecaps*: Two SIDs (SRH present) without decapsulation,
+  one SR function - End.
+- *srv6proxy-dyn*: Dynamic SRv6 proxy, one SR function - End.AD.
+- *srv6proxy-masq*: Masquerading SRv6 proxy, one SR function - End.AM.
+- *srv6proxy-stat*: Static SRv6 proxy, one SR function - End.AS.
+
+In all listed cases low number of IPv6 flows (253 per direction) is
+routed by VPP.
+
+Tunnel Encapsulations
+---------------------
+
+Tunnel encapsulations testing is grouped based on the type of outer
+header: IPv4 or IPv6.
+
+IPv4 Tunnels
+~~~~~~~~~~~~
+
+VPP is tested in the following IPv4 tunnel baseline configurations:
+
+- *ip4vxlan-l2bdbase*: VXLAN over IPv4 tunnels with L2 bridge-domain MAC
+  switching.
+- *ip4vxlan-l2xcbase*: VXLAN over IPv4 tunnels with L2 cross-connect.
+- *ip4lispip4-ip4base*: LISP over IPv4 tunnels with IPv4 routing.
+- *ip4lispip6-ip6base*: LISP over IPv4 tunnels with IPv6 routing.
+
+In all cases listed above low number of MAC, IPv4, IPv6 flows (253 per
+direction) is switched or routed by VPP.
+
+In addition selected IPv4 tunnels are tested at scale:
+
+- *dot1q--ip4vxlanscale-l2bd*: VXLAN over IPv4 tunnels with L2 bridge-
+  domain MAC switching, with scaled up dot1q VLANs (10, 100, 1k),
+  mapped to scaled up L2 bridge-domains (10, 100, 1k), that are in turn
+  mapped to (10, 100, 1k) VXLAN tunnels. 64.5k flows are transmitted per
+  direction.
+
+IPv6 Tunnels
+~~~~~~~~~~~~
+
+VPP is tested in the following IPv6 tunnel baseline configurations:
+
+- *ip6lispip4-ip4base*: LISP over IPv6 tunnels with IPv4 routing.
+- *ip6lispip6-ip6base*: LISP over IPv6 tunnels with IPv6 routing.
 
-Packet and bandwidth throughput are measured in accordance with
-:rfc:`2544`, using FD.io CSIT Multiple Loss Ratio search (MLRsearch), an
-optimized binary search algorithm, that measures SUT/DUT throughput at
-different Packet Loss Ratio (PLR) values.
+In all cases listed above low number of IPv4, IPv6 flows (253 per
+direction) is routed by VPP.
+
+VPP Features
+------------
+
+VPP is tested in a number of data plane feature configurations across
+different forwarding modes. Following sections list features tested.
+
+ACL Security-Groups
+~~~~~~~~~~~~~~~~~~~
+
+Both stateless and stateful access control lists (ACL), also known as
+security-groups, are supported by VPP.
+
+Following ACL configurations are tested for MAC switching with L2
+bridge-domains:
+
+- *l2bdbasemaclrn-iacl{E}sl-{F}flows*: Input stateless ACL, with {E}
+  entries and {F} flows.
+- *l2bdbasemaclrn-oacl{E}sl-{F}flows*: Output stateless ACL, with {E}
+  entries and {F} flows.
+- *l2bdbasemaclrn-iacl{E}sf-{F}flows*: Input stateful ACL, with {E}
+  entries and {F} flows.
+- *l2bdbasemaclrn-oacl{E}sf-{F}flows*: Output stateful ACL, with {E}
+  entries and {F} flows.
+
+Following ACL configurations are tested with IPv4 routing:
+
+- *ip4base-iacl{E}sl-{F}flows*: Input stateless ACL, with {E} entries
+  and {F} flows.
+- *ip4base-oacl{E}sl-{F}flows*: Output stateless ACL, with {E} entries
+  and {F} flows.
+- *ip4base-iacl{E}sf-{F}flows*: Input stateful ACL, with {E} entries and
+  {F} flows.
+- *ip4base-oacl{E}sf-{F}flows*: Output stateful ACL, with {E} entries
+  and {F} flows.
+
+ACL tests are executed with the following combinations of ACL entries
+and number of flows:
+
+- ACL entry definitions
+
+  - flow non-matching deny entry: (src-ip4, dst-ip4, src-port, dst-port).
+  - flow matching permit ACL entry: (src-ip4, dst-ip4).
+
+- {E} - number of non-matching deny ACL entries, {E} = [1, 10, 50].
+- {F} - number of UDP flows with different tuple (src-ip4, dst-ip4,
+  src-port, dst-port), {F} = [100, 10k, 100k].
+- All {E}x{F} combinations are tested per ACL type, total of 9.
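Editorial illustration (not part of the patch): the {E}x{F} matrix described above can be enumerated mechanically. The following is a minimal Python sketch, assuming the test-name templates and value lists quoted in this section. The helper name `acl_test_names` and the string form of the flow counts are hypothetical and do not come from the CSIT code base.

```python
from itertools import product

# Values copied from the lists above; the helper itself is hypothetical.
ACL_ENTRIES = [1, 10, 50]             # {E}: non-matching deny ACL entries
FLOW_COUNTS = ["100", "10k", "100k"]  # {F}: number of UDP flows


def acl_test_names(template):
    """Expand one test-name template into all {E}x{F} combinations."""
    return [
        template.format(E=entries, F=flows)
        for entries, flows in product(ACL_ENTRIES, FLOW_COUNTS)
    ]


names = acl_test_names("ip4base-iacl{E}sl-{F}flows")
print(len(names))            # number of combinations for one ACL type
print(names[0], names[-1])   # first and last expanded test names
```

Applying the same helper to each of the listed templates yields nine test names per ACL type, matching the "total of 9" stated above.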
+
+ACL MAC-IP
+~~~~~~~~~~
+
+MAC-IP binding ACLs are tested for MAC switching with L2 bridge-domains:
+
+- *l2bdbasemaclrn-macip-iacl{E}sl-{F}flows*: Input stateless ACL, with
+  {E} entries and {F} flows.
+
+MAC-IP ACL tests are executed with the following combinations of ACL
+entries and number of flows:
+
+- ACL entry definitions
+
+  - flow non-matching deny entry: (dst-ip4, dst-mac, bit-mask)
+  - flow matching permit ACL entry: (dst-ip4, dst-mac, bit-mask)
+
+- {E} - number of non-matching deny ACL entries, {E} = [1, 10, 50]
+- {F} - number of UDP flows with different tuple (dst-ip4, dst-mac),
+  {F} = [100, 10k, 100k]
+- All {E}x{F} combinations are tested per ACL type, total of 9.
+
+NAT44
+~~~~~
+
+NAT44 is tested in baseline and scale configurations with IPv4 routing:
+
+- *ip4base-nat44*: baseline test with single NAT entry (addr, port),
+  single UDP flow.
+- *ip4base-udpsrcscale{U}-nat44*: baseline test with {U} NAT entries
+  (addr, {U}ports), {U}=15.
+- *ip4scale{R}-udpsrcscale{U}-nat44*: scale tests with {R}*{U} NAT
+  entries ({R}addr, {U}ports), {R}=[100, 1k, 2k, 4k], {U}=15.
+
+Data Plane Throughput
+---------------------
+
+Network data plane packet and bandwidth throughput are measured in
+accordance with :rfc:`2544`, using FD.io CSIT Multiple Loss Ratio search
+(MLRsearch), an optimized throughput search algorithm, that measures
+SUT/DUT packet throughput rates at different Packet Loss Ratio (PLR)
+values.
 
 Following MLRsearch values are measured across a range of L2 frame sizes
 and reported:
 
@@ -32,20 +244,22 @@ and reported:
 NDR and PDR are measured for the following L2 frame sizes (untagged
 Ethernet):
 
-- IPv4 payload: 64B, IMIX_v4_1 (28x64B, 16x570B, 4x1518B), 1518B, 9000B.
-- IPv6 payload: 78B, 1518B, 9000B.
+- IPv4 payload: 64B, IMIX (28x64B, 16x570B, 4x1518B), 1518B, 9000B.
+- IPv6 payload: 78B, IMIX (28x78B, 16x570B, 4x1518B), 1518B, 9000B.
 
 All rates are reported from external Traffic Generator perspective.
 
 .. _mlrsearch_algorithm:
 
-MLRsearch Algorithm
--------------------
+MLRsearch Tests
+---------------
 
-Multiple Loss Rate search (MLRsearch) is a new search algorithm
+Multiple Loss Rate search (MLRsearch) tests use new search algorithm
 implemented in FD.io CSIT project. MLRsearch discovers multiple packet
 throughput rates in a single search, with each rate associated with a
-distinct Packet Loss Ratio (PLR) criteria.
+distinct Packet Loss Ratio (PLR) criteria. MLRsearch is being
+standardized in IETF with `draft-vpolak-mkonstan-mlrsearch-XX
+`_.
 
 Two throughput measurements used in FD.io CSIT are Non-Drop Rate (NDR,
 with zero packet loss, PLR=0) and Partial Drop Rate (PDR, with packet
@@ -166,7 +380,7 @@ Input Parameters
    Default (2). (Value chosen based on limited experimentation to date.
    More experimentation needed to arrive to clearer guidelines.)
 
-Initial phase
+Initial Phase
 `````````````
 
 1. First trial measures at maximum rate and discovers MRR.
@@ -192,7 +406,7 @@ Initial phase
    c. *do*: single trial.
    d. *out*: measured loss ratio.
 
-Non-initial phases
+Non-initial Phases
 ``````````````````
 
 1. Main loop:
@@ -318,15 +532,22 @@ but without detailing their mutual interaction.
    than pure binary search, the implemented tests fail themselves
    when the search takes too long (given by argument *timeout*).
 
-Maximum Receive Rate MRR
-------------------------
+(B)MRR Throughput
+-----------------
 
-MRR tests measure the packet forwarding rate under the maximum
-load offered by traffic generator over a set trial duration,
+Maximum Receive Rate (MRR) tests are complementary to MLRsearch tests,
+as they provide a maximum "raw" throughput benchmark for development and
+testing community. MRR tests measure the packet forwarding rate under
+the maximum load offered by traffic generator over a set trial duration,
 regardless of packet loss. Maximum load for specified Ethernet frame
 size is set to the bi-directional link rate.
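Editorial illustration (not part of the patch): as a worked example of "maximum load set to the bi-directional link rate", the sketch below converts a link speed and an L2 frame size into a packet rate. It assumes the standard 20B per-frame Ethernet transmission overhead (preamble plus inter-frame gap) and ignores any NIC or PCIe limits. The function name `max_bidir_pps` is hypothetical.

```python
def max_bidir_pps(link_bps, frame_bytes, overhead_bytes=20):
    """Bi-directional packet rate at line rate for one L2 frame size.

    frame_bytes includes the frame CRC; overhead_bytes covers the
    preamble and inter-frame gap that occupy the wire per frame.
    """
    bits_on_wire = (frame_bytes + overhead_bytes) * 8
    per_direction = link_bps / bits_on_wire
    return 2 * per_direction


rate = max_bidir_pps(10e9, 64)
print(f"{rate / 1e6:.2f} Mpps")  # 10GE, 64B frames, both directions
```

For a 10GE link and 64B frames this gives roughly 14.88 Mpps per direction, so the offered bi-directional load is about twice that.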
 
-Current parameters for MRR tests:
+In |csit-release| MRR test code has been updated with a configurable
+burst MRR parameters: trial duration and number of trials in a single
+burst. This enabled a new Burst MRR (BMRR) methodology for more precise
+performance trending.
+
+Current parameters for BMRR tests:
 
 - Ethernet frame sizes: 64B (78B for IPv6), IMIX, 1518B, 9000B; all
   quoted sizes include frame CRC, but exclude per frame transmission
@@ -346,17 +567,25 @@ Current parameters for MRR tests:
     XL710. Packet rate for other tested frame sizes is limited by PCIe
     Gen3 x8 bandwidth limitation of ~50Gbps.
 
-- Trial duration: 10sec.
+- Trial duration: 1 sec.
+
+- Number of trials per burst: 10.
 
 Similarly to NDR/PDR throughput tests, MRR test should be reporting bi-
 directional link rate (or NIC rate, if lower) if tested VPP
 configuration can handle the packet rate higher than bi-directional
 link rate, e.g. large packet tests and/or multi-core tests.
 
-MRR tests are used for continuous performance trending and for
-comparison between releases. Daily trending job tests subset of frame
-sizes, focusing on 64B (78B for IPv6) for all tests and IMIX for
-selected tests (vhost, memif).
+MRR tests are currently used for FD.io CSIT continuous performance
+trending and for comparison between releases. Daily trending job tests
+subset of frame sizes, focusing on 64B (78B for IPv6) for all tests and
+IMIX for selected tests (vhost, memif).
+
+MRR-like measurements are being used to establish starting conditions
+for experimental Probabilistic Loss Ratio Search (PLRsearch) used for
+soak testing, aimed at verifying continuous system performance over an
+extended period of time, hours, days, weeks, months. PLRsearch code is
+currently in experimental phase in FD.io CSIT project.
 
 Packet Latency
 --------------
@@ -510,7 +739,8 @@ following environment settings:
   [cfs,cfsrr1] settings.
   - Adjusted Linux kernel :abbr:`CFS (Completely Fair Scheduler)`
     scheduler policy for data plane threads used in CSIT is documented in
-    `CSIT Performance Environment Tuning wiki `_.
+    `CSIT Performance Environment Tuning wiki
+    `_.
   - The purpose is to verify performance impact (MRR and NDR/PDR
     throughput) and same test measurements repeatability, by making VPP
     and VM data plane threads less susceptible to other Linux OS system
@@ -562,6 +792,20 @@ VMs as described earlier in :ref:`tested_physical_topologies`.
 Further documentation is available in
 :ref:`container_orchestration_in_csit`.
 
+VPP_Device Functional
+---------------------
+
+|csit-release| added new VPP_Device test environment for functional VPP
+device tests integrated into LFN CI/CD infrastructure. VPP_Device tests
+run on 1-Node testbeds (1n-skx, 1n-arm) and rely on Linux SRIOV Virtual
+Function (VF), dot1q VLAN tagging and external loopback cables to
+facilitate packet passing over external physical links. Initial focus is
+on few baseline tests. Existing CSIT VIRL tests can be moved to
+VPP_Device framework by changing L1 and L2 KW(s). RF test definition
+code stays unchanged with the exception of requiring adjustments from
+3-Node to 2-Node logical topologies. CSIT VIRL to VPP_Device migration
+is expected in the next CSIT release.
+
 IPSec on Intel QAT
 ------------------
 
@@ -631,7 +875,7 @@ created (one for each direction) with TRex flow_stats parameter set to
 STLFlowLatencyStats. In that case, returned statistics will also include
 min/avg/max latency values.
 
-HTTP/TCP with WRK tool
+HTTP/TCP with WRK Tool
 ----------------------
 
 `WRK HTTP benchmarking tool `_ is used for