1 .. _vpp_performance_tests_release_notes:
2
3 Release Notes
4 =============
5
6 Changes in |csit-release|
7 -------------------------
8
9 #. VPP PERFORMANCE TESTS
10
11    - **Added new performance testbed 3n-snr** (3 Node SnowRidge, with Intel
12      Atom processors), to later replace 3n-dnv and 2n-dnv (3 and 2 Node
13      Denverton) testbeds.
14
15    - **Added GTPU HW offload tests** using VPP GTPU hardware offload
16      with Intel e810 4p25ge NICs (3n-icx testbeds only). These tests
17      were already present in CSIT-2206, but yielded invalid results
18      because the TRex v2.97 used there was incompatible with the e810
19      NICs used for those tests.
20
21    - **Added Wireguard tests** using VPP software crypto (3n-icx, 3n-snr
22      testbeds) and using built-in hardware crypto QAT device (3n-snr testbed
23      only).
24
25    - **Reduction of tests**: Removed certain test variations executed
26      iteratively for the report (as well as in daily and weekly
27      trending) due to overload of the physical testbeds.
28
29 #. TEST FRAMEWORK
30
31    - CSIT-2210 executes all VPP v22.10 performance tests using VPP ubuntu2204
32      images, due to the CSIT execution environment change noted below. This
33      applies to all performance testbeds except Denverton. Consequently, VPP
34      v22.06 has not been re-tested in the CSIT-2210 environment, as no
35      ubuntu2204 images are available for that VPP version. Performance
36      comparison between VPP v22.10 (current version) and VPP v22.06 (previous
37      version) may be impacted by both the VPP build environment change
38      (ubuntu2004 to ubuntu2204) and the CSIT environment change. See
39      :ref:`vpp_rca` for details.
40
41    - **CSIT test environment** version has been updated to ver. 11, see
42      :ref:`test_environment_versioning`.
43
44    - **TCP TPUT profiles** had to be changed, as newer TRex versions
45      are not deterministic enough when deciding when to send an ACK.
46
47    - **CSIT PAPI support**: Due to issues with PAPI performance, and
48      deprecation of VAT, VPP CLI is used in CSIT for many VPP scale
49      tests. See :ref:`vpp_known_issues`.
50
51    - **General Code Housekeeping**: Ongoing code optimizations and bug
52      fixes.
53
54 #. PRESENTATION AND ANALYTICS LAYER
55
56    - **C-Dash** `performance dashboard <http://csit.fd.io/>`_ received an
57      updated UI and an updated backend, improving its performance and robustness.
58
59 .. raw:: latex
60
61     \clearpage
62
63 .. _vpp_known_issues:
64
65 Known Issues
66 ------------
67
68 New
69 ___
70
71 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
72 |  # | JiraID                                  | Issue Description                                                                                         |
73 +====+=========================================+===========================================================================================================+
74 |  1 | `CSIT-1850                              | 2n-dnv: sporadic 1518B tput tests failing to establish required sessions.                                 |
75 |    | <https://jira.fd.io/browse/CSIT-1850>`_ |                                                                                                           |
76 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
77 |  2 | `CSIT-1864                              | 2n-clx: half of the packets lost on PDR tests.                                                            |
78 |    | <https://jira.fd.io/browse/CSIT-1864>`_ |                                                                                                           |
79 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
80 |  3 | `CSIT-1871                              | 3n-snr: 25Ge Interface goes down randomly.                                                                |
81 |    | <https://jira.fd.io/browse/CSIT-1871>`_ |                                                                                                           |
82 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
83 |  4 | `CSIT-1877                              | 3n-alt, 3n-tsh: VM tests failing to boot VM.                                                              |
84 |    | <https://jira.fd.io/browse/CSIT-1877>`_ |                                                                                                           |
85 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
86 |  5 | `CSIT-1883                              | 3n-snr: All hwasync wireguard tests failing when trying to verify device.                                 |
87 |    | <https://jira.fd.io/browse/CSIT-1883>`_ |                                                                                                           |
88 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
89 |  6 | `CSIT-1884                              | 2n-clx, 2n-icx: All NDR PDR IMIX over 1M sessions BIDIR tests failing to create enough sessions.          |
90 |    | <https://jira.fd.io/browse/CSIT-1884>`_ |                                                                                                           |
91 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
92 |  7 | `CSIT-1885                              | 3n-icx: 9000b ip4 NDRPDR AVF tests are failing to forward traffic.                                        |
93 |    | <https://jira.fd.io/browse/CSIT-1885>`_ |                                                                                                           |
94 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
95
96 Previous
97 ________
98
99 Issues reported in previous releases which still affect the current results.
100
101 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
102 |  # | JiraID                                  | Issue Description                                                                                         |
103 +====+=========================================+===========================================================================================================+
104 |  1 | `CSIT-1671                              | All CSIT scale tests can not use PAPI due to much slower performance compared to VAT/CLI (it takes much   |
105 |    | <https://jira.fd.io/browse/CSIT-1671>`_ | longer to program VPP). This needs to be addressed on the PAPI side.                                      |
106 |    +-----------------------------------------+ Currently, the time critical code uses VAT running large files with exec statements and CLI commands.     |
107 |    | `VPP-1763                               | Still, we needed to reduce the number of scale tests run to keep overall duration reasonable.             |
108 |    | <https://jira.fd.io/browse/VPP-1763>`_  | More improvements needed to achieve sufficient configuration speed.                                       |
109 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
110 |  2 | `CSIT-1782                              | Multicore AVF tests are failing when trying to create interface.                                          |
111 |    | <https://jira.fd.io/browse/CSIT-1782>`_ | Frequency is reduced by CSIT workaround, but occasional failures do still happen.                         |
112 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
113 |  3 | `CSIT-1785                              | NAT44ED tests failing to establish all TCP sessions.                                                      |
114 |    | <https://jira.fd.io/browse/CSIT-1785>`_ | At least for max scale, in allotted time (limited by session 500s timeout) due to worse                   |
115 |    +-----------------------------------------+ slow path performance than previously measured and calibrated for.                                        |
116 |    | `VPP-1972                               | CSIT removed the max scale NAT tests to avoid this issue.                                                 |
117 |    | <https://jira.fd.io/browse/VPP-1972>`_  |                                                                                                           |
118 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
119 |  4 | `CSIT-1799                              | All NAT44-ED 16M sessions CPS scale tests fail while setting NAT44 address range.                         |
120 |    | <https://jira.fd.io/browse/CSIT-1799>`_ |                                                                                                           |
121 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
122 |  5 | `CSIT-1800                              | All Geneve L3 mode scale tests (1024 tunnels) are failing.                                                |
123 |    | <https://jira.fd.io/browse/CSIT-1800>`_ |                                                                                                           |
124 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
125 |  6 | `CSIT-1801                              | 9000B payload frames not forwarded over tunnels due to violating supported Max Frame Size (VxLAN, LISP,   |
126 |    | <https://jira.fd.io/browse/CSIT-1801>`_ | SRv6).                                                                                                    |
127 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
128 |  7 | `CSIT-1802                              | AF-XDP - NDR tests failing from time to time.                                                             |
129 |    | <https://jira.fd.io/browse/CSIT-1802>`_ |                                                                                                           |
130 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
131 |  8 | `CSIT-1804                              | All testbeds: NDR tests failing from time to time.                                                        |
132 |    | <https://jira.fd.io/browse/CSIT-1804>`_ |                                                                                                           |
133 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
134 |  9 | `CSIT-1808                              | All tests with 9000B payload frames not forwarded over memif interfaces.                                  |
135 |    | <https://jira.fd.io/browse/CSIT-1808>`_ |                                                                                                           |
136 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
137 | 10 | `CSIT-1827                              | 3n-icx, 3n-skx: all AVF crypto tests sporadically fail. 1518B with no traffic, IMIX with excessive        |
138 |    | <https://jira.fd.io/browse/CSIT-1827>`_ | packet loss.                                                                                              |
139 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
140 | 11 | `CSIT-1835                              | 3n-icx: vppecho BPS tests failing on timeout when checking hoststack finished.                            |
141 |    | <https://jira.fd.io/browse/CSIT-1835>`_ |                                                                                                           |
142 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
143 | 12 | `CSIT-1849                              | 2n-skx, 2n-clx, 2n-icx: UDP 16m TPUT tests fail to create all sessions.                                   |
144 |    | <https://jira.fd.io/browse/CSIT-1849>`_ |                                                                                                           |
145 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
146
147 Fixed
148 _____
149
150 Issues reported in previous releases which were fixed in this release:
151
152 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
153 |  # | JiraID                                  | Issue Description                                                                                         |
154 +====+=========================================+===========================================================================================================+
155 |  1 | `CSIT-1834                              | 2n-icx, 2n-skx: sporadic AVF soak tests failing to find critical load with PLRsearch.                     |
156 |    | <https://jira.fd.io/browse/CSIT-1834>`_ |                                                                                                           |
157 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
158 |  2 | `CSIT-1846                              | 2n-skx, 2n-clx, 2n-icx: ALL 1518B TCP tput tests failing with big packet loss.                            |
159 |    | <https://jira.fd.io/browse/CSIT-1846>`_ |                                                                                                           |
160 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
161 |  3 | `CSIT-1851                              | trending regression: various icelake tests around 2022-04-15                                              |
162 |    | <https://jira.fd.io/browse/CSIT-1851>`_ | Somewhat expected consequence of a VPP usability fix,                                                     |
163 |    |                                         | the previous VPP compiler version was too new for the OS used.                                            |
164 +----+-----------------------------------------+-----------------------------------------------------------------------------------------------------------+
165
166 .. _vpp_rca:
167
168 Root Cause Analysis for Performance Changes
169 -------------------------------------------
170
171 List of RCAs in |csit-release| for VPP performance changes:
172
173 +----+-----------------------------------------+-------------------------------------------------------------------------------------+
174 |  # | JiraID                                  | Issue Description                                                                   |
175 +====+=========================================+=====================================================================================+
176 |  1 | `VPP-2030                               | regression: ip6base on ICX around 2022-03-23                                        |
177 |    | <https://jira.fd.io/browse/VPP-2030>`_  | "Loads blocked due to overlapping with a preceding store that cannot be forwarded." |
178 |    |                                         | started happening in ip6-lookup graph node.                                         |
179 +----+-----------------------------------------+-------------------------------------------------------------------------------------+
180 |  2 | `CSIT-1852                              | 2n-zn2 mellanox performance cap                                                     |
181 |    | <https://jira.fd.io/browse/CSIT-1852>`_ | Old issue, only now distinguished from CSIT-1751.                                   |
182 |    |                                         | This testbed+nic combination is capped below 28 Mpps, cause not identified yet.     |
183 +----+-----------------------------------------+-------------------------------------------------------------------------------------+
184 |  3 | `CSIT-1853                              | trending regression: nat44ed cps around 2022-04-01                                  |
185 |    | <https://jira.fd.io/browse/CSIT-1853>`_ | VPP change added more computation to slow path (in order to support multiple VRFs). |
186 |    |                                         | Not clear if the VPP implementation is optimized enough.                            |
187 +----+-----------------------------------------+-------------------------------------------------------------------------------------+