diff --git a/resources/tools/presentation/doc/pal_lld.rst b/resources/tools/presentation/doc/pal_lld.rst
index 9158b889b9..38ab6d9bcb 100644
--- a/resources/tools/presentation/doc/pal_lld.rst
+++ b/resources/tools/presentation/doc/pal_lld.rst
@@ -1,5 +1,5 @@
-Presentation and Analytics Layer
-================================
+Presentation and Analytics
+==========================
 
 Overview
 --------
@@ -42,9 +42,10 @@ sub-layers, bottom up:
 .. raw:: latex
 
     \begin{figure}[H]
-        \centering
-        \includesvg[width=0.90\textwidth]{../_tmp/src/csit_framework_documentation/pal_layers}
-        \label{fig:pal_layers}
+        \centering
+        \graphicspath{{../_tmp/src/csit_framework_documentation/}}
+        \includegraphics[width=0.90\textwidth]{pal_layers}
+        \label{fig:pal_layers}
     \end{figure}
 
 .. only:: html
@@ -82,6 +83,8 @@ the type:
 
     -
      type: "environment"
+    -
+      type: "configuration"
     -
       type: "debug"
     -
@@ -123,6 +126,7 @@ This section has the following parts:
   - build-dirs - a list of the directories where the results are stored.
 
 The structure of the section "Environment" is as follows (example):
+
 ::
 
     -
@@ -165,10 +169,7 @@ The structure of the section "Environment" is as follows (example):
       DIR[DTR]: "{DIR[WORKING,SRC]}/detailed_test_results"
       DIR[DTR,PERF,DPDK]: "{DIR[DTR]}/dpdk_performance_results"
       DIR[DTR,PERF,VPP]: "{DIR[DTR]}/vpp_performance_results"
-      DIR[DTR,PERF,HC]: "{DIR[DTR]}/honeycomb_performance_results"
       DIR[DTR,FUNC,VPP]: "{DIR[DTR]}/vpp_functional_results"
-      DIR[DTR,FUNC,HC]: "{DIR[DTR]}/honeycomb_functional_results"
-      DIR[DTR,FUNC,NSHSFC]: "{DIR[DTR]}/nshsfc_functional_results"
       DIR[DTR,PERF,VPP,IMPRV]: "{DIR[WORKING,SRC]}/vpp_performance_tests/performance_improvements"
 
       # Detailed test configurations
@@ -186,7 +187,10 @@ The structure of the section "Environment" is as follows (example):
 
       urls:
         URL[JENKINS,CSIT]: "https://jenkins.fd.io/view/csit/job"
-        URL[JENKINS,HC]: "https://jenkins.fd.io/view/hc2vpp/job"
+        URL[S3_STORAGE,LOG]: "https://logs.nginx.service.consul/vex-yul-rot-jenkins-1"
+        URL[NEXUS,LOG]: "https://logs.fd.io/production/vex-yul-rot-jenkins-1"
+        URL[NEXUS,DOC]: "https://docs.fd.io/csit"
+        DIR[NEXUS,DOC]: "report/_static/archive"
 
       make-dirs:
       # List the directories which are created while preparing the environment.
@@ -223,6 +227,108 @@ will be automatically changed to
 
     DIR[WORKING,DATA]: "_tmp/data"
 
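+This substitution can be illustrated by a short sketch (a hypothetical
+helper, not the actual PAL code; the dictionary keys mirror the
+specification tags):
+
+::
+
+    import re
+
+    def expand_tags(paths):
+        """Keep replacing {NAME} references by their values until no
+        known reference is left (simplified sketch)."""
+        pattern = re.compile(r"{([^{}]+)}")
+        changed = True
+        while changed:
+            changed = False
+            for key, value in paths.items():
+                match = pattern.search(value)
+                if match and match.group(1) in paths:
+                    paths[key] = value.replace(
+                        "{" + match.group(1) + "}", paths[match.group(1)])
+                    changed = True
+        return paths
+
+    paths = {
+        "DIR[WORKING]": "_tmp",
+        "DIR[WORKING,DATA]": "{DIR[WORKING]}/data",
+    }
+    print(expand_tags(paths)["DIR[WORKING,DATA]"])  # _tmp/data
+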
+Section: Configuration
+''''''''''''''''''''''
+
+This section specifies the groups of parameters which are repeatedly used
+in the elements defined later in the specification file. It has the
+following parts:
+
+  - data sets - Specification of the data sets used later in the elements'
+    specifications to define the input data.
+  - plot layouts - Specification of the plot layouts used later in the
+    plots' specifications to define the plot layout.
+
+The structure of the section "Configuration" is as follows (example):
+
+::
+
+    -
+      type: "configuration"
+      data-sets:
+        plot-vpp-throughput-latency:
+          csit-vpp-perf-1710-all:
+          - 11
+          - 12
+          - 13
+          - 14
+          - 15
+          - 16
+          - 17
+          - 18
+          - 19
+          - 20
+        vpp-perf-results:
+          csit-vpp-perf-1710-all:
+          - 20
+          - 23
+      plot-layouts:
+        plot-throughput:
+          xaxis:
+            autorange: True
+            autotick: False
+            fixedrange: False
+            gridcolor: "rgb(238, 238, 238)"
+            linecolor: "rgb(238, 238, 238)"
+            linewidth: 1
+            showgrid: True
+            showline: True
+            showticklabels: True
+            tickcolor: "rgb(238, 238, 238)"
+            tickmode: "linear"
+            title: "Indexed Test Cases"
+            zeroline: False
+          yaxis:
+            gridcolor: "rgb(238, 238, 238)"
+            hoverformat: ".4s"
+            linecolor: "rgb(238, 238, 238)"
+            linewidth: 1
+            range: []
+            showgrid: True
+            showline: True
+            showticklabels: True
+            tickcolor: "rgb(238, 238, 238)"
+            title: "Packets Per Second [pps]"
+            zeroline: False
+          boxmode: "group"
+          boxgroupgap: 0.5
+          autosize: False
+          margin:
+            t: 50
+            b: 20
+            l: 50
+            r: 20
+          showlegend: True
+          legend:
+            orientation: "h"
+          width: 700
+          height: 1000
+
+The definitions from this section are used in the elements, e.g.:
+
+::
+
+    -
+      type: "plot"
+      title: "VPP Performance 64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
+      algorithm: "plot_performance_box"
+      output-file-type: ".html"
+      output-file: "{DIR[STATIC,VPP]}/64B-1t1c-l2-sel1-ndrdisc"
+      data:
+        "plot-vpp-throughput-latency"
+      filter: "'64B' and ('BASE' or 'SCALE') and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
+      parameters:
+      - "throughput"
+      - "parent"
+      traces:
+        hoverinfo: "x+y"
+        boxpoints: "outliers"
+        whiskerwidth: 0
+      layout:
+        title: "64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
+        layout:
+          "plot-throughput"
+
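+A minimal sketch of how such references might be resolved when an element
+is processed (function and variable names are illustrative, not the
+actual PAL internals):
+
+::
+
+    def resolve_references(element, configuration):
+        """Replace data-set and plot-layout names in an element's
+        specification by their definitions from the "configuration"
+        section (simplified sketch)."""
+        if isinstance(element.get("data"), str):
+            element["data"] = configuration["data-sets"][element["data"]]
+        layout = element.get("layout", {})
+        if isinstance(layout.get("layout"), str):
+            name = layout.pop("layout")
+            layout.update(configuration["plot-layouts"][name])
+        return element
+
+    configuration = {
+        "data-sets": {
+            "plot-vpp-throughput-latency": {
+                "csit-vpp-perf-1710-all": [11, 12, 13]}},
+        "plot-layouts": {
+            "plot-throughput": {"showlegend": True, "width": 700}},
+    }
+    element = {
+        "type": "plot",
+        "data": "plot-vpp-throughput-latency",
+        "layout": {"title": "64B-1t1c", "layout": "plot-throughput"},
+    }
+    print(resolve_references(element, configuration))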
 
 Section: Debug mode
 '''''''''''''''''''
@@ -261,10 +367,6 @@ The structure of the section "Debug" is as follows (example):
           - build: 9
             file: "csit-dpdk-perf-1707-all__9.xml"
-          csit-nsh_sfc-verify-func-1707-ubuntu1604-virl:
-          -
-            build: 2
-            file: "csit-nsh_sfc-verify-func-1707-ubuntu1604-virl-2.xml"
           csit-vpp-functional-1707-ubuntu1604-virl:
           -
             build: lastSuccessfulBuild
@@ -296,6 +398,7 @@ This section has these parts:
    processed.
 
 ::
+
     -
       type: "static"
       src-path: "{DIR[RST]}"
@@ -366,9 +469,6 @@ The structure of the section "Input" is as follows (example from 17.07 report):
           - 9
           hc2vpp-csit-integration-1707-ubuntu1604:
           - lastSuccessfulBuild
-          csit-nsh_sfc-verify-func-1707-ubuntu1604-virl:
-          - 2
-
 
 Section: Output
 '''''''''''''''
@@ -735,6 +835,35 @@ latency in a box chart):
       width: 700
       height: 1000
 
+Another example of the section "Plot" (a plot showing VPP HTTP server
+performance in a box chart, using the pre-defined data set
+"plot-vpp-http-server-performance" and the plot layout "plot-cps"):
+
+::
+
+    -
+      type: "plot"
+      title: "VPP HTTP Server Performance"
+      algorithm: "plot_http_server_perf_box"
+      output-file-type: ".html"
+      output-file: "{DIR[STATIC,VPP]}/http-server-performance-cps"
+      data:
+        "plot-vpp-http-server-performance"
+      # Keep this formatting, the filter is enclosed with " (quotation mark)
+      # and each tag is enclosed with ' (apostrophe).
+      filter: "'HTTP' and 'TCP_CPS'"
+      parameters:
+      - "result"
+      - "name"
+      traces:
+        hoverinfo: "x+y"
+        boxpoints: "outliers"
+        whiskerwidth: 0
+      layout:
+        title: "VPP HTTP Server Performance"
+        layout:
+          "plot-cps"
+
 Section: file
 '''''''''''''
@@ -1004,8 +1133,9 @@ For example, the element whose specification includes:
 
       filter:
         - "'64B' and 'BASE' and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
 
-will be constructed using data from the job "csit-vpp-perf-1707-all", for all listed
-builds and the tests with the list of tags matching the filter conditions.
+will be constructed using data from the job "csit-vpp-perf-1707-all", for all
+listed builds and the tests with the list of tags matching the filter
+conditions.
 
 The output data structure for filtered test data is:
@@ -1034,19 +1164,144 @@ Data analytics part implements:
 
   - methods to compute statistical data from the filtered input data.
   - trending.
-  - etc.
 
+Throughput Speedup Analysis - Multi-Core with Multi-Threading
+'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
+
+Throughput Speedup Analysis (TSA) calculates throughput speedup ratios
+for the tested 1-, 2- and 4-core multi-threaded VPP configurations using
+the following formula:
+
+::
+
+                                N_core_throughput
+    N_core_throughput_speedup = -----------------
+                                1_core_throughput
+
+Multi-core throughput speedup ratios are plotted in grouped bar graphs
+for throughput tests with 64B/78B frame size, with the number of cores
+on the X-axis and the speedup ratio on the Y-axis.
+
+For better comparison, multiple test result data sets are plotted in
+each graph:
+
+  - graph type: grouped bars;
+  - graph X-axis: (testcase index, number of cores);
+  - graph Y-axis: speedup factor.
+
+A subset of the existing performance tests is covered by TSA graphs.
+
+**Model for TSA:**
+
+::
+
+    -
+      type: "plot"
+      title: "TSA: 64B-*-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
+      algorithm: "plot_throughput_speedup_analysis"
+      output-file-type: ".html"
+      output-file: "{DIR[STATIC,VPP]}/10ge2p1x520-64B-l2-tsa-ndrdisc"
+      data:
+        "plot-throughput-speedup-analysis"
+      filter: "'NIC_Intel-X520-DA2' and '64B' and 'BASE' and 'NDRDISC' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
+      parameters:
+      - "throughput"
+      - "parent"
+      - "tags"
+      layout:
+        title: "64B-*-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
+        layout:
+          "plot-throughput-speedup-analysis"
+
+
+Comparison of results from two sets of the same test executions
+'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
+
+This algorithm enables comparison of results coming from two sets of the
+same test executions. It is used to quantify performance changes across
+all tests after test environment changes, e.g. operating system
+upgrades/patches or hardware changes.
+
+It is assumed that each set of test executions includes multiple runs
+of the same tests, 10 or more, to verify test result repeatability and
+to yield statistically meaningful results.
+
+Comparison results are presented in a table with a specified number of
+the best and the worst relative changes between the two sets. The
+following table columns are defined:
+
+  - name of the test;
+  - throughput mean value of the reference set;
+  - throughput standard deviation of the reference set;
+  - throughput mean value of the set to compare;
+  - throughput standard deviation of the set to compare;
+  - relative change of the mean values.
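+
+The computation of one table row can be sketched as follows (a standalone
+illustration with made-up numbers; the real implementation may differ):
+
+::
+
+    from statistics import mean, stdev
+
+    def compare(name, reference, to_compare):
+        """Return one comparison-table row for a test: means, standard
+        deviations and the relative change of the means [%]."""
+        ref_mean, cmp_mean = mean(reference), mean(to_compare)
+        change = (cmp_mean - ref_mean) / ref_mean * 100.0
+        return (name, ref_mean, stdev(reference),
+                cmp_mean, stdev(to_compare), change)
+
+    # Ten runs per set, as recommended above.
+    ref = [10.1, 10.3, 9.9, 10.2, 10.0, 10.1, 10.2, 9.8, 10.0, 10.1]
+    cmp = [9.1, 9.3, 9.0, 9.2, 9.1, 9.0, 9.2, 8.9, 9.1, 9.0]
+    row = compare("64b-1t1c-eth-l2xcbase-ndrdisc", ref, cmp)
+    print("{0}: {1:+.1f} % change of mean".format(row[0], row[5]))
+
+The rows would then be sorted by the relative change and only the
+nr-of-tests-shown best and worst rows presented.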
+
+**The model**
+
+The model specifies:
+
+  - type: "table" - means this section defines a table.
+  - title: Title of the table.
+  - algorithm: Algorithm which is used to generate the table. The other
+    parameters in this section must provide all information needed by the
+    used algorithm.
+  - output-file-ext: Extension of the output file.
+  - output-file: File which the table will be written to.
+  - reference - the builds which are used as the reference for comparison.
+  - compare - the builds which are compared to the reference.
+  - data: Specify the sources, jobs and builds, providing data for generating
+    the table.
+  - filter: Filter based on tags applied on the input data; if "template" is
+    used, filtering is based on the template.
+  - parameters: Only these parameters will be put to the output data
+    structure.
+  - nr-of-tests-shown: Number of the best and the worst tests presented in
+    the table. Use 0 (zero) to present all tests.
+
+*Example:*
+
+::
+
+    -
+      type: "table"
+      title: "Performance comparison"
+      algorithm: "table_perf_comparison"
+      output-file-ext: ".csv"
+      output-file: "{DIR[DTR,PERF,VPP,IMPRV]}/vpp_performance_comparison"
+      reference:
+        title: "csit-vpp-perf-1801-all - 1"
+        data:
+          csit-vpp-perf-1801-all:
+          - 1
+          - 2
+      compare:
+        title: "csit-vpp-perf-1801-all - 2"
+        data:
+          csit-vpp-perf-1801-all:
+          - 1
+          - 2
+      data:
+        "vpp-perf-comparison"
+      filter: "all"
+      parameters:
+      - "name"
+      - "parent"
+      - "throughput"
+      nr-of-tests-shown: 20
 
 Advanced data analytics
 ```````````````````````
 
-As the next steps, advanced data analytics (ADA) will be implemented using
-machine learning (ML) and artificial intelligence (AI).
+In the future, advanced data analytics (ADA) will be added to analyse
+telemetry data collected from SUT telemetry sources and to correlate it
+with performance test results.
 
-TODO:
+:TODO:
 
-  - describe the concept of ADA.
-  - add specification.
+  - describe the concept of ADA.
+  - add specification.
 
 
 Data presentation
@@ -1063,7 +1318,8 @@ Tables
 
   - tables are generated by algorithms implemented in PAL, the model includes the
     algorithm and all necessary information.
   - output format: csv
-  - generated tables are stored in specified directories and linked to .rst files.
+  - generated tables are stored in specified directories and linked to .rst
+    files.
 
 
 Plots
@@ -1079,8 +1335,8 @@ Report generation
 -----------------
 
 Report is generated using Sphinx and Read_the_Docs template. PAL generates html
-and pdf formats. It is possible to define the content of the report by specifying
-the version (TODO: define the names and content of versions).
+and pdf formats. It is possible to define the content of the report by
+specifying the version (TODO: define the names and content of versions).
 
 
 The process
@@ -1098,12 +1354,216 @@ The process
 
 5. Generate the report.
 6. Store the report (Nexus).
 
-The process is model driven. The elements’ models (tables, plots, files and
-report itself) are defined in the specification file. Script reads the elements’
-models from specification file and generates the elements.
+The process is model driven. The elements' models (tables, plots, files
+and the report itself) are defined in the specification file. The script
+reads the elements' models from the specification file and generates the
+elements.
+
+It is easy to add elements to be generated in the report. If a new type
+of element is required, only a new algorithm needs to be implemented and
+integrated.
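+
+The dispatch from a model to its algorithm can be sketched like this (an
+illustration only; the algorithm names are taken from the examples
+above):
+
+::
+
+    def table_perf_comparison(spec, input_data):
+        """Stand-in for the real table generator of the same name."""
+        print("generating table:", spec["title"])
+
+    def plot_performance_box(spec, input_data):
+        """Stand-in for the real plot generator of the same name."""
+        print("generating plot:", spec["title"])
+
+    # The "algorithm" field of an element's model selects the function
+    # implementing it; the two names are the same by convention.
+    GENERATORS = {func.__name__: func
+                  for func in (table_perf_comparison,
+                               plot_performance_box)}
+
+    def generate(elements, input_data):
+        for element in elements:
+            GENERATORS[element["algorithm"]](element, input_data)
+
+    generate([{"type": "table",
+               "algorithm": "table_perf_comparison",
+               "title": "Performance comparison"}],
+             input_data=None)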
+
+
+Continuous Performance Measurements and Trending
+------------------------------------------------
+
+Performance analysis and trending execution sequence:
+`````````````````````````````````````````````````````
+
+CSIT PA (performance analysis) runs change detection and trending using
+the specified trend analysis metrics over a rolling window of the last
+sets of historical measurement data. PA is defined as follows:
+
+  #. PA job triggers:
+
+     #. By the PT (performance trending) job at its completion.
+     #. Manually from the Jenkins UI.
+
+  #. Download and parse archived historical data and the new data:
+
+     #. New data from the latest PT job is evaluated against the rolling
+        window of sets of historical data.
+     #. Download RF output.xml files and compressed archived data.
+     #. Parse out the data, filtering the test cases listed in the PA
+        specification (part of the CSIT PAL specification file).
+
+  #. Calculate trend metrics for the rolling window of sets of historical
+     data:
+
+     #. Calculate quartiles Q1, Q2, Q3.
+     #. Trim outliers using IQR.
+     #. Calculate TMA (trimmed moving average) and TMSD (trimmed moving
+        standard deviation).
+     #. Calculate the normal trending range per test case based on TMA
+        and TMSD.
+
+  #. Evaluate new test data against the trend metrics:
+
+     #. If within the range of (TMA +/- 3*TMSD) => Result = Pass,
+        Reason = Normal.
+     #. If below the range => Result = Fail, Reason = Regression.
+     #. If above the range => Result = Pass, Reason = Progression.
+
+  #. Generate and publish results:
+
+     #. Relay the evaluation result to the job result.
+     #. Generate a new set of trend analysis summary graphs and drill-down
+        graphs.
+
+        #. Summary graphs to include measured values with Normal,
+           Progression and Regression markers. MM (moving median) shown
+           in the background if possible.
+        #. Drill-down graphs to include MM, TMA and TMSD.
+
+     #. Publish trend analysis graphs in html format on
+        https://docs.fd.io/csit/master/trending/.
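+
+A condensed sketch of steps 3 and 4 above (quartiles, IQR trimming, TMA,
+TMSD and the classification); a simplified illustration, not the
+production code:
+
+::
+
+    from statistics import mean, median, stdev
+
+    def quartiles(samples):
+        """Return Q1, Q2, Q3 of a sample list (simple median method)."""
+        data = sorted(samples)
+        mid = len(data) // 2
+        return (median(data[:mid]), median(data),
+                median(data[mid + (len(data) % 2):]))
+
+    def classify(history, new_value, outlier_const=1.5):
+        """Evaluate a new result against the trend metrics of the
+        rolling window of historical results."""
+        q1, _, q3 = quartiles(history)
+        iqr = q3 - q1
+        # Trim outliers using IQR.
+        trimmed = [value for value in history
+                   if q1 - outlier_const * iqr <= value
+                   <= q3 + outlier_const * iqr]
+        tma, tmsd = mean(trimmed), stdev(trimmed)
+        if new_value < tma - 3 * tmsd:
+            return "Fail", "Regression"
+        if new_value > tma + 3 * tmsd:
+            return "Pass", "Progression"
+        return "Pass", "Normal"
+
+    history = [10.0, 10.1, 9.9, 10.2, 10.0, 10.1, 7.5, 10.0, 10.1, 9.9]
+    print(classify(history, 8.0))  # ('Fail', 'Regression')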
+
+
+Parameters to specify:
+``````````````````````
+
+*General section - parameters common to all plots:*
+
+  - type: "cpta";
+  - title: the title of this section;
+  - output-file-type: only ".html" is supported;
+  - output-file: path where the generated files will be stored.
+
+*Plots section:*
+
+  - plot title;
+  - output file name;
+  - input data for plots:
+
+    - job to be monitored - the Jenkins job whose results are used as
+      input data for this test;
+    - builds used for trending plot(s) - specified by a list of build
+      numbers or by a range of builds defined by the first and the last
+      build number;
+
+  - tests to be displayed in the plot, defined by a filter;
+  - list of parameters to extract from the data;
+  - plot layout.
+
+*Example:*
+
+::
+
+    -
+      type: "cpta"
+      title: "Continuous Performance Trending and Analysis"
+      output-file-type: ".html"
+      output-file: "{DIR[STATIC,VPP]}/cpta"
+      plots:
+
+      - title: "VPP 1T1C L2 64B Packet Throughput - Trending"
+        output-file-name: "l2-1t1c-x520"
+        data: "plot-performance-trending-vpp"
+        filter: "'NIC_Intel-X520-DA2' and 'MRR' and '64B' and ('BASE' or 'SCALE') and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST' and not 'MEMIF'"
+        parameters:
+        - "result"
+        layout: "plot-cpta-vpp"
+
+      - title: "DPDK 4T4C IMIX MRR Trending"
+        output-file-name: "dpdk-imix-4t4c-xl710"
+        data: "plot-performance-trending-dpdk"
+        filter: "'NIC_Intel-XL710' and 'IMIX' and 'MRR' and '4T4C' and 'DPDK'"
+        parameters:
+        - "result"
+        layout: "plot-cpta-dpdk"
+
+The Dashboard
+`````````````
+
+Performance dashboard tables provide the latest VPP throughput trend,
+trend compliance and detected anomalies, all on a per VPP test case
+basis. The Dashboard is generated as three tables for the 1t1c, 2t2c and
+4t4c MRR tests.
+
+First, the .csv tables are generated (only the table for 1t1c is shown):
+
+::
+
+    -
+      type: "table"
+      title: "Performance trending dashboard"
+      algorithm: "table_perf_trending_dash"
+      output-file-ext: ".csv"
+      output-file: "{DIR[STATIC,VPP]}/performance-trending-dashboard-1t1c"
+      data: "plot-performance-trending-all"
+      filter: "'MRR' and '1T1C'"
+      parameters:
+      - "name"
+      - "parent"
+      - "result"
+      ignore-list:
+      - "tests.vpp.perf.l2.10ge2p1x520-eth-l2bdscale1mmaclrn-mrr.tc01-64b-1t1c-eth-l2bdscale1mmaclrn-ndrdisc"
+      outlier-const: 1.5
+      window: 14
+      evaluated-window: 14
+      long-trend-window: 180
+
+Then, html tables stored inside .rst files are generated:
+
+::
+
+    -
+      type: "table"
+      title: "HTML performance trending dashboard 1t1c"
+      algorithm: "table_perf_trending_dash_html"
+      input-file: "{DIR[STATIC,VPP]}/performance-trending-dashboard-1t1c.csv"
+      output-file: "{DIR[STATIC,VPP]}/performance-trending-dashboard-1t1c.rst"
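+
+The second step can be sketched as follows (a self-contained
+illustration; the real implementation may differ):
+
+::
+
+    import csv
+
+    def csv_to_rst(csv_file, rst_file):
+        """Render a generated dashboard .csv table as an html table
+        embedded in an .rst file via the "raw:: html" directive."""
+        with open(csv_file, "rt") as src:
+            rows = list(csv.reader(src))
+        with open(rst_file, "wt") as out:
+            out.write(".. raw:: html\n\n")
+            out.write("    <table>\n")
+            for row in rows:
+                cells = "".join("<td>{0}</td>".format(i) for i in row)
+                out.write("    <tr>{0}</tr>\n".format(cells))
+            out.write("    </table>\n")
+
+    csv_to_rst("performance-trending-dashboard-1t1c.csv",
+               "performance-trending-dashboard-1t1c.rst")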
+
+
+Root Cause Analysis
+-------------------
+
+Root Cause Analysis (RCA) works by analysing archived performance
+results - it re-analyses the available data for a specified:
+
+  - range of job builds,
+  - set of specific tests, and
+  - PASS/FAIL criteria to detect a performance change.
+
+In addition, PAL generates trending plots to show performance over the
+specified time interval.
+
+Root Cause Analysis - Option 1: Analysing Archived VPP Results
+``````````````````````````````````````````````````````````````
+
+This option can be used to speed up the process, or when the existing
+data is sufficient. In this case, PAL uses the existing data saved in
+Nexus, searches for performance degradations and generates plots to show
+performance over the specified time interval for the selected tests.
+
+Execution Sequence
+''''''''''''''''''
+
+  #. Download and parse archived historical data and the new data.
+  #. Calculate trend metrics.
+  #. Find regression / progression.
+  #. Generate and publish results:
+
+     #. Summary graphs to include measured values with Progression and
+        Regression markers.
+     #. List the DUT build(s) where the anomalies were detected.
+
+CSIT PAL Specification
+''''''''''''''''''''''
+
+  - What to test:
+
+    - first build (Good); specified by the Jenkins job name and the build
+      number,
+    - last build (Bad); specified by the Jenkins job name and the build
+      number,
+    - step (1..n).
+
+  - Data:
+
+    - tests of interest; a list of tests (the full name is used) whose
+      results are used.
+
+*Example:*
+
+::
+
-It is easy to add elements to be generated, if a new kind of element is
-required, only a new algorithm is implemented and integrated.
+    TODO
 
 
 API
@@ -1228,9 +1688,10 @@ PAL functional diagram
 .. raw:: latex
 
     \begin{figure}[H]
-        \centering
-        \includesvg[width=0.90\textwidth]{../_tmp/src/csit_framework_documentation/pal_func_diagram}
-        \label{fig:pal_func_diagram}
+        \centering
+        \graphicspath{{../_tmp/src/csit_framework_documentation/}}
+        \includegraphics[width=0.90\textwidth]{pal_func_diagram}
+        \label{fig:pal_func_diagram}
     \end{figure}
 
 .. only:: html
 
@@ -1243,12 +1704,12 @@ How to add an element
 `````````````````````
 
-Element can be added by adding its model to the specification file. If the
-element will be generated by an existing algorithm, only its parameters must be
-set.
+An element can be added by adding its model to the specification file.
+If the element is to be generated by an existing algorithm, only its
+parameters must be set.
 
-If a brand new type of element will be added, also the algorithm must be
-implemented.
-The algorithms are implemented in the files which names start with "generator".
-The name of the function implementing the algorithm and the name of algorithm in
-the specification file had to be the same.
+If a brand new type of element needs to be added, the algorithm must be
+implemented as well. Element generation algorithms are implemented in
+files whose names start with the "generator" prefix. The name of the
+function implementing the algorithm and the name of the algorithm in the
+specification file have to be the same.
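+
+For illustration, a brand new element type could be wired in as follows
+(all names below are hypothetical):
+
+::
+
+    import csv
+
+    # A hypothetical new algorithm. It would live in a file whose name
+    # starts with "generator", and the function name must match the
+    # "algorithm" field of the element's model shown in the comment
+    # below.
+    def table_tests_count(table, input_data):
+        """Write a one-line csv with the number of tests in the input."""
+        output = table["output-file"] + table["output-file-ext"]
+        with open(output, "wt") as out:
+            csv.writer(out).writerow(["nr-of-tests", len(input_data)])
+
+    # The matching (hypothetical) model in the specification file:
+    #
+    #    -
+    #      type: "table"
+    #      title: "Number of tests"
+    #      algorithm: "table_tests_count"
+    #      output-file-ext: ".csv"
+    #      output-file: "{DIR[WORKING]}/nr_of_tests"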