Presentation and Analytics Layer
================================
The presentation and analytics layer (PAL) is the fourth layer of the CSIT
hierarchy. The model of the presentation and analytics layer consists of four
11 - sL1 - Data - input data to be processed:
13 - Static content - .rst text files, .svg static figures, and other files
14 stored in the CSIT git repository.
15 - Data to process - .xml files generated by Jenkins jobs executing tests,
16 stored as robot results files (output.xml).
- Specification - .yaml file with the models of report elements (tables,
  plots, layout, ...) generated by this tool. It also contains the configuration
  of the tool and the specification of input data (jobs and builds).
21 - sL2 - Data processing
- The data are read from the specified input files (.xml) and stored as
  multi-indexed `pandas.Series <https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.html>`_.
- This layer also provides an interface to the input data and filtering of the input
29 - sL3 - Data presentation - This layer generates the elements specified in the
32 - Tables: .csv files linked to static .rst files.
33 - Plots: .html files generated using plot.ly linked to static .rst files.
35 - sL4 - Report generation - Sphinx generates required formats and versions:
38 - versions: minimal, full (TODO: define the names and scope of versions)
46 \includesvg[width=0.90\textwidth]{../_tmp/src/csit_framework_documentation/pal_layers}
47 \label{fig:pal_layers}
52 .. figure:: pal_layers.svg
62 The report specification file defines which data is used and which outputs are
63 generated. It is human readable and structured. It is easy to add / remove /
64 change items. The specification includes:
66 - Specification of the environment.
67 - Configuration of debug mode (optional).
68 - Specification of input data (jobs, builds, files, ...).
69 - Specification of the output.
70 - What and how is generated:
71 - What: plots, tables.
72 - How: specification of all properties and parameters.
75 Structure of the specification file
76 '''''''''''''''''''''''''''''''''''
78 The specification file is organized as a list of dictionaries distinguished by
Each type represents a section. The sections "environment", "debug", "static",
"input" and "output" can be listed only once in the specification; "table", "file"
and "plot" can appear multiple times.

Sections "debug", "table", "file" and "plot" are optional.

Table(s), file(s) and plot(s) are referred to as "elements" in this text. It is
possible to define and implement other elements if needed.
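
The rules above can be checked mechanically once the specification is loaded.
The following is a minimal sketch, not PAL's own code: it assumes the
specification is already parsed into a list of dictionaries, each with a
"type" key, and the name ``check_sections`` is hypothetical. It also applies
the rule that the "input" section is needed only when the debug mode is not
configured.

```python
from collections import Counter

# Section types listed at most once, and element types that may repeat.
SINGLE = {"environment", "debug", "static", "input", "output"}
ELEMENTS = {"table", "file", "plot"}


def check_sections(specification):
    """Return a list of problems found in the list of section dictionaries."""
    counts = Counter(section.get("type") for section in specification)
    problems = []
    for name in counts:
        if name not in SINGLE | ELEMENTS:
            problems.append(f"unknown section type: {name!r}")
    for name in sorted(SINGLE):
        if counts[name] > 1:
            problems.append(f"section {name!r} is listed more than once")
    # "debug" and the elements are optional; these sections are mandatory.
    for name in ("environment", "static", "output"):
        if counts[name] == 0:
            problems.append(f"mandatory section {name!r} is missing")
    # "input" is needed only when the debug mode is not configured.
    if counts["input"] == 0 and counts["debug"] == 0:
        problems.append(
            "section 'input' is missing and debug mode is not configured"
        )
    return problems


# Hypothetical, heavily shortened specification used only for illustration.
specification = [
    {"type": "environment"},
    {"type": "static"},
    {"type": "input"},
    {"type": "output"},
    {"type": "table", "title": "Performance improvements"},
    {"type": "plot", "title": "VPP Performance"},
    {"type": "plot", "title": "VPP Latency"},
]
print(check_sections(specification))
```

Returning a list of problems rather than raising on the first one lets the
author of a specification fix all issues in a single pass.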
113 This section has the following parts:
115 - type: "environment" - says that this is the section "environment".
116 - configuration - configuration of the PAL.
117 - paths - paths used by the PAL.
118 - urls - urls pointing to the data sources.
119 - make-dirs - a list of the directories to be created by the PAL while
120 preparing the environment.
121 - remove-dirs - a list of the directories to be removed while cleaning the
123 - build-dirs - a list of the directories where the results are stored.
125 The structure of the section "Environment" is as follows (example):
133 # - Download of input data files
135 # - Read data from given zip / xml files
136 # - Set the configuration as it is done in normal mode
137 # If the section "type: debug" is missing, CFG[DEBUG] is set to 0.
141 # Top level directories:
145 DIR[BUILD,HTML]: "_build"
146 DIR[BUILD,LATEX]: "_build_latex"
149 DIR[RST]: "../../../docs/report"
151 # Working directories
152 ## Input data files (.zip, .xml)
153 DIR[WORKING,DATA]: "{DIR[WORKING]}/data"
154 ## Static source files from git
155 DIR[WORKING,SRC]: "{DIR[WORKING]}/src"
156 DIR[WORKING,SRC,STATIC]: "{DIR[WORKING,SRC]}/_static"
158 # Static html content
159 DIR[STATIC]: "{DIR[BUILD,HTML]}/_static"
160 DIR[STATIC,VPP]: "{DIR[STATIC]}/vpp"
161 DIR[STATIC,DPDK]: "{DIR[STATIC]}/dpdk"
162 DIR[STATIC,ARCH]: "{DIR[STATIC]}/archive"
164 # Detailed test results
165 DIR[DTR]: "{DIR[WORKING,SRC]}/detailed_test_results"
166 DIR[DTR,PERF,DPDK]: "{DIR[DTR]}/dpdk_performance_results"
167 DIR[DTR,PERF,VPP]: "{DIR[DTR]}/vpp_performance_results"
168 DIR[DTR,PERF,HC]: "{DIR[DTR]}/honeycomb_performance_results"
169 DIR[DTR,FUNC,VPP]: "{DIR[DTR]}/vpp_functional_results"
170 DIR[DTR,FUNC,HC]: "{DIR[DTR]}/honeycomb_functional_results"
171 DIR[DTR,FUNC,NSHSFC]: "{DIR[DTR]}/nshsfc_functional_results"
172 DIR[DTR,PERF,VPP,IMPRV]: "{DIR[WORKING,SRC]}/vpp_performance_tests/performance_improvements"
174 # Detailed test configurations
175 DIR[DTC]: "{DIR[WORKING,SRC]}/test_configuration"
176 DIR[DTC,PERF,VPP]: "{DIR[DTC]}/vpp_performance_configuration"
177 DIR[DTC,FUNC,VPP]: "{DIR[DTC]}/vpp_functional_configuration"
179 # Detailed tests operational data
180 DIR[DTO]: "{DIR[WORKING,SRC]}/test_operational_data"
181 DIR[DTO,PERF,VPP]: "{DIR[DTO]}/vpp_performance_operational_data"
183 # .css patch file to fix tables generated by Sphinx
184 DIR[CSS_PATCH_FILE]: "{DIR[STATIC]}/theme_overrides.css"
185 DIR[CSS_PATCH_FILE2]: "{DIR[WORKING,SRC,STATIC]}/theme_overrides.css"
188 URL[JENKINS,CSIT]: "https://jenkins.fd.io/view/csit/job"
189 URL[JENKINS,HC]: "https://jenkins.fd.io/view/hc2vpp/job"
192 # List the directories which are created while preparing the environment.
193 # All directories MUST be defined in "paths" section.
194 - "DIR[WORKING,DATA]"
200 - "DIR[WORKING,SRC,STATIC]"
203 # List the directories which are deleted while cleaning the environment.
204 # All directories MUST be defined in "paths" section.
# List the directories where the results (build) are stored.
209 # All directories MUST be defined in "paths" section.
213 It is possible to use defined items in the definition of other items, e.g.:
217 DIR[WORKING,DATA]: "{DIR[WORKING]}/data"
219 will be automatically changed to
223 DIR[WORKING,DATA]: "_tmp/data"
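
The substitution described above can be sketched as follows. This is an
illustration only, not PAL's implementation: the function name
``resolve_items`` is hypothetical, and the value ``"_tmp"`` for
``DIR[WORKING]`` is an assumption inferred from the resolved example.

```python
import re

# Matches {NAME}, where NAME may contain brackets and commas, e.g. DIR[WORKING].
PLACEHOLDER = re.compile(r"\{([^{}]+)\}")


def resolve_items(items):
    """Replace every {NAME} by the value of the item called NAME."""
    resolved = dict(items)
    # Nested references may need several passes; len(resolved) passes are
    # enough for any acyclic set of definitions.
    for _ in range(len(resolved)):
        changed = False
        for key, value in list(resolved.items()):
            # Unknown names (e.g. {job}) are left untouched.
            new_value = PLACEHOLDER.sub(
                lambda match: resolved.get(match.group(1), match.group(0)),
                value,
            )
            if new_value != value:
                resolved[key] = new_value
                changed = True
        if not changed:
            break
    return resolved


paths = {
    "DIR[WORKING]": "_tmp",  # assumed value
    "DIR[WORKING,DATA]": "{DIR[WORKING]}/data",
    "DIR[WORKING,SRC]": "{DIR[WORKING]}/src",
    "DIR[WORKING,SRC,STATIC]": "{DIR[WORKING,SRC]}/_static",
}
print(resolve_items(paths)["DIR[WORKING,DATA]"])
```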
This section is optional; it configures the debug mode. Use it to work with
local files instead of downloading the input data files.
232 If the debug mode is configured, the "input" section is ignored.
234 This section has the following parts:
236 - type: "debug" - says that this is the section "debug".
239 - input-format - xml or zip.
240 - extract - if "zip" is defined as the input format, this file is extracted
241 from the zip file, otherwise this parameter is ignored.
243 - builds - list of builds from which the data is used. Must include a job
244 name as a key and then a list of builds and their output files.
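
How "input-format" and "extract" interact can be sketched with the standard
library. This is a hedged illustration rather than PAL's code; the function
name ``locate_input_file`` and its parameters are hypothetical.

```python
import zipfile


def locate_input_file(path, input_format, extract=None, target_dir="."):
    """Return the path of the .xml file to parse.

    For "xml" the file is used as it is and ``extract`` is ignored; for "zip"
    the member named by ``extract`` is unpacked into ``target_dir``.
    """
    if input_format == "xml":
        return path
    if input_format == "zip":
        if extract is None:
            raise ValueError("'extract' must be set for the zip input format")
        with zipfile.ZipFile(path) as archive:
            # ZipFile.extract returns the path of the extracted file.
            return archive.extract(extract, path=target_dir)
    raise ValueError(f"unsupported input format: {input_format!r}")
```

With the example values used in this section, a call would look like
``locate_input_file(zip_path, "zip", "robot-plugin/output.xml", data_dir)``,
where ``zip_path`` and ``data_dir`` stand for local paths.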
246 The structure of the section "Debug" is as follows (example):
253 input-format: "zip" # zip or xml
254 extract: "robot-plugin/output.xml" # Only for zip
256 # The files must be in the directory DIR[WORKING,DATA]
257 csit-dpdk-perf-1707-all:
260 file: "csit-dpdk-perf-1707-all__10.xml"
263 file: "csit-dpdk-perf-1707-all__9.xml"
264 csit-nsh_sfc-verify-func-1707-ubuntu1604-virl:
267 file: "csit-nsh_sfc-verify-func-1707-ubuntu1604-virl-2.xml"
268 csit-vpp-functional-1707-ubuntu1604-virl:
270 build: lastSuccessfulBuild
271 file: "csit-vpp-functional-1707-ubuntu1604-virl-lastSuccessfulBuild.xml"
272 hc2vpp-csit-integration-1707-ubuntu1604:
274 build: lastSuccessfulBuild
275 file: "hc2vpp-csit-integration-1707-ubuntu1604-lastSuccessfulBuild.xml"
276 csit-vpp-perf-1707-all:
279 file: "csit-vpp-perf-1707-all__16__output.xml"
282 file: "csit-vpp-perf-1707-all__17__output.xml"
288 This section defines the static content which is stored in git and will be used
289 as a source to generate the report.
291 This section has these parts:
- type: "static" - says that this is the section "static".
294 - src-path - path to the static content.
295 - dst-path - destination path where the static content is copied and then
301 src-path: "{DIR[RST]}"
302 dst-path: "{DIR[WORKING,SRC]}"
308 This section defines the data used to generate elements. It is mandatory
309 if the debug mode is not used.
311 This section has the following parts:
- type: "input" - says that this is the section "input".
314 - general - parameters common to all builds:
316 - file-name: file to be downloaded.
317 - file-format: format of the downloaded file, ".zip" or ".xml" are supported.
318 - download-path: path to be added to url pointing to the file, e.g.:
319 "{job}/{build}/robot/report/*zip*/{filename}"; {job}, {build} and
320 {filename} are replaced by proper values defined in this section.
321 - extract: file to be extracted from downloaded zip file, e.g.: "output.xml";
322 if xml file is downloaded, this parameter is ignored.
- builds - list of jobs (keys) and the numbers of the builds whose output data is used.
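
The way "download-path" is filled in can be illustrated with plain string
formatting. This is a sketch under assumptions: that the path is appended to
one of the URLs from the "urls" part of the "environment" section, and that
the helper name ``build_download_url`` and the build number 10 are purely
illustrative.

```python
def build_download_url(base_url, download_path, job, build, filename):
    """Fill {job}, {build} and {filename} and append the path to the base URL."""
    path = download_path.format(job=job, build=build, filename=filename)
    return f"{base_url.rstrip('/')}/{path}"


url = build_download_url(
    base_url="https://jenkins.fd.io/view/csit/job",
    download_path="{job}/{build}/robot/report/*zip*/{filename}",
    job="csit-dpdk-perf-1707-all",
    build=10,  # illustrative build number
    filename="robot-plugin.zip",
)
print(url)
```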
327 The structure of the section "Input" is as follows (example from 17.07 report):
332 type: "input" # Ignored in debug mode
334 file-name: "robot-plugin.zip"
336 download-path: "{job}/{build}/robot/report/*zip*/{filename}"
337 extract: "robot-plugin/output.xml"
339 csit-vpp-perf-1707-all:
351 csit-dpdk-perf-1707-all:
362 csit-vpp-functional-1707-ubuntu1604-virl:
363 - lastSuccessfulBuild
364 hc2vpp-csit-perf-master-ubuntu1604:
367 hc2vpp-csit-integration-1707-ubuntu1604:
368 - lastSuccessfulBuild
369 csit-nsh_sfc-verify-func-1707-ubuntu1604-virl:
376 This section specifies which format(s) will be generated (html, pdf) and which
377 versions will be generated for each format.
379 This section has the following parts:
- type: "output" - says that this is the section "output".
382 - format: html or pdf.
383 - version: defined for each format separately.
385 The structure of the section "Output" is as follows (example):
398 TODO: define the names of versions
401 Content of "minimal" version
402 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
404 TODO: define the name and content of this version
410 This section defines a table to be generated. There can be 0 or more "table"
413 This section has the following parts:
415 - type: "table" - says that this section defines a table.
416 - title: Title of the table.
417 - algorithm: Algorithm which is used to generate the table. The other
418 parameters in this section must provide all information needed by the used
420 - template: (optional) a .csv file used as a template while generating the
422 - output-file-ext: extension of the output file.
423 - output-file: file which the table will be written to.
424 - columns: specification of table columns:
426 - title: The title used in the table header.
427 - data: Specification of the data, it has two parts - command and arguments:
431 - template - take the data from template, arguments:
433 - number of column in the template.
435 - data - take the data from the input data, arguments:
- jobs and builds whose data will be used.
439 - operation - performs an operation with the data already in the table,
442 - operation to be done, e.g.: mean, stdev, relative_change (compute
443 the relative change between two columns) and display number of data
444 samples ~= number of test jobs. The operations are implemented in the
TODO: Move from utils.py to e.g. operations.py
- numbers of the columns whose data will be used (optional).
- data: Specify the jobs and builds whose data is used to generate the table.
450 - filter: filter based on tags applied on the input data, if "template" is
451 used, filtering is based on the template.
452 - parameters: Only these parameters will be put to the output data structure.
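
A column's "data" string, such as ``"operation relative_change 5 4"``, consists
of a command followed by its arguments. The sketch below shows one way to split
it and what the named operations could compute. It is not PAL's code: the
helper names are hypothetical, the order of the two columns in
``relative_change`` is an assumption, and the sample standard deviation is used
only because the text does not say which variant is meant.

```python
from statistics import mean, stdev

COMMANDS = ("template", "data", "operation")


def parse_column_data(spec):
    """Split a column "data" string into its command and list of arguments."""
    command, *arguments = spec.split()
    if command not in COMMANDS:
        raise ValueError(f"unknown command: {command!r}")
    return command, arguments


def relative_change(new, old):
    """Relative change from ``old`` to ``new`` in percent."""
    return (new - old) / old * 100.0


# Hypothetical throughput samples [Mpps] from three builds of one test.
samples = [10.0, 11.0, 12.0]
print(parse_column_data("data csit-vpp-perf-1707-all mean"))
print(mean(samples), stdev(samples))
print(relative_change(new=11.0, old=10.0))
```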
454 The structure of the section "Table" is as follows (example of
455 "table_performance_improvements"):
461 title: "Performance improvements"
462 algorithm: "table_performance_improvements"
463 template: "{DIR[DTR,PERF,VPP,IMPRV]}/tmpl_performance_improvements.csv"
464 output-file-ext: ".csv"
465 output-file: "{DIR[DTR,PERF,VPP,IMPRV]}/performance_improvements"
468 title: "VPP Functionality"
474 title: "VPP-16.09 mean [Mpps]"
477 title: "VPP-17.01 mean [Mpps]"
480 title: "VPP-17.04 mean [Mpps]"
483 title: "VPP-17.07 mean [Mpps]"
484 data: "data csit-vpp-perf-1707-all mean"
486 title: "VPP-17.07 stdev [Mpps]"
487 data: "data csit-vpp-perf-1707-all stdev"
489 title: "17.04 to 17.07 change [%]"
490 data: "operation relative_change 5 4"
492 csit-vpp-perf-1707-all:
507 Example of "table_details" which generates "Detailed Test Results - VPP
508 Performance Results":
514 title: "Detailed Test Results - VPP Performance Results"
515 algorithm: "table_details"
516 output-file-ext: ".csv"
517 output-file: "{DIR[WORKING]}/vpp_performance_results"
521 data: "data test_name"
523 title: "Documentation"
524 data: "data test_documentation"
527 data: "data test_msg"
529 csit-vpp-perf-1707-all:
537 Example of "table_details" which generates "Test configuration - VPP Performance
544 title: "Test configuration - VPP Performance Test Configs"
545 algorithm: "table_details"
546 output-file-ext: ".csv"
547 output-file: "{DIR[WORKING]}/vpp_test_configuration"
553 title: "VPP API Test (VAT) Commands History - Commands Used Per Test Case"
554 data: "data show-run"
556 csit-vpp-perf-1707-all:
568 This section defines a plot to be generated. There can be 0 or more "plot"
571 This section has these parts:
573 - type: "plot" - says that this section defines a plot.
574 - title: Plot title used in the logs. Title which is displayed is in the
576 - output-file-type: format of the output file.
577 - output-file: file which the plot will be written to.
578 - algorithm: Algorithm used to generate the plot. The other parameters in this
579 section must provide all information needed by plot.ly to generate the plot.
585 - These parameters are transparently passed to plot.ly.
- data: Specify the jobs and numbers of builds whose data is used to generate
589 - filter: filter applied on the input data.
590 - parameters: Only these parameters will be put to the output data structure.
The structure of the section "Plot" is as follows (example of a plot showing
throughput in a box-and-whisker chart):
599 title: "VPP Performance 64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
600 algorithm: "plot_performance_box"
601 output-file-type: ".html"
602 output-file: "{DIR[STATIC,VPP]}/64B-1t1c-l2-sel1-ndrdisc"
604 csit-vpp-perf-1707-all:
615 # Keep this formatting, the filter is enclosed with " (quotation mark) and
616 # each tag is enclosed with ' (apostrophe).
617 filter: "'64B' and 'BASE' and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
623 boxpoints: "outliers"
626 title: "64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
631 gridcolor: "rgb(238, 238, 238)"
632 linecolor: "rgb(238, 238, 238)"
637 tickcolor: "rgb(238, 238, 238)"
639 title: "Indexed Test Cases"
gridcolor: "rgb(238, 238, 238)"
644 linecolor: "rgb(238, 238, 238)"
650 tickcolor: "rgb(238, 238, 238)"
651 title: "Packets Per Second [pps]"
667 The structure of the section "Plot" is as follows (example of a plot showing
668 latency in a box chart):
674 title: "VPP Latency 64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
675 algorithm: "plot_latency_box"
676 output-file-type: ".html"
677 output-file: "{DIR[STATIC,VPP]}/64B-1t1c-l2-sel1-ndrdisc-lat50"
679 csit-vpp-perf-1707-all:
690 filter: "'64B' and 'BASE' and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
697 title: "64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
702 gridcolor: "rgb(238, 238, 238)"
703 linecolor: "rgb(238, 238, 238)"
708 tickcolor: "rgb(238, 238, 238)"
710 title: "Indexed Test Cases"
gridcolor: "rgb(238, 238, 238)"
715 linecolor: "rgb(238, 238, 238)"
721 tickcolor: "rgb(238, 238, 238)"
722 title: "Latency min/avg/max [uSec]"
742 This section defines a file to be generated. There can be 0 or more "file"
745 This section has the following parts:
747 - type: "file" - says that this section defines a file.
- title: Title of the file.
749 - algorithm: Algorithm which is used to generate the file. The other
750 parameters in this section must provide all information needed by the used
752 - output-file-ext: extension of the output file.
- output-file: file to which the generated content will be written.
754 - file-header: The header of the generated .rst file.
755 - dir-tables: The directory with the tables.
- data: Specify the jobs and builds whose data is used to generate the file.
757 - filter: filter based on tags applied on the input data, if "all" is
758 used, no filtering is done.
759 - parameters: Only these parameters will be put to the output data structure.
760 - chapters: the hierarchy of chapters in the generated file.
- start-level: the level of the top-level chapter.
The structure of the section "File" is as follows (example):
769 title: "VPP Performance Results"
770 algorithm: "file_test_results"
771 output-file-ext: ".rst"
772 output-file: "{DIR[DTR,PERF,VPP]}/vpp_performance_results"
773 file-header: "\n.. |br| raw:: html\n\n <br />\n\n\n.. |prein| raw:: html\n\n <pre>\n\n\n.. |preout| raw:: html\n\n </pre>\n\n"
774 dir-tables: "{DIR[DTR,PERF,VPP]}"
776 csit-vpp-perf-1707-all:
783 data-start-level: 2 # 0, 1, 2, ...
784 chapters-start-level: 2 # 0, 1, 2, ...
790 - Manually created / edited files.
791 - .rst files, static .csv files, static pictures (.svg), ...
792 - Stored in CSIT git repository.
The static content is not described further in this document.
The PAL processes test results and other information produced by Jenkins jobs.
The data are currently stored as robot results in Jenkins (TODO: store the data
in Nexus) as .zip and / or .xml files.
808 As the first step, the data are downloaded and stored locally (typically on a
809 Jenkins slave). If .zip files are used, the given .xml files are extracted for
Parsing of the .xml files is performed by a class derived from
"robot.api.ResultVisitor"; only the necessary methods are overridden. Only the
necessary data is extracted from the .xml file and stored in a structured form.
816 The parsed data are stored as the multi-indexed pandas.Series data type. Its
817 structure is as follows:
827 "job name", "build", "metadata", "suites", "tests" are indexes to access the
856 Using indexes data["job 1 name"]["build 1"]["tests"] (e.g.:
857 data["csit-vpp-perf-1704-all"]["17"]["tests"]) we get a list of all tests with
860 Data will not be accessible directly using indexes, but using getters and
863 **Structure of metadata:**
"version": "VPP version",
"job": "Jenkins job name",
"build": "Information about the build"
873 **Structure of suites:**
"doc": "Suite 1 documentation",
"parent": "Suite 1 parent"
"doc": "Suite N documentation",
"parent": "Suite N parent"
887 **Structure of tests:**
"parent": "Name of the parent of the test",
"doc": "Test documentation",
"msg": "Test message",
"tags": ["tag 1", "tag 2", "tag n"],
900 "type": "PDR" | "NDR",
903 "unit": "pps" | "bps" | "percentage"
912 "50": { # Only for NDR
917 "10": { # Only for NDR
929 "50": { # Only for NDR
934 "10": { # Only for NDR
"lossTolerance": "lossTolerance",  # Only for PDR
"vat-history": "DUT1 and DUT2 VAT History",
944 "show-run": "Show Run"
"parent": "Name of the parent of the test",
"doc": "Test documentation",
"msg": "Test message",
"tags": ["tag 1", "tag 2", "tag n"],
"vat-history": "DUT1 and DUT2 VAT History",
"show-run": "Show Run",
"status": "PASS" | "FAIL"
970 Note: ID is the lowercase full path to the test.
976 The first step when generating an element is getting the data needed to
977 construct the element. The data are filtered from the processed input data.
979 The data filtering is based on:
984 - required data - only this data is included in the output.
986 WARNING: The filtering is based on tags, so be careful with tagging.
For example, an element whose specification includes:
993 csit-vpp-perf-1707-all:
1005 - "'64B' and 'BASE' and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
will be constructed using data from the job "csit-vpp-perf-1707-all", from all
listed builds, and from the tests whose tags match the filter conditions.
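
A filter of this form can be evaluated against the tags of each test. The
sketch below is one possible approach, not necessarily the one PAL uses: every
quoted tag is replaced by ``True`` or ``False`` and the remaining boolean
expression is evaluated. The function names, test IDs and values are
hypothetical; the keys "tags", "parent" and "msg" follow the structure of tests
described earlier.

```python
import re

_TAG = re.compile(r"'([^']+)'")
# After substitution only these tokens may remain in the expression.
_SAFE = re.compile(r"(?:True|False|and|or|not|[()\s])+")


def tags_match(filter_expr, tags):
    """Evaluate a filter such as "'64B' and not 'VHOST'" against test tags."""
    present = set(tags)
    expr = _TAG.sub(lambda m: str(m.group(1) in present), filter_expr)
    if not _SAFE.fullmatch(expr):
        raise ValueError(f"unsupported filter: {filter_expr!r}")
    # The expression now contains only boolean literals and operators.
    return bool(eval(expr, {"__builtins__": {}}))


def filter_tests(tests, filter_expr, parameters):
    """Keep matching tests and only the requested parameters of each test."""
    return {
        test_id: {name: test[name] for name in parameters if name in test}
        for test_id, test in tests.items()
        if tags_match(filter_expr, test.get("tags", []))
    }


FILTER = (
    "'64B' and 'BASE' and 'NDRDISC' and '1T1C' and "
    "('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
)
tests = {
    "test-a": {"tags": ["64B", "BASE", "NDRDISC", "1T1C", "L2XCFWD"],
               "parent": "suite-1", "msg": "message a"},
    "test-b": {"tags": ["64B", "BASE", "NDRDISC", "1T1C", "L2XCFWD", "VHOST"],
               "parent": "suite-2", "msg": "message b"},
}
print(filter_tests(tests, FILTER, ["parent"]))
```

Because the result depends entirely on the tags, a missing or misspelled tag
silently excludes a test from the element.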
1010 The output data structure for filtered test data is:
The data analytics part implements:
1035 - methods to compute statistical data from the filtered input data.
1040 Advanced data analytics
1041 ```````````````````````
As a next step, advanced data analytics (ADA) will be implemented using
machine learning (ML) and artificial intelligence (AI).
1048 - describe the concept of ADA.
1049 - add specification.
This layer generates the plots and tables according to the report models
defined in the specification file. The elements are generated using the
algorithms and data specified in their models.
1063 - tables are generated by algorithms implemented in PAL, the model includes the
1064 algorithm and all necessary information.
1065 - output format: csv
1066 - generated tables are stored in specified directories and linked to .rst files.
1072 - `plot.ly <https://plot.ly/>`_ is currently used to generate plots, the model
1073 includes the type of plot and all the necessary information to render it.
1074 - output format: html.
1075 - generated plots are stored in specified directories and linked to .rst files.
The report is generated using Sphinx and the Read_the_Docs template. PAL
generates html and pdf formats. It is possible to define the content of the
report by specifying the version (TODO: define the names and content of versions).
1089 1. Read the specification.
1090 2. Read the input data.
1091 3. Process the input data.
4. For each element (plot, table, file) defined in the specification:
1094 a. Get the data needed to construct the element using a filter.
1095 b. Generate the element.
1096 c. Store the element.
1098 5. Generate the report.
1099 6. Store the report (Nexus).
The process is model driven. The elements’ models (tables, plots, files and
the report itself) are defined in the specification file. The script reads the
elements’ models from the specification file and generates the elements.

It is easy to add elements to be generated. If a new kind of element is
required, only a new algorithm has to be implemented and integrated.
1112 List of modules, classes, methods and functions
1113 ```````````````````````````````````````````````
1117 specification_parser.py
1140 input_data_parser.py
1181 Functions implementing algorithms to generate particular types of
1182 tables (called by the function "generate_tables"):
1184 table_performance_improvements
1192 Functions implementing algorithms to generate particular types of
1193 plots (called by the function "generate_plots"):
1194 plot_performance_box
1203 Functions implementing algorithms to generate particular types of
1204 files (called by the function "generate_files"):
1213 Functions implementing algorithms to generate particular types of
1214 report (called by the function "generate_report"):
1215 generate_html_report
1218 Other functions called by the function "generate_report":
1223 PAL functional diagram
1224 ``````````````````````
1232 \includesvg[width=0.90\textwidth]{../_tmp/src/csit_framework_documentation/pal_func_diagram}
1233 \label{fig:pal_func_diagram}
1238 .. figure:: pal_func_diagram.svg
1239 :alt: PAL functional diagram
1243 How to add an element
1244 `````````````````````
An element can be added by adding its model to the specification file. If the
1247 element will be generated by an existing algorithm, only its parameters must be
If a brand new type of element is added, the algorithm must also be
The algorithms are implemented in the files whose names start with "generator".
The name of the function implementing the algorithm and the name of the
algorithm in the specification file have to be the same.
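
The naming rule makes it possible to look up the implementing function by the
"algorithm" value of an element. The following is a minimal sketch of such a
lookup, not PAL's actual code: the registry ``ALGORITHMS``, the function
``generate_elements`` and the placeholder bodies are hypothetical; only the
names "table_details" and "plot_performance_box" come from this document.

```python
def table_details(element, data):
    """Placeholder body; a real algorithm would write the table file."""
    return f"table '{element['title']}' built from {len(data)} tests"


def plot_performance_box(element, data):
    """Placeholder body; a real algorithm would render the plot."""
    return f"plot '{element['title']}' built from {len(data)} tests"


# The key equals the function name, so the "algorithm" value from the
# specification selects the function directly.
ALGORITHMS = {func.__name__: func for func in (table_details, plot_performance_box)}


def generate_elements(specification, data, algorithms=ALGORITHMS):
    """Call the algorithm named in each table, plot or file section."""
    results = []
    for element in specification:
        if element["type"] not in ("table", "plot", "file"):
            continue
        name = element["algorithm"]
        if name not in algorithms:
            raise NotImplementedError(f"algorithm {name!r} is not implemented")
        results.append(algorithms[name](element, data))
    return results


specification = [
    {"type": "environment"},
    {"type": "table", "title": "Detailed Test Results", "algorithm": "table_details"},
]
print(generate_elements(specification, {"test-a": {}}))
```

Adding a new kind of element then amounts to writing one function and
registering it under the name used in the specification.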