===================================================
Presentation and Analytics Layer - Low Level Design
===================================================
The presentation and analytics layer (PAL) is the fourth layer of the CSIT
hierarchy. The model of the presentation and analytics layer consists of four
sub-layers, bottom up:

- sL1 - Data - input data to be processed:

  - Static content - .rst text files, .svg static figures, and other files
    stored in the CSIT git repository.
  - Data to process - .xml files generated by Jenkins jobs executing tests,
    stored as robot results files (output.xml).
  - Specification - .yaml file with the models of report elements (tables,
    plots, layout, ...) generated by this tool. It also contains the
    configuration of the tool and the specification of input data (jobs and
    builds).
- sL2 - Data processing:

  - The data are read from the specified input files (.xml) and stored as
    multi-indexed `pandas.Series <https://pandas.pydata.org/pandas-docs/stable/
    generated/pandas.Series.html>`_.
  - This layer also provides an interface to the input data and filtering of
    the input data.
- sL3 - Data presentation - This layer generates the elements specified in the
  specification file:

  - Tables: .csv files linked to static .rst files.
  - Plots: .html files generated using plot.ly, linked to static .rst files.

- sL4 - Report generation - Sphinx generates the required formats and versions:

  - versions: minimal, full (TODO: define the names and scope of versions)
.. figure:: pal_layers.svg
The report specification file defines which data is used and which outputs are
generated. It is human-readable and structured, and it is easy to add, remove,
or change items. The specification includes:

- Specification of the environment.
- Configuration of the debug mode (optional).
- Specification of input data (jobs, builds, files, ...).
- Specification of the output.
- What is generated and how:

  - What: plots, tables.
  - How: specification of all properties and parameters.
Structure of the specification file
'''''''''''''''''''''''''''''''''''

The specification file is organized as a list of dictionaries distinguished by
the parameter "type".

Each type represents a section. The sections "environment", "debug", "static",
"input" and "output" are listed only once in the specification; "table", "file"
and "plot" can appear multiple times.

The sections "debug", "table", "file" and "plot" are optional.

Table(s), file(s) and plot(s) are referred to as "elements" in this text. It is
possible to define and implement other elements if needed.
This section has the following parts:

- type: "environment" - says that this is the "environment" section.
- configuration - configuration of the PAL.
- paths - paths used by the PAL.
- urls - URLs pointing to the data sources.
- make-dirs - a list of the directories to be created by the PAL while
  preparing the environment.
- remove-dirs - a list of the directories to be removed while cleaning the
  environment.
- build-dirs - a list of the directories where the results are stored.
The structure of the section "Environment" is as follows (example)::

    # - Download of input data files
    # - Read data from given zip / xml files
    # - Set the configuration as it is done in normal mode
    # If the section "type: debug" is missing, CFG[DEBUG] is set to 0.

    # Top level directories:
    DIR[BUILD,HTML]: "_build"
    DIR[BUILD,LATEX]: "_build_latex"
    DIR[RST]: "../../../docs/report"

    # Working directories
    ## Input data files (.zip, .xml)
    DIR[WORKING,DATA]: "{DIR[WORKING]}/data"
    ## Static source files from git
    DIR[WORKING,SRC]: "{DIR[WORKING]}/src"
    DIR[WORKING,SRC,STATIC]: "{DIR[WORKING,SRC]}/_static"

    # Static html content
    DIR[STATIC]: "{DIR[BUILD,HTML]}/_static"
    DIR[STATIC,VPP]: "{DIR[STATIC]}/vpp"
    DIR[STATIC,DPDK]: "{DIR[STATIC]}/dpdk"
    DIR[STATIC,ARCH]: "{DIR[STATIC]}/archive"

    # Detailed test results
    DIR[DTR]: "{DIR[WORKING,SRC]}/detailed_test_results"
    DIR[DTR,PERF,DPDK]: "{DIR[DTR]}/dpdk_performance_results"
    DIR[DTR,PERF,VPP]: "{DIR[DTR]}/vpp_performance_results"
    DIR[DTR,PERF,HC]: "{DIR[DTR]}/honeycomb_performance_results"
    DIR[DTR,FUNC,VPP]: "{DIR[DTR]}/vpp_functional_results"
    DIR[DTR,FUNC,HC]: "{DIR[DTR]}/honeycomb_functional_results"
    DIR[DTR,FUNC,NSHSFC]: "{DIR[DTR]}/nshsfc_functional_results"
    DIR[DTR,PERF,VPP,IMPRV]: "{DIR[WORKING,SRC]}/vpp_performance_tests/performance_improvements"

    # Detailed test configurations
    DIR[DTC]: "{DIR[WORKING,SRC]}/test_configuration"
    DIR[DTC,PERF,VPP]: "{DIR[DTC]}/vpp_performance_configuration"
    DIR[DTC,FUNC,VPP]: "{DIR[DTC]}/vpp_functional_configuration"

    # Detailed tests operational data
    DIR[DTO]: "{DIR[WORKING,SRC]}/test_operational_data"
    DIR[DTO,PERF,VPP]: "{DIR[DTO]}/vpp_performance_operational_data"

    # .css patch file to fix tables generated by Sphinx
    DIR[CSS_PATCH_FILE]: "{DIR[STATIC]}/theme_overrides.css"
    DIR[CSS_PATCH_FILE2]: "{DIR[WORKING,SRC,STATIC]}/theme_overrides.css"

    URL[JENKINS,CSIT]: "https://jenkins.fd.io/view/csit/job"
    URL[JENKINS,HC]: "https://jenkins.fd.io/view/hc2vpp/job"

    # List the directories which are created while preparing the environment.
    # All directories MUST be defined in the "paths" section.
    - "DIR[WORKING,DATA]"
    - "DIR[WORKING,SRC,STATIC]"

    # List the directories which are deleted while cleaning the environment.
    # All directories MUST be defined in the "paths" section.

    # List the directories where the results (build) are stored.
    # All directories MUST be defined in the "paths" section.
It is possible to use defined items in the definition of other items, e.g.::

    DIR[WORKING,DATA]: "{DIR[WORKING]}/data"

will be automatically changed to::

    DIR[WORKING,DATA]: "_tmp/data"
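The substitution shown above can be implemented as a simple fixed-point
expansion over the "paths" dictionary. The following is a minimal sketch; the
function name and configuration layout are assumptions for illustration, not
the actual PAL code:

```python
def expand(value, paths, max_passes=10):
    """Substitute {NAME} placeholders with values from "paths" until the
    string stops changing, so nested references resolve as well."""
    for _ in range(max_passes):
        expanded = value
        for name, replacement in paths.items():
            expanded = expanded.replace("{" + name + "}", replacement)
        if expanded == value:
            return expanded
        value = expanded
    raise ValueError("Unresolvable placeholder in: " + value)

paths = {"DIR[WORKING]": "_tmp"}
print(expand("{DIR[WORKING]}/data", paths))  # -> _tmp/data
```

Running the expansion repeatedly (rather than once) lets a path reference
another path that itself contains a placeholder.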
This optional section configures the debug mode. It is used when one does not
want to download input data files and wants to use local files instead.

If the debug mode is configured, the "input" section is ignored.

This section has the following parts:

- type: "debug" - says that this is the "debug" section.
- input-format - xml or zip.
- extract - if "zip" is defined as the input format, this file is extracted
  from the zip file; otherwise this parameter is ignored.
- builds - list of builds from which the data is used. Must include a job
  name as a key and then a list of builds and their output files.
The structure of the section "Debug" is as follows (example)::

    input-format: "zip"  # zip or xml
    extract: "robot-plugin/output.xml"  # Only for zip

    # The files must be in the directory DIR[WORKING,DATA]
    csit-dpdk-perf-1707-all:
        file: "csit-dpdk-perf-1707-all__10.xml"
        file: "csit-dpdk-perf-1707-all__9.xml"
    csit-nsh_sfc-verify-func-1707-ubuntu1604-virl:
        file: "csit-nsh_sfc-verify-func-1707-ubuntu1604-virl-2.xml"
    csit-vpp-functional-1707-ubuntu1604-virl:
        build: lastSuccessfulBuild
        file: "csit-vpp-functional-1707-ubuntu1604-virl-lastSuccessfulBuild.xml"
    hc2vpp-csit-integration-1707-ubuntu1604:
        build: lastSuccessfulBuild
        file: "hc2vpp-csit-integration-1707-ubuntu1604-lastSuccessfulBuild.xml"
    csit-vpp-perf-1707-all:
        file: "csit-vpp-perf-1707-all__16__output.xml"
        file: "csit-vpp-perf-1707-all__17__output.xml"
This section defines the static content which is stored in git and is used
as a source to generate the report.

This section has these parts:

- type: "static" - says that this is the "static" section.
- src-path - path to the static content.
- dst-path - destination path where the static content is copied and then
  processed.

The structure of the section "Static" is as follows (example)::

    src-path: "{DIR[RST]}"
    dst-path: "{DIR[WORKING,SRC]}"
This section defines the data used to generate elements. It is mandatory
if the debug mode is not used.

This section has the following parts:

- type: "input" - says that this is the "input" section.
- general - parameters common to all builds:

  - file-name: file to be downloaded.
  - file-format: format of the downloaded file; ".zip" and ".xml" are
    supported.
  - download-path: path to be added to the URL pointing to the file, e.g.:
    "{job}/{build}/robot/report/*zip*/(unknown)"; {job}, {build} and
    (unknown) are replaced by proper values defined in this section.
  - extract: file to be extracted from the downloaded zip file, e.g.:
    "output.xml"; if an xml file is downloaded, this parameter is ignored.

- builds - list of jobs (keys) and numbers of builds whose output data will
  be used.

The structure of the section "Input" is as follows (example from the 17.07
report):
::

    type: "input"  # Ignored in debug mode
    general:
        file-name: "robot-plugin.zip"
        download-path: "{job}/{build}/robot/report/*zip*/(unknown)"
        extract: "robot-plugin/output.xml"
    builds:
        csit-vpp-perf-1707-all:
        csit-dpdk-perf-1707-all:
        csit-vpp-functional-1707-ubuntu1604-virl:
        - lastSuccessfulBuild
        hc2vpp-csit-perf-master-ubuntu1604:
        hc2vpp-csit-integration-1707-ubuntu1604:
        - lastSuccessfulBuild
        csit-nsh_sfc-verify-func-1707-ubuntu1604-virl:
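The "download-path" and "extract" parameters above suggest a download-and-
extract flow like the following sketch. The function names are illustrative
assumptions, not the actual PAL implementation:

```python
import io
import urllib.request
import zipfile

def download(url):
    """Fetch the archive from a Jenkins URL (with {job}/{build} already
    substituted) and return its raw bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()

def extract_member(zip_bytes, member):
    """Return the content of one file (e.g. "robot-plugin/output.xml")
    from an in-memory zip archive; for .xml downloads this step is skipped."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return archive.read(member)
```

Keeping the archive in memory avoids writing temporary files, though the real
tool stores downloads under DIR[WORKING,DATA].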
This section specifies which format(s) will be generated (html, pdf) and which
versions will be generated for each format.

This section has the following parts:

- type: "output" - says that this is the "output" section.
- format: html or pdf.
- version: defined for each format separately.

The structure of the section "Output" is as follows (example):

TODO: define the names of versions

Content of "minimal" version
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TODO: define the name and content of this version
This section defines a table to be generated. There can be 0 or more "table"
sections defined in the specification.

This section has the following parts:

- type: "table" - says that this section defines a table.
- title: Title of the table.
- algorithm: Algorithm which is used to generate the table. The other
  parameters in this section must provide all information needed by the used
  algorithm.
- template: (optional) a .csv file used as a template while generating the
  table.
- output-file-ext: extension of the output file.
- output-file: file which the table will be written to.
- columns: specification of table columns:

  - title: The title used in the table header.
  - data: Specification of the data; it has two parts - command and arguments:

    - template - take the data from the template; arguments:

      - number of the column in the template.

    - data - take the data from the input data; arguments:

      - jobs and builds whose data will be used.

    - operation - performs an operation with the data already in the table;
      arguments:

      - operation to be done, e.g.: mean, stdev, relative_change (compute
        the relative change between two columns) and display of the number of
        data samples ~= number of test jobs. The operations are implemented in
        the module utils.py.
        TODO: Move from utils.py to e.g. operations.py
      - numbers of the columns whose data will be used (optional).

- data: Specify the jobs and builds whose data is used to generate the table.
- filter: filter based on tags applied on the input data; if "template" is
  used, filtering is based on the template.
- parameters: Only these parameters will be put into the output data structure.
The structure of the section "Table" is as follows (example of
"table_performance_improvements")::

    title: "Performance improvements"
    algorithm: "table_performance_improvements"
    template: "{DIR[DTR,PERF,VPP,IMPRV]}/tmpl_performance_improvements.csv"
    output-file-ext: ".csv"
    output-file: "{DIR[DTR,PERF,VPP,IMPRV]}/performance_improvements"
    columns:
    - title: "VPP Functionality"
    - title: "VPP-16.09 mean [Mpps]"
    - title: "VPP-17.01 mean [Mpps]"
    - title: "VPP-17.04 mean [Mpps]"
    - title: "VPP-17.07 mean [Mpps]"
      data: "data csit-vpp-perf-1707-all mean"
    - title: "VPP-17.07 stdev [Mpps]"
      data: "data csit-vpp-perf-1707-all stdev"
    - title: "17.04 to 17.07 change [%]"
      data: "operation relative_change 5 4"
    data:
        csit-vpp-perf-1707-all:
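The "operation relative_change 5 4" command in the example above computes the
change between two table columns. A minimal sketch of such an operation (per
the TODO above, the real implementation currently lives in utils.py; the exact
signature here is an assumption):

```python
def relative_change(nr1, nr2):
    """Return the relative change of nr2 against nr1, in percent.
    In the example above, nr1 and nr2 come from table columns 4 and 5."""
    return float(nr2 - nr1) / nr1 * 100

print(relative_change(10.0, 12.5))  # -> 25.0
```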
Example of "table_details" which generates "Detailed Test Results - VPP
Performance Results"::

    title: "Detailed Test Results - VPP Performance Results"
    algorithm: "table_details"
    output-file-ext: ".csv"
    output-file: "{DIR[WORKING]}/vpp_performance_results"
    columns:
    - data: "data test_name"
    - title: "Documentation"
      data: "data test_documentation"
    - data: "data test_msg"
    data:
        csit-vpp-perf-1707-all:
Example of "table_details" which generates "Test configuration - VPP
Performance Test Configs"::

    title: "Test configuration - VPP Performance Test Configs"
    algorithm: "table_details"
    output-file-ext: ".csv"
    output-file: "{DIR[WORKING]}/vpp_test_configuration"
    columns:
    - title: "VPP API Test (VAT) Commands History - Commands Used Per Test Case"
      data: "data show-run"
    data:
        csit-vpp-perf-1707-all:
This section defines a plot to be generated. There can be 0 or more "plot"
sections defined in the specification.

This section has these parts:

- type: "plot" - says that this section defines a plot.
- title: Plot title used in the logs. The title which is displayed is defined
  in the section "layout".
- output-file-type: format of the output file.
- output-file: file which the plot will be written to.
- algorithm: Algorithm used to generate the plot. The other parameters in this
  section must provide all the information needed by plot.ly to generate the
  plot; these parameters are transparently passed to plot.ly.
- data: Specify the jobs and numbers of builds whose data is used to generate
  the plot.
- filter: filter applied on the input data.
- parameters: Only these parameters will be put into the output data structure.

The structure of the section "Plot" is as follows (example of a plot showing
throughput in a box-with-whiskers chart):
::

    title: "VPP Performance 64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
    algorithm: "plot_performance_box"
    output-file-type: ".html"
    output-file: "{DIR[STATIC,VPP]}/64B-1t1c-l2-sel1-ndrdisc"
    data:
        csit-vpp-perf-1707-all:
    # Keep this formatting, the filter is enclosed with " (quotation mark) and
    # each tag is enclosed with ' (apostrophe).
    filter: "'64B' and 'BASE' and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
    traces:
        boxpoints: "outliers"
    layout:
        title: "64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
        xaxis:
            gridcolor: "rgb(238, 238, 238)"
            linecolor: "rgb(238, 238, 238)"
            tickcolor: "rgb(238, 238, 238)"
            title: "Indexed Test Cases"
        yaxis:
            gridcolor: "rgb(238, 238, 238)"
            linecolor: "rgb(238, 238, 238)"
            tickcolor: "rgb(238, 238, 238)"
            title: "Packets Per Second [pps]"
The structure of the section "Plot" is as follows (example of a plot showing
latency in a box chart)::

    title: "VPP Latency 64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
    algorithm: "plot_latency_box"
    output-file-type: ".html"
    output-file: "{DIR[STATIC,VPP]}/64B-1t1c-l2-sel1-ndrdisc-lat50"
    data:
        csit-vpp-perf-1707-all:
    filter: "'64B' and 'BASE' and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
    layout:
        title: "64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
        xaxis:
            gridcolor: "rgb(238, 238, 238)"
            linecolor: "rgb(238, 238, 238)"
            tickcolor: "rgb(238, 238, 238)"
            title: "Indexed Test Cases"
        yaxis:
            gridcolor: "rgb(238, 238, 238)"
            linecolor: "rgb(238, 238, 238)"
            tickcolor: "rgb(238, 238, 238)"
            title: "Latency min/avg/max [uSec]"
This section defines a file to be generated. There can be 0 or more "file"
sections defined in the specification.

This section has the following parts:

- type: "file" - says that this section defines a file.
- title: Title of the file.
- algorithm: Algorithm which is used to generate the file. The other
  parameters in this section must provide all information needed by the used
  algorithm.
- output-file-ext: extension of the output file.
- output-file: file which the file will be written to.
- file-header: The header of the generated .rst file.
- dir-tables: The directory with the tables.
- data: Specify the jobs and builds whose data is used to generate the file.
- filter: filter based on tags applied on the input data; if "all" is
  used, no filtering is done.
- parameters: Only these parameters will be put into the output data structure.
- chapters: the hierarchy of chapters in the generated file.
- start-level: the level of the top-level chapter.

The structure of the section "file" is as follows (example)::

    title: "VPP Performance Results"
    algorithm: "file_test_results"
    output-file-ext: ".rst"
    output-file: "{DIR[DTR,PERF,VPP]}/vpp_performance_results"
    file-header: "\n.. |br| raw:: html\n\n <br />\n\n\n.. |prein| raw:: html\n\n <pre>\n\n\n.. |preout| raw:: html\n\n </pre>\n\n"
    dir-tables: "{DIR[DTR,PERF,VPP]}"
    data:
        csit-vpp-perf-1707-all:
    data-start-level: 2  # 0, 1, 2, ...
    chapters-start-level: 2  # 0, 1, 2, ...
- Manually created / edited files.
- .rst files, static .csv files, static pictures (.svg), ...
- Stored in the CSIT git repository.

The static content is not described in more detail in this document.
The PAL processes test results and other information produced by Jenkins jobs.
The data are currently stored as robot results in Jenkins (TODO: store the data
in Nexus), either as .zip and/or .xml files.

As the first step, the data are downloaded and stored locally (typically on a
Jenkins slave). If .zip files are used, the given .xml files are extracted for
further processing.

Parsing of the .xml files is performed by a class derived from
"robot.api.ResultVisitor"; only the necessary methods are overridden. All
necessary data, and only that data, is extracted from the .xml file and stored
in a structured form.
The parsed data are stored as the multi-indexed pandas.Series data type. Its
structure is as follows:
"job name", "build", "metadata", "suites" and "tests" are the indexes used to
access the data. Using the indexes data["job 1 name"]["build 1"]["tests"]
(e.g. data["csit-vpp-perf-1704-all"]["17"]["tests"]) we get a list of all
tests with their data.

The data will not be accessible directly using indexes, but via getters and
filters.
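The chained index access described above can be sketched with pandas as
follows; the job name, build number and payload values are illustrative only:

```python
import pandas as pd

# A small multi-indexed Series mirroring the structure above:
# level 0 = job name, level 1 = build, level 2 = data key.
index = pd.MultiIndex.from_tuples(
    [
        ("csit-vpp-perf-1704-all", "17", "metadata"),
        ("csit-vpp-perf-1704-all", "17", "suites"),
        ("csit-vpp-perf-1704-all", "17", "tests"),
    ],
    names=["job", "build", "key"],
)
data = pd.Series(
    [{"version": "17.04"}, {"doc": "..."}, {"test-id": {"status": "PASS"}}],
    index=index,
)

# Chained indexing peels off one index level at a time:
tests = data["csit-vpp-perf-1704-all"]["17"]["tests"]
print(tests)  # -> {'test-id': {'status': 'PASS'}}
```

Each bracket selects one level of the MultiIndex, which is why the
data["job"]["build"]["tests"] pattern works on a single Series.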
**Structure of metadata:**

::

    "version": "VPP version",
    "job": "Jenkins job name",
    "build": "Information about the build"

**Structure of suites:**

::

    "doc": "Suite 1 documentation",
    "parent": "Suite 1 parent",
    ...
    "doc": "Suite N documentation",
    "parent": "Suite N parent"
**Structure of tests:**

Performance tests::

    "parent": "Name of the parent of the test",
    "doc": "Test documentation",
    "msg": "Test message",
    "tags": ["tag 1", "tag 2", "tag n"],
    "type": "PDR" | "NDR",
    ...
    "unit": "pps" | "bps" | "percentage",
    ...
    "50": { ... },  # Only for NDR
    "10": { ... },  # Only for NDR
    ...
    "50": { ... },  # Only for NDR
    "10": { ... },  # Only for NDR
    ...
    "lossTolerance": "lossTolerance",  # Only for PDR
    "vat-history": "DUT1 and DUT2 VAT History",
    "show-run": "Show Run"

Functional tests::

    "parent": "Name of the parent of the test",
    "doc": "Test documentation",
    "msg": "Test message",
    "tags": ["tag 1", "tag 2", "tag n"],
    "vat-history": "DUT1 and DUT2 VAT History",
    "show-run": "Show Run",
    "status": "PASS" | "FAIL"

Note: The test ID is the lowercase full path to the test.
The first step when generating an element is getting the data needed to
construct the element. The data are filtered from the processed input data.

The data filtering is based on:

- required data - only this data is included in the output.

WARNING: The filtering is based on tags, so be careful with tagging.

For example, the element whose specification includes::

    data:
        csit-vpp-perf-1707-all:
    filter:
    - "'64B' and 'BASE' and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"

will be constructed using data from the job "csit-vpp-perf-1707-all", for all
listed builds and the tests with tags matching the filter conditions.
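A tag expression like the one above can be evaluated by mapping each quoted
tag to its presence in a test's tag set and then evaluating the resulting
boolean expression. This is a minimal sketch of the idea, not the actual PAL
filter code:

```python
import re

def matches(filter_expr, tags):
    """Evaluate a tag filter such as "'64B' and 'BASE' and not 'VHOST'"
    against a set of tags: each 'TAG' literal becomes True/False depending
    on membership, then the boolean expression is evaluated."""
    expression = re.sub(
        r"'([^']+)'",
        lambda match: str(match.group(1) in tags),
        filter_expr,
    )
    # The expression comes from the trusted specification file.
    return eval(expression)

tags = {"64B", "BASE", "NDRDISC", "1T1C", "L2XCFWD"}
print(matches("'64B' and 'NDRDISC' and not 'VHOST'", tags))  # -> True
```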
The output data structure for filtered test data is:
The data analytics part implements:

- methods to compute statistical data from the filtered input data.

Advanced data analytics
```````````````````````

As the next step, advanced data analytics (ADA) will be implemented using
machine learning (ML) and artificial intelligence (AI).

TODO:

- describe the concept of ADA.
- add specification.
Generates the plots and tables according to the report models in the
specification file. The elements are generated using the algorithms and data
specified in their models.

Tables
``````

- Tables are generated by algorithms implemented in PAL; the model includes
  the algorithm and all necessary information.
- Output format: csv.
- Generated tables are stored in the specified directories and linked to .rst
  files.

Plots
`````

- `plot.ly <https://plot.ly/>`_ is currently used to generate plots; the model
  includes the type of the plot and all the information necessary to render it.
- Output format: html.
- Generated plots are stored in the specified directories and linked to .rst
  files.
The report is generated using Sphinx and the Read-the-Docs template. PAL
generates html and pdf formats. It is possible to define the content of the
report by specifying the version (TODO: define the names and content of
versions).
1. Read the specification.
2. Read the input data.
3. Process the input data.
4. For each element (plot, table, file) defined in the specification:

   a. Get the data needed to construct the element using a filter.
   b. Generate the element.
   c. Store the element.

5. Generate the report.
6. Store the report (Nexus).
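The steps above can be sketched as a top-level driver. All names here are
illustrative assumptions, not the actual PAL module layout:

```python
def run_pal(specification, read_data, generators, store, render_report):
    """Drive steps 1-6: the specification is already parsed (step 1), input
    data is read and processed (steps 2-3), each element is generated and
    stored (step 4), then the report is built and stored (steps 5-6)."""
    data = read_data(specification["input"])           # steps 2-3
    for element in specification["elements"]:          # step 4
        generate = generators[element["algorithm"]]    # pick algorithm by name
        store(element["output-file"], generate(element, data))
    return render_report(specification["output"])      # steps 5-6

# A tiny dry run with stand-in callables:
stored = {}
report = run_pal(
    {"input": "jobs", "elements": [{"algorithm": "table", "output-file": "t.csv"}],
     "output": "html"},
    read_data=lambda src: {"rows": [1, 2]},
    generators={"table": lambda model, data: data["rows"]},
    store=lambda path, content: stored.update({path: content}),
    render_report=lambda fmt: fmt,
)
print(stored, report)  # -> {'t.csv': [1, 2]} html
```

Dispatching on the "algorithm" name is what makes the process model driven: a
new element type only needs a new entry in the generator table.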
The process is model driven. The element models (tables, plots, files and the
report itself) are defined in the specification file. The script reads the
element models from the specification file and generates the elements.

It is easy to add elements to be generated; if a new kind of element is
required, only a new algorithm needs to be implemented and integrated.
List of modules, classes, methods and functions
```````````````````````````````````````````````

::

    specification_parser.py

    input_data_parser.py

    Functions implementing algorithms to generate particular types of
    tables (called by the function "generate_tables"):
        table_performance_improvements

    Functions implementing algorithms to generate particular types of
    plots (called by the function "generate_plots"):
        plot_performance_box

    Functions implementing algorithms to generate particular types of
    files (called by the function "generate_files"):

    Functions implementing algorithms to generate particular types of
    the report (called by the function "generate_report"):
        generate_html_report

    Other functions called by the function "generate_report":
PAL functional diagram
``````````````````````

.. figure:: pal_func_diagram.svg
   :alt: PAL functional diagram
How to add an element
`````````````````````

An element can be added by adding its model to the specification file. If the
element is to be generated by an existing algorithm, only its parameters need
to be specified.

If a brand new type of element is to be added, the algorithm must also be
implemented.

The algorithms are implemented in the files whose names start with "generator".
The name of the function implementing the algorithm and the name of the
algorithm in the specification file must be the same.