===================================================
Presentation and Analytics Layer - Low Level Design
===================================================
The presentation and analytics layer (PAL) is the fourth layer of the CSIT
hierarchy. The model of the presentation and analytics layer consists of four
sub-layers, listed bottom up:
12 - sL1 - Data - input data to be processed:
14 - Static content - .rst text files, .svg static figures, and other files
15 stored in the CSIT git repository.
16 - Data to process - .xml files generated by Jenkins jobs executing tests,
17 stored as robot results files (output.xml).
18 - Specification - .yaml file with the models of report elements (tables,
19 plots, layout, ...) generated by this tool. There is also the configuration
20 of the tool and the specification of input data (jobs and builds).
22 - sL2 - Data processing
- The data are read from the specified input files (.xml) and stored as
  multi-indexed `pandas.Series <https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.html>`_.
- This layer also provides an interface to the input data and filtering of the
  input data.
- sL3 - Data presentation - This layer generates the elements specified in the
  specification file:
33 - Tables: .csv files linked to static .rst files.
34 - Plots: .html files generated using plot.ly linked to static .rst files.
36 - sL4 - Report generation - Sphinx generates required formats and versions:
39 - versions: minimal, full (TODO: define the names and scope of versions)
48 The report specification file defines which data is used and which outputs are
49 generated. It is human readable and structured. It is easy to add / remove /
50 change items. The specification includes:
52 - Specification of the environment.
53 - Configuration of debug mode (optional).
54 - Specification of input data (jobs, builds, files, ...).
55 - Specification of the output.
56 - What and how is generated:
57 - What: plots, tables.
58 - How: specification of all properties and parameters.
Structure of the specification file
'''''''''''''''''''''''''''''''''''
The specification file is organized as a list of dictionaries distinguished by
their type.
Each type represents a section. The sections "environment", "debug", "static",
"input" and "output" are listed only once in the specification; "table", "file"
and "plot" can appear multiple times.
90 Sections "debug", "table", "file" and "plot" are optional.
Table(s), file(s) and plot(s) are referred to as "elements" in this text. It is
possible to define and implement other elements if needed.
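The organization described above can be sketched roughly as follows. The sketch groups an already parsed specification (a list of dictionaries) by the "type" key and checks how often each section occurs. The helper name ``group_sections`` and the sample data are hypothetical; they are not taken from the PAL sources.

```python
from collections import defaultdict

# Cardinality rules as stated in the text above.
ONCE = {"environment", "debug", "static", "input", "output"}
REPEATABLE = {"table", "file", "plot"}
OPTIONAL = {"debug"} | REPEATABLE


def group_sections(spec):
    """Group the items of a parsed specification by their "type".

    Hypothetical helper: `spec` is a list of dictionaries, each one
    carrying a "type" key, as produced by any YAML loader.
    """
    groups = defaultdict(list)
    for item in spec:
        groups[item["type"]].append(item)

    unknown = set(groups) - ONCE - REPEATABLE
    if unknown:
        raise ValueError(f"unknown section types: {sorted(unknown)}")

    for name in sorted(ONCE):
        count = len(groups.get(name, []))
        if count > 1:
            raise ValueError(f'section "{name}" is listed {count} times')
        # "input" is ignored in debug mode, so it may be absent there.
        skippable = name in OPTIONAL or (name == "input" and "debug" in groups)
        if count == 0 and not skippable:
            raise ValueError(f'mandatory section "{name}" is missing')
    return dict(groups)


# Made-up minimal specification, for illustration only.
spec = [
    {"type": "environment"},
    {"type": "static"},
    {"type": "input"},
    {"type": "output"},
    {"type": "plot", "title": "first plot"},
    {"type": "plot", "title": "second plot"},
]
sections = group_sections(spec)
print({name: len(items) for name, items in sections.items()})
```

A new element type would only need to be added to ``REPEATABLE`` in such a scheme.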
99 This section has the following parts:
101 - type: "environment" - says that this is the section "environment".
102 - configuration - configuration of the PAL.
103 - paths - paths used by the PAL.
104 - urls - urls pointing to the data sources.
105 - make-dirs - a list of the directories to be created by the PAL while
106 preparing the environment.
- remove-dirs - a list of the directories to be removed while cleaning the
  environment.
109 - build-dirs - a list of the directories where the results are stored.
111 The structure of the section "Environment" is as follows (example):
119 # - Download of input data files
121 # - Read data from given zip / xml files
122 # - Set the configuration as it is done in normal mode
123 # If the section "type: debug" is missing, CFG[DEBUG] is set to 0.
127 # Top level directories:
131 DIR[BUILD,HTML]: "_build"
132 DIR[BUILD,LATEX]: "_build_latex"
135 DIR[RST]: "../../../docs/report"
137 # Working directories
138 ## Input data files (.zip, .xml)
139 DIR[WORKING,DATA]: "{DIR[WORKING]}/data"
140 ## Static source files from git
141 DIR[WORKING,SRC]: "{DIR[WORKING]}/src"
142 DIR[WORKING,SRC,STATIC]: "{DIR[WORKING,SRC]}/_static"
144 # Static html content
145 DIR[STATIC]: "{DIR[BUILD,HTML]}/_static"
146 DIR[STATIC,VPP]: "{DIR[STATIC]}/vpp"
147 DIR[STATIC,DPDK]: "{DIR[STATIC]}/dpdk"
148 DIR[STATIC,ARCH]: "{DIR[STATIC]}/archive"
150 # Detailed test results
151 DIR[DTR]: "{DIR[WORKING,SRC]}/detailed_test_results"
152 DIR[DTR,PERF,DPDK]: "{DIR[DTR]}/dpdk_performance_results"
153 DIR[DTR,PERF,VPP]: "{DIR[DTR]}/vpp_performance_results"
154 DIR[DTR,PERF,HC]: "{DIR[DTR]}/honeycomb_performance_results"
155 DIR[DTR,FUNC,VPP]: "{DIR[DTR]}/vpp_functional_results"
156 DIR[DTR,FUNC,HC]: "{DIR[DTR]}/honeycomb_functional_results"
157 DIR[DTR,FUNC,NSHSFC]: "{DIR[DTR]}/nshsfc_functional_results"
158 DIR[DTR,PERF,VPP,IMPRV]: "{DIR[WORKING,SRC]}/vpp_performance_tests/performance_improvements"
160 # Detailed test configurations
161 DIR[DTC]: "{DIR[WORKING,SRC]}/test_configuration"
162 DIR[DTC,PERF,VPP]: "{DIR[DTC]}/vpp_performance_configuration"
163 DIR[DTC,FUNC,VPP]: "{DIR[DTC]}/vpp_functional_configuration"
165 # Detailed tests operational data
166 DIR[DTO]: "{DIR[WORKING,SRC]}/test_operational_data"
167 DIR[DTO,PERF,VPP]: "{DIR[DTO]}/vpp_performance_operational_data"
169 # .css patch file to fix tables generated by Sphinx
170 DIR[CSS_PATCH_FILE]: "{DIR[STATIC]}/theme_overrides.css"
171 DIR[CSS_PATCH_FILE2]: "{DIR[WORKING,SRC,STATIC]}/theme_overrides.css"
174 URL[JENKINS,CSIT]: "https://jenkins.fd.io/view/csit/job"
175 URL[JENKINS,HC]: "https://jenkins.fd.io/view/hc2vpp/job"
178 # List the directories which are created while preparing the environment.
179 # All directories MUST be defined in "paths" section.
180 - "DIR[WORKING,DATA]"
186 - "DIR[WORKING,SRC,STATIC]"
189 # List the directories which are deleted while cleaning the environment.
190 # All directories MUST be defined in "paths" section.
# List the directories where the results (build) are stored.
195 # All directories MUST be defined in "paths" section.
199 It is possible to use defined items in the definition of other items, e.g.:
203 DIR[WORKING,DATA]: "{DIR[WORKING]}/data"
205 will be automatically changed to
209 DIR[WORKING,DATA]: "_tmp/data"
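A minimal sketch of such a substitution is shown below. It assumes that the items live in one flat mapping and that every ``{KEY}`` reference names another item of the same mapping. The function ``expand_paths`` is hypothetical and only illustrates the idea; it is not the PAL implementation.

```python
import re

# Matches "{KEY}" where KEY contains no braces, e.g. "{DIR[WORKING]}".
_PLACEHOLDER = re.compile(r"\{([^{}]+)\}")


def expand_paths(paths):
    """Replace "{KEY}" references by the values of the referenced items."""
    resolved = dict(paths)
    # Every pass resolves at least one level of nesting, so len(paths)
    # passes are enough for definitions without cycles.
    for _ in range(len(resolved)):
        changed = False
        for key, value in list(resolved.items()):
            new_value = _PLACEHOLDER.sub(
                # Unknown references are left untouched.
                lambda match: resolved.get(match.group(1), match.group(0)),
                value,
            )
            if new_value != value:
                resolved[key] = new_value
                changed = True
        if not changed:
            break
    return resolved


paths = {
    "DIR[WORKING]": "_tmp",
    "DIR[WORKING,DATA]": "{DIR[WORKING]}/data",
    "DIR[WORKING,SRC]": "{DIR[WORKING]}/src",
    "DIR[WORKING,SRC,STATIC]": "{DIR[WORKING,SRC]}/_static",
}
expanded = expand_paths(paths)
print(expanded["DIR[WORKING,DATA]"])
```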
This section is optional as it configures the debug mode. It is used when one
wants to use local files instead of downloading the input data files.
218 If the debug mode is configured, the "input" section is ignored.
220 This section has the following parts:
222 - type: "debug" - says that this is the section "debug".
225 - input-format - xml or zip.
226 - extract - if "zip" is defined as the input format, this file is extracted
227 from the zip file, otherwise this parameter is ignored.
229 - builds - list of builds from which the data is used. Must include a job
230 name as a key and then a list of builds and their output files.
232 The structure of the section "Debug" is as follows (example):
239 input-format: "zip" # zip or xml
240 extract: "robot-plugin/output.xml" # Only for zip
242 # The files must be in the directory DIR[WORKING,DATA]
243 csit-dpdk-perf-1707-all:
246 file: "csit-dpdk-perf-1707-all__10.xml"
249 file: "csit-dpdk-perf-1707-all__9.xml"
250 csit-nsh_sfc-verify-func-1707-ubuntu1604-virl:
253 file: "csit-nsh_sfc-verify-func-1707-ubuntu1604-virl-2.xml"
254 csit-vpp-functional-1707-ubuntu1604-virl:
256 build: lastSuccessfulBuild
257 file: "csit-vpp-functional-1707-ubuntu1604-virl-lastSuccessfulBuild.xml"
258 hc2vpp-csit-integration-1707-ubuntu1604:
260 build: lastSuccessfulBuild
261 file: "hc2vpp-csit-integration-1707-ubuntu1604-lastSuccessfulBuild.xml"
262 csit-vpp-perf-1707-all:
265 file: "csit-vpp-perf-1707-all__16__output.xml"
268 file: "csit-vpp-perf-1707-all__17__output.xml"
274 This section defines the static content which is stored in git and will be used
275 as a source to generate the report.
277 This section has these parts:
279 - type: "static" - says that this section is the "static".
280 - src-path - path to the static content.
281 - dst-path - destination path where the static content is copied and then
287 src-path: "{DIR[RST]}"
288 dst-path: "{DIR[WORKING,SRC]}"
294 This section defines the data used to generate elements. It is mandatory
295 if the debug mode is not used.
297 This section has the following parts:
299 - type: "input" - says that this section is the "input".
300 - general - parameters common to all builds:
302 - file-name: file to be downloaded.
303 - file-format: format of the downloaded file, ".zip" or ".xml" are supported.
304 - download-path: path to be added to url pointing to the file, e.g.:
305 "{job}/{build}/robot/report/*zip*/{filename}"; {job}, {build} and
306 {filename} are replaced by proper values defined in this section.
307 - extract: file to be extracted from downloaded zip file, e.g.: "output.xml";
308 if xml file is downloaded, this parameter is ignored.
- builds - list of jobs (keys) and numbers of builds whose output data will be
  downloaded.
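How the download-path template might be turned into a full URL can be sketched as follows. The helper ``build_download_url`` is hypothetical, and joining the base URL and the path with a single slash is an assumption; the job name, build number and base URL are only illustrative values.

```python
def build_download_url(base_url, download_path, job, build, filename):
    """Fill the {job}, {build} and {filename} fields and prepend the base URL."""
    path = download_path.format(job=job, build=build, filename=filename)
    # Assumption: base URL and path are joined by exactly one slash.
    return f"{base_url.rstrip('/')}/{path}"


url = build_download_url(
    base_url="https://jenkins.fd.io/view/csit/job",
    download_path="{job}/{build}/robot/report/*zip*/{filename}",
    job="csit-vpp-perf-1707-all",
    build=17,
    filename="robot-plugin.zip",
)
print(url)
```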
313 The structure of the section "Input" is as follows (example from 17.07 report):
318 type: "input" # Ignored in debug mode
320 file-name: "robot-plugin.zip"
322 download-path: "{job}/{build}/robot/report/*zip*/{filename}"
323 extract: "robot-plugin/output.xml"
325 csit-vpp-perf-1707-all:
337 csit-dpdk-perf-1707-all:
348 csit-vpp-functional-1707-ubuntu1604-virl:
349 - lastSuccessfulBuild
350 hc2vpp-csit-perf-master-ubuntu1604:
353 hc2vpp-csit-integration-1707-ubuntu1604:
354 - lastSuccessfulBuild
355 csit-nsh_sfc-verify-func-1707-ubuntu1604-virl:
362 This section specifies which format(s) will be generated (html, pdf) and which
363 versions will be generated for each format.
365 This section has the following parts:
367 - type: "output" - says that this section is the "output".
368 - format: html or pdf.
369 - version: defined for each format separately.
371 The structure of the section "Output" is as follows (example):
384 TODO: define the names of versions
Content of "minimal" version
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
390 TODO: define the name and content of this version
This section defines a table to be generated. There can be 0 or more "table"
sections.
399 This section has the following parts:
401 - type: "table" - says that this section defines a table.
402 - title: Title of the table.
- algorithm: Algorithm which is used to generate the table. The other
  parameters in this section must provide all information needed by the used
  algorithm.
- template: (optional) a .csv file used as a template while generating the
  table.
408 - output-file-ext: extension of the output file.
409 - output-file: file which the table will be written to.
410 - columns: specification of table columns:
412 - title: The title used in the table header.
413 - data: Specification of the data, it has two parts - command and arguments:
417 - template - take the data from template, arguments:
419 - number of column in the template.
421 - data - take the data from the input data, arguments:
423 - jobs and builds which data will be used.
- operation - performs an operation with the data already in the table,
  arguments:

  - operation to be done, e.g.: mean, stdev, relative_change (compute
    the relative change between two columns) and display number of data
    samples ~= number of test jobs. The operations are implemented in
    utils.py.
    TODO: Move from utils.py to e.g. operations.py
  - numbers of columns whose data will be used (optional).
- data: Specify the jobs and builds whose data are used to generate the table.
- filter: filter based on tags applied on the input data; if "template" is
  used, filtering is based on the template.
438 - parameters: Only these parameters will be put to the output data structure.
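The column "data" strings described above consist of a command followed by its arguments. The sketch below shows one way to split such a string and to apply the relative_change operation to a table row. All function names are hypothetical. It is also an assumption that column numbers are 1-based and that the first column number refers to the newer value. The numbers in the row are made up.

```python
from statistics import mean


def relative_change(new, reference):
    """Relative change of `new` against `reference`, in percent."""
    return (new - reference) / reference * 100.0


# Hypothetical registry of operations working on values of one row.
ROW_OPERATIONS = {"relative_change": relative_change}


def parse_column_data(data):
    """Split a column "data" string into the command and its arguments."""
    command, *arguments = data.split()
    return command, arguments


def apply_operation(row, arguments):
    """Apply e.g. ["relative_change", "5", "4"] to one table row.

    Assumption: column numbers are 1-based.
    """
    name, *columns = arguments
    values = [row[int(column) - 1] for column in columns]
    return ROW_OPERATIONS[name](*values)


# Made-up row: test name followed by four mean values in Mpps.
row = ["example test", 9.0, 9.5, 8.0, 12.0]
command, arguments = parse_column_data("operation relative_change 5 4")
change = apply_operation(row, arguments)

# A "data <job> mean" column would reduce the samples of one test to a mean.
samples = [10.0, 12.0, 14.0]
print(command, change, mean(samples))
```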
440 The structure of the section "Table" is as follows (example of
441 "table_performance_improvements"):
447 title: "Performance improvements"
448 algorithm: "table_performance_improvements"
449 template: "{DIR[DTR,PERF,VPP,IMPRV]}/tmpl_performance_improvements.csv"
450 output-file-ext: ".csv"
451 output-file: "{DIR[DTR,PERF,VPP,IMPRV]}/performance_improvements"
454 title: "VPP Functionality"
460 title: "VPP-16.09 mean [Mpps]"
463 title: "VPP-17.01 mean [Mpps]"
466 title: "VPP-17.04 mean [Mpps]"
469 title: "VPP-17.07 mean [Mpps]"
470 data: "data csit-vpp-perf-1707-all mean"
472 title: "VPP-17.07 stdev [Mpps]"
473 data: "data csit-vpp-perf-1707-all stdev"
475 title: "17.04 to 17.07 change [%]"
476 data: "operation relative_change 5 4"
478 csit-vpp-perf-1707-all:
493 Example of "table_details" which generates "Detailed Test Results - VPP
494 Performance Results":
500 title: "Detailed Test Results - VPP Performance Results"
501 algorithm: "table_details"
502 output-file-ext: ".csv"
503 output-file: "{DIR[WORKING]}/vpp_performance_results"
507 data: "data test_name"
509 title: "Documentation"
510 data: "data test_documentation"
513 data: "data test_msg"
515 csit-vpp-perf-1707-all:
523 Example of "table_details" which generates "Test configuration - VPP Performance
530 title: "Test configuration - VPP Performance Test Configs"
531 algorithm: "table_details"
532 output-file-ext: ".csv"
533 output-file: "{DIR[WORKING]}/vpp_test_configuration"
539 title: "VPP API Test (VAT) Commands History - Commands Used Per Test Case"
540 data: "data show-run"
542 csit-vpp-perf-1707-all:
This section defines a plot to be generated. There can be 0 or more "plot"
sections.
557 This section has these parts:
559 - type: "plot" - says that this section defines a plot.
560 - title: Plot title used in the logs. Title which is displayed is in the
562 - output-file-type: format of the output file.
563 - output-file: file which the plot will be written to.
564 - algorithm: Algorithm used to generate the plot. The other parameters in this
565 section must provide all information needed by plot.ly to generate the plot.
571 - These parameters are transparently passed to plot.ly.
- data: Specify the jobs and numbers of builds whose data are used to generate
  the plot.
575 - filter: filter applied on the input data.
576 - parameters: Only these parameters will be put to the output data structure.
The structure of the section "Plot" is as follows (example of a plot showing
throughput in a box-and-whiskers chart):
585 title: "VPP Performance 64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
586 algorithm: "plot_performance_box"
587 output-file-type: ".html"
588 output-file: "{DIR[STATIC,VPP]}/64B-1t1c-l2-sel1-ndrdisc"
590 csit-vpp-perf-1707-all:
601 # Keep this formatting, the filter is enclosed with " (quotation mark) and
602 # each tag is enclosed with ' (apostrophe).
603 filter: "'64B' and 'BASE' and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
609 boxpoints: "outliers"
612 title: "64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
617 gridcolor: "rgb(238, 238, 238)"
618 linecolor: "rgb(238, 238, 238)"
623 tickcolor: "rgb(238, 238, 238)"
625 title: "Indexed Test Cases"
gridcolor: "rgb(238, 238, 238)"
630 linecolor: "rgb(238, 238, 238)"
636 tickcolor: "rgb(238, 238, 238)"
637 title: "Packets Per Second [pps]"
653 The structure of the section "Plot" is as follows (example of a plot showing
654 latency in a box chart):
660 title: "VPP Latency 64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
661 algorithm: "plot_latency_box"
662 output-file-type: ".html"
663 output-file: "{DIR[STATIC,VPP]}/64B-1t1c-l2-sel1-ndrdisc-lat50"
665 csit-vpp-perf-1707-all:
676 filter: "'64B' and 'BASE' and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
683 title: "64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
688 gridcolor: "rgb(238, 238, 238)"
689 linecolor: "rgb(238, 238, 238)"
694 tickcolor: "rgb(238, 238, 238)"
696 title: "Indexed Test Cases"
gridcolor: "rgb(238, 238, 238)"
701 linecolor: "rgb(238, 238, 238)"
707 tickcolor: "rgb(238, 238, 238)"
708 title: "Latency min/avg/max [uSec]"
This section defines a file to be generated. There can be 0 or more "file"
sections.
731 This section has the following parts:
733 - type: "file" - says that this section defines a file.
- title: Title of the file.
- algorithm: Algorithm which is used to generate the file. The other
  parameters in this section must provide all information needed by the used
  algorithm.
- output-file-ext: extension of the output file.
- output-file: file to which the generated content will be written.
- file-header: The header of the generated .rst file.
- dir-tables: The directory with the tables.
- data: Specify the jobs and builds whose data are used to generate the file.
- filter: filter based on tags applied on the input data; if "all" is
  used, no filtering is done.
- parameters: Only these parameters will be put to the output data structure.
- chapters: the hierarchy of chapters in the generated file.
- start-level: the level of the top-level chapter.
749 The structure of the section "file" is as follows (example):
755 title: "VPP Performance Results"
756 algorithm: "file_test_results"
757 output-file-ext: ".rst"
758 output-file: "{DIR[DTR,PERF,VPP]}/vpp_performance_results"
759 file-header: "\n.. |br| raw:: html\n\n <br />\n\n\n.. |prein| raw:: html\n\n <pre>\n\n\n.. |preout| raw:: html\n\n </pre>\n\n"
760 dir-tables: "{DIR[DTR,PERF,VPP]}"
762 csit-vpp-perf-1707-all:
769 data-start-level: 2 # 0, 1, 2, ...
770 chapters-start-level: 2 # 0, 1, 2, ...
776 - Manually created / edited files.
777 - .rst files, static .csv files, static pictures (.svg), ...
778 - Stored in CSIT git repository.
780 No more details about the static content in this document.
The PAL processes test results and other information produced by Jenkins jobs.
The data are now stored as robot results in Jenkins (TODO: store the data in
nexus) as .zip and / or .xml files.
As the first step, the data are downloaded and stored locally (typically on a
Jenkins slave). If .zip files are used, the given .xml files are extracted for
further processing.
Parsing of the .xml files is performed by a class derived from
"robot.api.ResultVisitor"; only the necessary methods are overridden. Only the
necessary data are extracted from the .xml file and stored in a structured form.
802 The parsed data are stored as the multi-indexed pandas.Series data type. Its
803 structure is as follows:
813 "job name", "build", "metadata", "suites", "tests" are indexes to access the
842 Using indexes data["job 1 name"]["build 1"]["tests"] (e.g.:
843 data["csit-vpp-perf-1704-all"]["17"]["tests"]) we get a list of all tests with
846 Data will not be accessible directly using indexes, but using getters and
849 **Structure of metadata:**
854 "version": "VPP version",
"job": "Jenkins job name",
856 "build": "Information about the build"
859 **Structure of suites:**
"doc": "Suite 1 documentation",
866 "parent": "Suite 1 parent"
"doc": "Suite N documentation",
870 "parent": "Suite N parent"
873 **Structure of tests:**
882 "parent": "Name of the parent of the test",
"doc": "Test documentation",
"msg": "Test message",
885 "tags": ["tag 1", "tag 2", "tag n"],
886 "type": "PDR" | "NDR",
889 "unit": "pps" | "bps" | "percentage"
898 "50": { # Only for NDR
903 "10": { # Only for NDR
915 "50": { # Only for NDR
920 "10": { # Only for NDR
927 "lossTolerance": "lossTolerance" # Only for PDR
928 "vat-history": "DUT1 and DUT2 VAT History"
930 "show-run": "Show Run"
943 "parent": "Name of the parent of the test",
"doc": "Test documentation",
"msg": "Test message",
"tags": ["tag 1", "tag 2", "tag n"],
"vat-history": "DUT1 and DUT2 VAT History",
"show-run": "Show Run",
949 "status": "PASS" | "FAIL"
956 Note: ID is the lowercase full path to the test.
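The access pattern for this structure can be sketched with plain nested dictionaries standing in for the multi-indexed pandas.Series. The class ``InputData``, its getters and ``make_test_id`` are hypothetical names used only for illustration. The test path and the field values are made up.

```python
def make_test_id(full_path):
    """Build a test ID: the lowercase full path to the test."""
    return full_path.lower()


class InputData:
    """Hypothetical wrapper exposing getters instead of raw indexes."""

    def __init__(self, data):
        self._data = data

    def metadata(self, job, build):
        return self._data[job][build]["metadata"]

    def suites(self, job, build):
        return self._data[job][build]["suites"]

    def tests(self, job, build):
        return self._data[job][build]["tests"]


# Made-up test path and content, for illustration only.
tid = make_test_id("Tests.Perf.Example Suite.Example Test")
data = InputData({
    "csit-vpp-perf-1704-all": {
        "17": {
            "metadata": {"job": "csit-vpp-perf-1704-all", "build": "17"},
            "suites": {},
            "tests": {
                tid: {
                    "parent": "example suite",
                    "doc": "Test documentation",
                    "tags": ["64B", "1T1C", "NDRDISC"],
                    "type": "NDR",
                },
            },
        },
    },
})

tests = data.tests("csit-vpp-perf-1704-all", "17")
print(sorted(tests))
```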
962 The first step when generating an element is getting the data needed to
963 construct the element. The data are filtered from the processed input data.
965 The data filtering is based on:
970 - required data - only this data is included in the output.
972 WARNING: The filtering is based on tags, so be careful with tagging.
For example, the element whose specification includes:
979 csit-vpp-perf-1707-all:
991 - "'64B' and 'BASE' and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
993 will be constructed using data from the job "csit-vpp-perf-1707-all", for all listed
994 builds and the tests with the list of tags matching the filter conditions.
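One possible way to evaluate such a filter against the tags of a test is sketched below. It parses the expression with Python's ``ast`` module and treats every quoted tag as "this tag is present". This is an illustrative technique chosen for the sketch, not necessarily the way PAL evaluates filters; the function name ``matches`` is hypothetical.

```python
import ast


def matches(filter_expr, tags):
    """Return True if the tags satisfy the and / or / not filter expression."""
    tags = set(tags)

    def evaluate(node):
        if isinstance(node, ast.Expression):
            return evaluate(node.body)
        if isinstance(node, ast.BoolOp):
            results = [evaluate(value) for value in node.values]
            return all(results) if isinstance(node.op, ast.And) else any(results)
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
            return not evaluate(node.operand)
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            # A quoted tag is true when the test carries that tag.
            return node.value in tags
        raise ValueError("unsupported construct in filter expression")

    return evaluate(ast.parse(filter_expr, mode="eval"))


FILTER = ("'64B' and 'BASE' and 'NDRDISC' and '1T1C' and "
          "('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'")

selected = matches(FILTER, ["64B", "BASE", "NDRDISC", "1T1C", "L2XCFWD"])
rejected = matches(FILTER, ["64B", "BASE", "NDRDISC", "1T1C", "L2XCFWD", "VHOST"])
print(selected, rejected)
```

Because only string constants and the three boolean operators are accepted, a malformed or unexpected filter raises an error instead of being silently evaluated.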
996 The output data structure for filtered test data is:
1019 Data analytics part implements:
1021 - methods to compute statistical data from the filtered input data.
1026 Advanced data analytics
1027 ```````````````````````
1029 As the next steps, advanced data analytics (ADA) will be implemented using
1030 machine learning (ML) and artificial intelligence (AI).
1034 - describe the concept of ADA.
1035 - add specification.
This layer generates the plots and tables according to the report models in the
specification file. The elements are generated using the algorithms and data
specified in their models.
1049 - tables are generated by algorithms implemented in PAL, the model includes the
1050 algorithm and all necessary information.
1051 - output format: csv
1052 - generated tables are stored in specified directories and linked to .rst files.
1058 - `plot.ly <https://plot.ly/>`_ is currently used to generate plots, the model
1059 includes the type of plot and all the necessary information to render it.
1060 - output format: html.
1061 - generated plots are stored in specified directories and linked to .rst files.
The report is generated using Sphinx and the Read the Docs template. PAL
generates html and pdf formats. It is possible to define the content of the
report by specifying the version (TODO: define the names and content of versions).
1075 1. Read the specification.
1076 2. Read the input data.
1077 3. Process the input data.
4. For each element (plot, table, file) defined in the specification:
1080 a. Get the data needed to construct the element using a filter.
1081 b. Generate the element.
1082 c. Store the element.
1084 5. Generate the report.
1085 6. Store the report (Nexus).
The process is model driven. The elements’ models (tables, plots, files and
report itself) are defined in the specification file. The script reads the
elements’ models from the specification file and generates the elements.

It is easy to add elements to be generated. If a new kind of element is
required, only a new algorithm needs to be implemented and integrated.
1098 List of modules, classes, methods and functions
1099 ```````````````````````````````````````````````
1103 specification_parser.py
1126 input_data_parser.py
1167 Functions implementing algorithms to generate particular types of
1168 tables (called by the function "generate_tables"):
1170 table_performance_improvements
1178 Functions implementing algorithms to generate particular types of
1179 plots (called by the function "generate_plots"):
1180 plot_performance_box
1189 Functions implementing algorithms to generate particular types of
1190 files (called by the function "generate_files"):
1199 Functions implementing algorithms to generate particular types of
1200 report (called by the function "generate_report"):
1201 generate_html_report
1204 Other functions called by the function "generate_report":
1209 PAL functional diagram
1210 ``````````````````````
.. figure:: pal_func_diagram.svg
    :alt: PAL functional diagram
1217 How to add an element
1218 `````````````````````
An element can be added by adding its model to the specification file. If the
element will be generated by an existing algorithm, only its parameters must be
set.

If a brand new type of element will be added, also the algorithm must be
implemented.
The algorithms are implemented in the files whose names start with "generator".
The name of the function implementing the algorithm and the name of the
algorithm in the specification file must be the same.
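The naming rule makes a simple lookup by name possible. The sketch below registers stand-in functions under their own names and selects one using the "algorithm" value of an element. The function names come from the examples in this document, but their signatures, their bodies and the registry ``ALGORITHMS`` are hypothetical.

```python
def table_details(element, data):
    """Stand-in for a table algorithm; the real one writes a .csv file."""
    return f"table: {element['title']} ({len(data)} tests)"


def plot_performance_box(element, data):
    """Stand-in for a plot algorithm; the real one writes an .html file."""
    return f"plot: {element['title']} ({len(data)} tests)"


# The key is the function name, so the specification can refer to it directly.
ALGORITHMS = {func.__name__: func for func in (table_details, plot_performance_box)}


def generate_element(element, data):
    """Call the algorithm named in the element's model."""
    name = element["algorithm"]
    try:
        algorithm = ALGORITHMS[name]
    except KeyError:
        raise NotImplementedError(f'algorithm "{name}" is not implemented') from None
    return algorithm(element, data)


element = {"type": "table", "title": "Detailed Test Results", "algorithm": "table_details"}
result = generate_element(element, data={"test 1": {}, "test 2": {}})
print(result)
```

Adding a new kind of element then amounts to writing one more function and listing it in the registry.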