X-Git-Url: https://gerrit.fd.io/r/gitweb?p=csit.git;a=blobdiff_plain;f=docs%2Fcpta%2Fintroduction%2Findex.rst;h=76aed6bbcd35058480e7b43a1ea1adcc7836112d;hp=aad683b390d6caae673e6feec231b452cbc86795;hb=c298d66734d2d40e343ac4c60703b9838bdd6301;hpb=efdcf6470f6e15dcc918c70e5a61d10e10653f1e

diff --git a/docs/cpta/introduction/index.rst b/docs/cpta/introduction/index.rst
index aad683b390..76aed6bbcd 100644
--- a/docs/cpta/introduction/index.rst
+++ b/docs/cpta/introduction/index.rst
@@ -1,182 +1,54 @@
-Introduction
-============
-
-Purpose
--------
-
-With increasing number of features and code changes in the FD.io VPP data plane
-codebase, it is increasingly difficult to measure and detect VPP data plane
-performance changes. Similarly, once degradation is detected, it is getting
-harder to bisect the source code in search of the Bad code change or addition.
-The problem is further escalated by a large combination of compute platforms
-that VPP is running and used on, including Intel Xeon, Intel Atom, ARM Aarch64.
-
-Existing FD.io CSIT continuous performance trending test jobs help, but they
-rely on human factors for anomaly detection, and as such are error prone and
-unreliable, as the volume of data generated by these jobs is growing
-exponentially.
-
-Proposed solution is to eliminate human factor and fully automate performance
-trending, regression and progression detection, as well as bisecting.
-
-This document describes a high-level design of a system for continuous
-measuring, trending and performance change detection for FD.io VPP SW data
-plane. It builds upon the existing CSIT framework with extensions to its
-throughput testing methodology, CSIT data analytics engine
-(PAL – Presentation-and-Analytics-Layer) and associated Jenkins jobs
-definitions.
-
-Continuous Performance Trending and Analysis
---------------------------------------------
-
-Proposed design replaces existing CSIT performance trending jobs and tests with
-new Performance Trending (PT) CSIT module and separate Performance Analysis (PA)
-module ingesting results from PT and analysing, detecting and reporting any
-performance anomalies using historical trending data and statistical metrics.
-PA does also produce trending graphs with summary and drill-down views across
-all specified tests that can be reviewed and inspected regularly by FD.io
-developers and users community.
-
-Trend Analysis
-``````````````
-
-All measured performance trend data is treated as time-series data that can be
-modelled using normal distribution. After trimming the outliers, the average and
-deviations from average are used for detecting performance change anomalies
-following the three-sigma rule of thumb (a.k.a. 68-95-99.7 rule).
-
-Analysis Metrics
-````````````````
-
-Following statistical metrics are proposed as performance trend indicators over
-the rolling window of last <N> sets of historical measurement data:
-
-    #. Quartiles Q1, Q2, Q3 – three points dividing a ranked set of data set
-       into four equal parts, Q2 is the median of the data.
-    #. Inter Quartile Range IQR=Q3-Q1 – measure of variability, used here to
-       eliminate outliers.
-    #. Outliers – extreme values that are at least 1.5*IQR below Q1, or at
-       least 1.5*IQR above Q3.
-    #. Trimmed Moving Average (TMA) – average across the data set of the rolling
-       window of <N> values without the outliers. Used here to calculate TMSD.
-    #. Trimmed Moving Standard Deviation (TMSD) – standard deviation over the
-       data set of the rolling window of <N> values without the outliers,
-       requires calculating TMA. Used here for anomaly detection.
-    #. Moving Median (MM) - median across the data set of the rolling window of
-       <N> values with all data points, including the outliers. Used here for
-       anomaly detection.
-
-Anomaly Detection
-`````````````````
-
-Based on the assumption that all performance measurements can be modelled using
-normal distribution, a three-sigma rule of thumb is proposed as the main
-criteria for anomaly detection.
-
-Three-sigma rule of thumb, aka 68–95–99.7 rule, is a shorthand used to capture
-the percentage of values that lie within a band around the average (mean) in a
-normal distribution within a width of two, four and six standard deviations.
-More accurately 68.27%, 95.45% and 99.73% of the result values should lie within
-one, two or three standard deviations of the mean, see figure below.
-
-To verify compliance of test result with value X against defined trend analysis
-metric and detect anomalies, three simple evaluation criteria are proposed:
-
-::
-
-    Test Result Evaluation      Reported Result     Reported Reason     Trending Graph Markers
-    ==========================================================================================
-          Normal                      Pass              Normal            Part of plot line
-          Regression                  Fail              Regression        Red circle
-          Progression                 Pass              Progression       Green circle
-
-Jenkins job cumulative results:
-
-    #. Pass - if all detection results are Pass or Warning.
-    #. Fail - if any detection result is Fail.
-
-Performance Trending (PT)
-`````````````````````````
-
-CSIT PT runs regular performance test jobs finding MRR, PDR and NDR per test
-cases. PT is designed as follows:
-
-    #. PT job triggers:
-
-        #. Periodic e.g. daily.
-        #. On-demand gerrit triggered.
-        #. Other periodic TBD.
-
-    #. Measurements and calculations per test case:
-
-        #. MRR Max Received Rate
-
-            #. Measured: Unlimited tolerance of packet loss.
-            #. Send packets at link rate, count total received packets, divide
-               by test trial period.
-
-        #. Optimized binary search bounds for PDR and NDR tests:
-
-            #. Calculated: High and low bounds for binary search based on MRR
-               and pre-defined Packet Loss Ratio (PLR).
-            #. HighBound=MRR, LowBound=to-be-determined.
-            #. PLR – acceptable loss ratio for PDR tests, currently set to 0.5%
-               for all performance tests.
-
-        #. PDR and NDR:
-
-            #. Run binary search within the calculated bounds, find PDR and NDR.
-            #. Measured: PDR Partial Drop Rate – limited non-zero tolerance of
-               packet loss.
-            #. Measured: NDR Non Drop Rate - zero packet loss.
-
-    #. Archive MRR, PDR and NDR per test case.
-    #. Archive counters collected at MRR, PDR and NDR.
-
-Performance Analysis (PA)
-`````````````````````````
-
-CSIT PA runs performance analysis, change detection and trending using specified
-trend analysis metrics over the rolling window of last <N> sets of historical
-measurement data. PA is defined as follows:
-
-    #. PA job triggers:
-
-        #. By PT job at its completion.
-        #. On-demand gerrit triggered.
-        #. Other periodic TBD.
-
-    #. Download and parse archived historical data and the new data:
-
-        #. New data from latest PT job is evaluated against the rolling window
-           of <N> sets of historical data.
-        #. Download RF output.xml files and compressed archived data.
-        #. Parse out the data filtering test cases listed in PA specification
-           (part of CSIT PAL specification file).
-
-    #. Calculate trend metrics for the rolling window of <N> sets of historical data:
-
-        #. Calculate quartiles Q1, Q2, Q3.
-        #. Trim outliers using IQR.
-        #. Calculate TMA and TMSD.
-        #. Calculate normal trending range per test case based on TMA and TMSD.
-
-    #. Evaluate new test data against trend metrics:
-
-        #. If within the range of (TMA +/- 3*TMSD) => Result = Pass,
-           Reason = Normal.
-        #. If below the range => Result = Fail, Reason = Regression.
-        #. If above the range => Result = Pass, Reason = Progression.
-
-    #. Generate and publish results
-
-        #. Relay evaluation result to job result.
-        #. Generate a new set of trend analysis summary graphs and drill-down
-           graphs.
-
-            #. Summary graphs to include measured values with Normal,
-               Progression and Regression markers. MM shown in the background if
-               possible.
-            #. Drill-down graphs to include MM, TMA and TMSD.
-
-        #. Publish trend analysis graphs in html format.
+VPP Performance Dashboard
+=========================
+
+Description
+-----------
+
+Performance dashboard tables provide the latest VPP throughput trend,
+trend compliance and detected anomalies, all on a per VPP test case
+basis.  Linked trendline graphs enable further drill-down into the
+trendline compliance, sequence and nature of anomalies, as well as
+pointers to performance test builds/logs and VPP (or DPDK) builds.
+Performance trending is currently based on the Maximum Receive Rate (MRR) tests.
+MRR tests measure the packet forwarding rate under the maximum load offered
+by the traffic generator over a set trial duration, regardless of packet
+loss. See the :ref:`trending_methodology` section for more detail, including
+trend and anomaly calculations.
+
+Data samples are generated by the CSIT VPP (and DPDK) performance trending jobs
+executed twice a day (target start: every 12 hrs, 02:00, 14:00 UTC). All
+trend and anomaly evaluation is based on an algorithm which divides test runs
+into groups according to the minimum description length principle.
+The trend value is the population average of the results within a group.
+
+Legend to the tables:
+
+    - **Test Case**: Name of the FD.io CSIT test case, naming convention
+      `here <https://wiki.fd.io/view/CSIT/csit-test-naming>`_.
+    - **Trend [Mpps]**: Last value of the performance trend.
+    - **Short-Term Change [%]**: Relative change of the last trend value
+      vs. last week's trend value.
+    - **Long-Term Change [%]**: Relative change of the last trend value vs.
+      the maximum of trend values over the last quarter, excluding the last week.
+    - **Regressions [#]**: Number of regressions detected.
+    - **Progressions [#]**: Number of progressions detected.
+
+Tested VPP worker-thread-core combinations (1t1c, 2t2c, 4t4c) are listed
+in separate tables in section 1.x, followed by the trending methodology in
+section 2 and trendline graphs in sections 3.x. Performance test data
+used for trendline graphs is provided in sections 4.x.
+
+VPP worker on 1t1c
+------------------
+
+.. include:: ../../../_build/_static/vpp/performance-trending-dashboard-1t1c.rst
+
+VPP worker on 2t2c
+------------------
+
+.. include:: ../../../_build/_static/vpp/performance-trending-dashboard-2t2c.rst
+
+VPP worker on 4t4c
+------------------
+
+.. include:: ../../../_build/_static/vpp/performance-trending-dashboard-4t4c.rst
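
Note on the added legend: the Short-Term and Long-Term Change columns can be sketched roughly as below. This is an illustration only, not the CSIT implementation. The function names, the window sizes (14 runs per week, derived from the two jobs per day stated in the patch, and 180 runs per quarter, assuming about 90 days) and the choice of the sample taken exactly one week earlier as the "last week" reference are all assumptions.

```python
from typing import Sequence, Tuple


def relative_change_pct(current: float, reference: float) -> float:
    """Relative change of `current` vs. `reference`, in percent."""
    return (current - reference) / reference * 100.0


def dashboard_changes(
    trend: Sequence[float],
    week_runs: int = 14,      # assumption: 2 trending runs per day * 7 days
    quarter_runs: int = 180,  # assumption: 2 trending runs per day * ~90 days
) -> Tuple[float, float]:
    """Return (short_term_change, long_term_change) in percent.

    `trend` holds per-run trend values in Mpps, oldest first.
    """
    if len(trend) <= week_runs:
        raise ValueError("need more than one week of trend values")

    last = trend[-1]

    # Short-term reference: the trend value from one week (week_runs runs) ago.
    week_ago = trend[-1 - week_runs]

    # Long-term reference: maximum over the last quarter, excluding the
    # most recent week of runs.
    long_window = trend[-quarter_runs:-week_runs]
    long_reference = max(long_window)

    return (
        relative_change_pct(last, week_ago),
        relative_change_pct(last, long_reference),
    )


if __name__ == "__main__":
    # Hypothetical trend values in Mpps, oldest first.
    sample = [12.0] + [10.0] * 10 + [11.0] * 14
    short_term, long_term = dashboard_changes(sample)
    print(f"short-term: {short_term:+.2f} %, long-term: {long_term:+.2f} %")
```

A negative long-term value alongside a positive short-term value, as in the sample data, would indicate a recent recovery that has not yet reached the earlier peak.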