OpenMetrics specifies the de-facto standard for transmitting cloud-native
metrics at scale, with support for both text representation and Protocol
- draft-richih-opsawg-openmetrics-00
`OpenMetrics <https://github.com/OpenObservability/OpenMetrics/blob/master/specification/OpenMetrics.md>`_
The telemetry module in CSIT currently supports only the Gauge, Counter and
Info metric types.
# HELP calls_total Number of calls total
# TYPE calls_total counter
calls_total{name="api-rx-from-ring",state="active",thread_id="0",thread_lcore="1",thread_name="vpp_main"} 0.0
calls_total{name="fib-walk",state="any wait",thread_id="0",thread_lcore="1",thread_name="vpp_main"} 0.0
calls_total{name="ip6-mld-process",state="any wait",thread_id="0",thread_lcore="1",thread_name="vpp_main"} 0.0
calls_total{name="ip6-ra-process",state="any wait",thread_id="0",thread_lcore="1",thread_name="vpp_main"} 0.0
calls_total{name="unix-epoll-input",state="polling",thread_id="0",thread_lcore="1",thread_name="vpp_main"} 39584.0
calls_total{name="avf-0/18/6/0-output",state="active",thread_id="1",thread_lcore="2",thread_name="vpp_wk_0"} 91.0
calls_total{name="avf-0/18/6/0-tx",state="active",thread_id="1",thread_lcore="2",thread_name="vpp_wk_0"} 91.0
calls_total{name="avf-input",state="polling",thread_id="1",thread_lcore="2",thread_name="vpp_wk_0"} 91.0
calls_total{name="ethernet-input",state="active",thread_id="1",thread_lcore="2",thread_name="vpp_wk_0"} 91.0
calls_total{name="ip4-input-no-checksum",state="active",thread_id="1",thread_lcore="2",thread_name="vpp_wk_0"} 91.0
calls_total{name="ip4-lookup",state="active",thread_id="1",thread_lcore="2",thread_name="vpp_wk_0"} 91.0
calls_total{name="ip4-rewrite",state="active",thread_id="1",thread_lcore="2",thread_name="vpp_wk_0"} 91.0
calls_total{name="avf-0/18/2/0-output",state="active",thread_id="2",thread_lcore="0",thread_name="vpp_wk_1"} 91.0
calls_total{name="avf-0/18/2/0-tx",state="active",thread_id="2",thread_lcore="0",thread_name="vpp_wk_1"} 91.0
calls_total{name="avf-input",state="polling",thread_id="2",thread_lcore="0",thread_name="vpp_wk_1"} 91.0
calls_total{name="ethernet-input",state="active",thread_id="2",thread_lcore="0",thread_name="vpp_wk_1"} 91.0
calls_total{name="ip4-input-no-checksum",state="active",thread_id="2",thread_lcore="0",thread_name="vpp_wk_1"} 91.0
calls_total{name="ip4-lookup",state="active",thread_id="2",thread_lcore="0",thread_name="vpp_wk_1"} 91.0
calls_total{name="ip4-rewrite",state="active",thread_id="2",thread_lcore="0",thread_name="vpp_wk_1"} 91.0
calls_total{name="unix-epoll-input",state="polling",thread_id="2",thread_lcore="0",thread_name="vpp_wk_1"} 1.0
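
Counter samples in the text format shown above can be read back with a few lines of Python. The following is only a minimal sketch and not the CSIT parser: the names ``parse_samples``, ``SAMPLE_RE``, ``LABEL_RE`` and ``EXAMPLE`` are hypothetical. It assumes label values contain no escaped quotes and that sample lines carry no trailing timestamp, both of which the OpenMetrics text format allows in general.

```python
import re

# Assumption: one sample per line, "<name>{<labels>} <value>", no timestamp.
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
# Assumption: label values contain no escaped double quotes.
LABEL_RE = re.compile(r'(?P<key>[a-zA-Z_][a-zA-Z0-9_]*)="(?P<val>[^"]*)"')


def parse_samples(text):
    """Return a list of (metric_name, labels_dict, float_value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and metadata lines such as "# HELP" and "# TYPE".
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            raise ValueError(f"unparsable sample line: {line!r}")
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append(
            (match.group("name"), labels, float(match.group("value")))
        )
    return samples


# Two samples copied from the example output above.
EXAMPLE = """\
# HELP calls_total Number of calls total
# TYPE calls_total counter
calls_total{name="fib-walk",state="any wait",thread_id="0",thread_lcore="1",thread_name="vpp_main"} 0.0
calls_total{name="unix-epoll-input",state="polling",thread_id="0",thread_lcore="1",thread_name="vpp_main"} 39584.0
"""

for name, labels, value in parse_samples(EXAMPLE):
    print(name, labels["name"], labels["state"], value)
```

Keeping the labels as a dictionary makes it straightforward to group samples, for example by ``thread_name``, in a post-processing step.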
Anatomy of existing CSIT telemetry implementation
-------------------------------------------------
The existing implementation consists of several measurement building blocks:
the main measuring block running the search algorithms (MLR, PLR, SOAK, MRR, ...),
the latency measuring block, and several telemetry blocks that run with or
without traffic in the background.
The main measuring block must not be interrupted by any read operation that can
impact data plane traffic processing while a throughput search algorithm runs.
Operational reads are therefore done before (pre-stat) and after (post-stat)
that block.
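
The pre-stat/post-stat pattern can be illustrated with a small context manager that reads counters on both sides of a block and stores the difference. This is a sketch under assumptions, not the CSIT implementation: ``stat_window``, ``counter_delta`` and the fake reader are hypothetical names, and a real reader would query the SUT instead of a local dictionary.

```python
from contextlib import contextmanager


def counter_delta(pre, post):
    """Return post - pre for every counter present in the post read."""
    return {key: post[key] - pre.get(key, 0.0) for key in post}


@contextmanager
def stat_window(read_stats, result):
    """Read stats before and after the wrapped block, never during it."""
    pre = read_stats()           # pre-stat read
    try:
        yield                    # the measuring block runs undisturbed here
    finally:
        post = read_stats()      # post-stat read
        result.update(counter_delta(pre, post))


# Hypothetical stand-in for counters that live on the SUT.
fake_counters = {"calls_total": 100.0}


def read_fake_stats():
    # Return a copy so later increments do not alter the pre-stat snapshot.
    return dict(fake_counters)


delta = {}
with stat_window(read_fake_stats, delta):
    fake_counters["calls_total"] += 91.0  # stands in for the measurement

print(delta)
```

Because both reads happen outside the wrapped block, nothing in this pattern touches the data plane while the search algorithm is running.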
Some operational reads must be done while traffic is running and usually
consist of two reads (pre-run-stat, post-run-stat) with a defined delay between
traffic_start(r=mrr) traffic_stop |< measure >|
| pre_run_stat post_run_stat | pre_stat | | post_stat
--o--------o---------------o---------o-------o--------+-------------------+------o------------>
- bash-perf-stat // if extended_debug == True
- vpp-enable-packettrace // if extended_debug == True
- vpp-show-packettrace // if extended_debug == True
|< traffic_trial0 >|< traffic_trial1 >|< traffic_trialN >|
| (i=0,t=duration) | (i=1,t=duration) | (i=N,t=duration) |
--o------------------------o------------------------o------------------------o--->
|< measure >| traffic_start(r=pdr) traffic_stop traffic_start(r=ndr) traffic_stop |< [ latency ] >|
| (r=mlr) | | | | | | .9/.5/.1/.0 |
| | | pre_run_stat post_run_stat | | pre_run_stat post_run_stat | | |
| | | | | | | | | | | |
--+-------------------+----o--------o---------------o---------o--------------o--------o---------------o---------o------------[---------------------]--->
- bash-perf-stat // if extended_debug == True
- vpp-enable-packettrace // if extended_debug == True
- vpp-show-packettrace // if extended_debug == True
Improving existing solution
---------------------------
Improvements to the existing CSIT telemetry implementation cover these areas:

- telemetry optimization
- reducing ssh overhead
- removing stats without added value
- telemetry scheduling
- improve configuration
The existing stats implementation was abstracted into pre-/post-run-stats
phases. The improvement merges the pre-/post- logic into a separate
stat-runtime block that is configurable and executed locally on the SUT.
This will increase precision, remove complexity and move implementation into
The OpenMetrics format for cloud-native metric capture will be used to ensure
integration with the post-processing module.
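
One way to picture a configurable stat-runtime block is a list of programs, each called with its own parameters and each returning OpenMetrics text lines. The sketch below is illustrative only: ``run_stat_runtime``, ``render_counter``, ``fake_vpp_runtime`` and the ``PROGRAMS`` layout are hypothetical and not taken from CSIT. The ``# TYPE`` line mirrors the example output earlier in this document, and the closing ``# EOF`` line follows the OpenMetrics text format.

```python
def escape_label_value(value):
    """Escape backslash, double quote and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_counter(name, help_text, samples):
    """Render one counter as text lines; samples are (labels, value) pairs."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        label_text = ",".join(
            f'{key}="{escape_label_value(val)}"' for key, val in labels.items()
        )
        lines.append(f"{name}{{{label_text}}} {float(value)}")
    return lines


def run_stat_runtime(programs):
    """Run every configured program in order and join the output."""
    lines = []
    for program in programs:
        lines.extend(program["func"](**program["params"]))
    lines.append("# EOF")  # terminator required by the OpenMetrics text format
    return "\n".join(lines) + "\n"


def fake_vpp_runtime(thread_name):
    # Hypothetical program: returns a fixed sample instead of reading VPP.
    samples = [
        ({"name": "ip4-lookup", "state": "active", "thread_name": thread_name}, 91),
    ]
    return render_counter("calls_total", "Number of calls total", samples)


# Assumed configuration layout: one entry per program with its parameters.
PROGRAMS = [{"func": fake_vpp_runtime, "params": {"thread_name": "vpp_wk_0"}}]

OUTPUT = run_stat_runtime(PROGRAMS)
print(OUTPUT, end="")
```

Keeping the program list as plain data means the set of reads can be changed through configuration without touching the runner itself.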
traffic_start(r=mrr) traffic_stop |< measure >|
| |< stat_runtime >| | stat_pre_trial | | stat_post_trial
----o---+--------------------------+---o-------------o------------+-------------------+-----o------------->
- vpp-enable-packettrace // if extended_debug == True
- vpp-show-packettrace // if extended_debug == True
|< traffic_trial0 >|< traffic_trial1 >|< traffic_trialN >|
| (i=0,t=duration) | (i=1,t=duration) | (i=N,t=duration) |
--o------------------------o------------------------o------------------------o--->
|< program0 >|< program1 >|< programN >|
| (@=params) | (@=params) | (@=params) |
--o------------------------o------------------------o------------------------o--->
|< measure >| traffic_start(r=pdr) traffic_stop traffic_start(r=ndr) traffic_stop |< [ latency ] >|
| (r=mlr) | | | | | | .9/.5/.1/.0 |
| | | |< stat_runtime >| | | |< stat_runtime >| | | |
| | | | | | | | | | | |
--+-------------------+-----o---+--------------------------+---o--------------o---+--------------------------+---o-----------[---------------------]--->
- vpp-enable-packettrace // if extended_debug == True
- vpp-show-packettrace // if extended_debug == True