.. _telemetry:

OpenMetrics
-----------

OpenMetrics specifies the de-facto standard for transmitting cloud-native
metrics at scale, with support for both text representation and Protocol
Buffers.

RFC
~~~

- RFC2119
- RFC5234
- RFC8174
- draft-richih-opsawg-openmetrics-00

Reference
~~~~~~~~~

`OpenMetrics <https://github.com/OpenObservability/OpenMetrics/blob/master/specification/OpenMetrics.md>`_

Metric Types
~~~~~~~~~~~~

- Gauge
- Counter
- StateSet
- Info
- Histogram
- GaugeHistogram
- Summary
- Unknown

The telemetry module in CSIT currently supports only Gauge, Counter and Info.

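For illustration, the three supported types can be produced with the
`prometheus_client` Python package; this is a sketch only, not CSIT code, and
all metric and label names except `calls_total` (which mirrors the example
file below) are hypothetical.

```python
# Sketch only: assumes the prometheus_client package; names other than
# calls_total (taken from the example file below) are hypothetical.
from prometheus_client import CollectorRegistry, Counter, Gauge, Info, generate_latest

registry = CollectorRegistry()

# Counter: monotonically increasing value, exposed as calls_total.
calls = Counter(
    "calls", "Number of calls total",
    ["name", "state", "thread_id", "thread_lcore", "thread_name"],
    registry=registry)
calls.labels(
    name="unix-epoll-input", state="polling", thread_id="0",
    thread_lcore="1", thread_name="vpp_main").inc(39584)

# Gauge: value that may go up or down.
vectors_per_call = Gauge(
    "vectors_per_call", "Average vectors per call", ["thread_name"],
    registry=registry)
vectors_per_call.labels(thread_name="vpp_wk_0").set(255.0)

# Info: constant key/value metadata attached to the target.
build = Info("build", "Build information", registry=registry)
build.info({"version": "x.y.z"})

# Render the registry in the text exposition format shown in the example file below.
print(generate_latest(registry).decode())
```
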
Example metric file
~~~~~~~~~~~~~~~~~~~

```
  # HELP calls_total Number of calls total
  # TYPE calls_total counter
  calls_total{name="api-rx-from-ring",state="active",thread_id="0",thread_lcore="1",thread_name="vpp_main"} 0.0
  calls_total{name="fib-walk",state="any wait",thread_id="0",thread_lcore="1",thread_name="vpp_main"} 0.0
  calls_total{name="ip6-mld-process",state="any wait",thread_id="0",thread_lcore="1",thread_name="vpp_main"} 0.0
  calls_total{name="ip6-ra-process",state="any wait",thread_id="0",thread_lcore="1",thread_name="vpp_main"} 0.0
  calls_total{name="unix-epoll-input",state="polling",thread_id="0",thread_lcore="1",thread_name="vpp_main"} 39584.0
  calls_total{name="avf-0/18/6/0-output",state="active",thread_id="1",thread_lcore="2",thread_name="vpp_wk_0"} 91.0
  calls_total{name="avf-0/18/6/0-tx",state="active",thread_id="1",thread_lcore="2",thread_name="vpp_wk_0"} 91.0
  calls_total{name="avf-input",state="polling",thread_id="1",thread_lcore="2",thread_name="vpp_wk_0"} 91.0
  calls_total{name="ethernet-input",state="active",thread_id="1",thread_lcore="2",thread_name="vpp_wk_0"} 91.0
  calls_total{name="ip4-input-no-checksum",state="active",thread_id="1",thread_lcore="2",thread_name="vpp_wk_0"} 91.0
  calls_total{name="ip4-lookup",state="active",thread_id="1",thread_lcore="2",thread_name="vpp_wk_0"} 91.0
  calls_total{name="ip4-rewrite",state="active",thread_id="1",thread_lcore="2",thread_name="vpp_wk_0"} 91.0
  calls_total{name="avf-0/18/2/0-output",state="active",thread_id="2",thread_lcore="0",thread_name="vpp_wk_1"} 91.0
  calls_total{name="avf-0/18/2/0-tx",state="active",thread_id="2",thread_lcore="0",thread_name="vpp_wk_1"} 91.0
  calls_total{name="avf-input",state="polling",thread_id="2",thread_lcore="0",thread_name="vpp_wk_1"} 91.0
  calls_total{name="ethernet-input",state="active",thread_id="2",thread_lcore="0",thread_name="vpp_wk_1"} 91.0
  calls_total{name="ip4-input-no-checksum",state="active",thread_id="2",thread_lcore="0",thread_name="vpp_wk_1"} 91.0
  calls_total{name="ip4-lookup",state="active",thread_id="2",thread_lcore="0",thread_name="vpp_wk_1"} 91.0
  calls_total{name="ip4-rewrite",state="active",thread_id="2",thread_lcore="0",thread_name="vpp_wk_1"} 91.0
  calls_total{name="unix-epoll-input",state="polling",thread_id="2",thread_lcore="0",thread_name="vpp_wk_1"} 1.0
```

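A file in this format can also be read back programmatically; below is a
minimal sketch, assuming the `prometheus_client` text parser is available on
the reading side (not part of the CSIT telemetry module itself).

```python
# Minimal sketch: parse an exported metric file and print per-thread call counts.
# The path matches the prom logging handler in the configuration later in this section.
from prometheus_client.parser import text_string_to_metric_families

with open("/tmp/metric.prom") as metric_file:
    text = metric_file.read()

for family in text_string_to_metric_families(text):
    for sample in family.samples:
        if sample.name != "calls_total":
            continue
        labels = sample.labels
        print(f'{labels["thread_name"]}/{labels["name"]}: {sample.value}')
```
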
Anatomy of existing CSIT telemetry implementation
-------------------------------------------------

The existing implementation consists of several measurement building blocks:
the main measuring block running search algorithms (MLR, PLR, SOAK, MRR, ...),
the latency measuring block, and several telemetry blocks running with or
without traffic in the background.

The main measuring block must not be interrupted by any read operation that
could impact data plane traffic processing during the throughput search
algorithm. Operational reads are therefore done before (pre-stat) and after
(post-stat) that block.

Some operational reads must be done while traffic is running; these usually
consist of two reads (pre-run-stat, post-run-stat) with a defined delay between
them.

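In pseudocode form, the MRR case described above and drawn below looks roughly
like this; the helper names are hypothetical and do not map one-to-one to CSIT
keywords.

```python
# Hypothetical sketch of the stat ordering for one MRR measurement;
# helper names are illustrative, not actual CSIT keywords.
import time

def mrr_with_stats(start_traffic, stop_traffic, measure, stats, delay=1.0):
    start_traffic(rate="mrr")
    stats.pre_run_stat()           # e.g. vpp-clear-runtime
    time.sleep(delay)              # defined delay between the two runtime reads
    stats.post_run_stat()          # e.g. vpp-show-runtime, bash-perf-stat
    stop_traffic()

    stats.pre_stat()               # e.g. vpp-clear-stats, vpp-enable-elog
    result = measure(rate="mrr")   # the measuring block runs uninterrupted
    stats.post_stat()              # e.g. vpp-show-stats, vpp-show-elog
    return result
```
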
MRR measurement
~~~~~~~~~~~~~~~

```
  traffic_start(r=mrr)               traffic_stop       |<     measure     >|
    |                                  |                |      (r=mrr)      |
    |   pre_run_stat   post_run_stat   |    pre_stat    |                   |  post_stat
    |        |               |         |       |        |                   |      |
  --o--------o---------------o---------o-------o--------+-------------------+------o------------>
                                                                                              t

Legend:
  - pre_run_stat
    - vpp-clear-runtime
  - post_run_stat
    - vpp-show-runtime
    - bash-perf-stat            // if extended_debug == True
  - pre_stat
    - vpp-clear-stats
    - vpp-enable-packettrace    // if extended_debug == True
    - vpp-enable-elog
  - post_stat
    - vpp-show-stats
    - vpp-show-packettrace      // if extended_debug == True
    - vpp-show-elog
```

```
    |<                                measure                                 >|
    |                                 (r=mrr)                                  |
    |                                                                          |
    |<    traffic_trial0    >|<    traffic_trial1    >|<    traffic_trialN    >|
    |    (i=0,t=duration)    |    (i=1,t=duration)    |    (i=N,t=duration)    |
    |                        |                        |                        |
  --o------------------------o------------------------o------------------------o--->
                                                                                 t
```

MLR measurement
~~~~~~~~~~~~~~~

```
    |<     measure     >|   traffic_start(r=pdr)               traffic_stop   traffic_start(r=ndr)               traffic_stop  |< [    latency    ] >|
    |      (r=mlr)      |    |                                  |              |                                  |            |     .9/.5/.1/.0     |
    |                   |    |   pre_run_stat   post_run_stat   |              |   pre_run_stat   post_run_stat   |            |                     |
    |                   |    |        |               |         |              |        |               |         |            |                     |
  --+-------------------+----o--------o---------------o---------o--------------o--------o---------------o---------o------------[---------------------]--->
                                                                                                                                                       t

Legend:
  - pre_run_stat
    - vpp-clear-runtime
  - post_run_stat
    - vpp-show-runtime
    - bash-perf-stat          // if extended_debug == True
  - pre_stat
    - vpp-clear-stats
    - vpp-enable-packettrace  // if extended_debug == True
    - vpp-enable-elog
  - post_stat
    - vpp-show-stats
    - vpp-show-packettrace    // if extended_debug == True
    - vpp-show-elog
```


Improving existing solution
---------------------------

Improving the existing CSIT telemetry implementation covers the following
areas:

- telemetry optimization
  - reducing SSH overhead
  - removing stats without added value
- telemetry scheduling
  - improve accuracy
  - improve configuration
- telemetry output
  - standardize output

The existing stats implementation was abstracted into pre-/post-run-stats
phases. The improvement merges the pre-/post- logic into a separate
stat-runtime block that is configurable and executed locally on the SUT.

This will increase precision, remove complexity, and move the implementation
into a separate module.

The OpenMetrics format for cloud-native metric capturing will be used to
ensure integration with the post-processing module.

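A minimal sketch of such a locally executed stat-runtime block is shown below;
the class and method names are hypothetical and do not represent the final
CSIT module API.

```python
# Hypothetical sketch of a stat-runtime block executed locally on the SUT;
# "programs" stands for configured collectors (e.g. the BPF bundle further below).
class StatRuntime:
    def __init__(self, programs, output="/tmp/metric.prom"):
        self.programs = programs    # objects exposing attach/render/detach
        self.output = output        # matches the prom logging handler in the config

    def __enter__(self):            # corresponds to stat_pre_trial
        for program in self.programs:
            program.attach()
        return self

    def __exit__(self, *exc_info):  # corresponds to stat_post_trial
        with open(self.output, "w") as out:
            for program in self.programs:
                out.write(program.render_openmetrics())
                program.detach()

# Intended use on the SUT, concurrently with a traffic trial:
# with StatRuntime(programs=[bundle_bpf]):
#     run_traffic_trial()
```
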
MRR measurement
~~~~~~~~~~~~~~~

```
    traffic_start(r=mrr)               traffic_stop                 |<     measure     >|
      |                                  |                          |      (r=mrr)      |
      |   |<      stat_runtime      >|   |          stat_pre_trial  |                   |  stat_post_trial
      |   |                          |   |             |            |                   |     |
  ----o---+--------------------------+---o-------------o------------+-------------------+-----o------------->
                                                                                                           t

Legend:
  - stat_runtime
    - vpp-runtime
  - stat_pre_trial
    - vpp-clear-stats
    - vpp-enable-packettrace  // if extended_debug == True
  - stat_post_trial
    - vpp-show-stats
    - vpp-show-packettrace    // if extended_debug == True
```

```
    |<                                measure                                 >|
    |                                 (r=mrr)                                  |
    |                                                                          |
    |<    traffic_trial0    >|<    traffic_trial1    >|<    traffic_trialN    >|
    |    (i=0,t=duration)    |    (i=1,t=duration)    |    (i=N,t=duration)    |
    |                        |                        |                        |
  --o------------------------o------------------------o------------------------o--->
                                                                                 t
```

```
    |<                              stat_runtime                              >|
    |                                                                          |
    |<       program0       >|<       program1       >|<       programN       >|
    |       (@=params)       |       (@=params)       |       (@=params)       |
    |                        |                        |                        |
  --o------------------------o------------------------o------------------------o--->
                                                                                 t
```


MLR measurement
~~~~~~~~~~~~~~~

```
    |<     measure     >|   traffic_start(r=pdr)               traffic_stop   traffic_start(r=ndr)               traffic_stop  |< [    latency    ] >|
    |      (r=mlr)      |     |                                  |              |                                  |           |     .9/.5/.1/.0     |
    |                   |     |   |<      stat_runtime      >|   |              |   |<      stat_runtime      >|   |           |                     |
    |                   |     |   |                          |   |              |   |                          |   |           |                     |
  --+-------------------+-----o---+--------------------------+---o--------------o---+--------------------------+---o-----------[---------------------]--->
                                                                                                                                                       t

Legend:
  - stat_runtime
    - vpp-runtime
  - stat_pre_trial
    - vpp-clear-stats
    - vpp-enable-packettrace  // if extended_debug == True
  - stat_post_trial
    - vpp-show-stats
    - vpp-show-packettrace    // if extended_debug == True
```


Tooling
-------

Prerequisites:

- bpfcc-tools
- python3-bpfcc
- libbpfcc
- libbpfcc-dev
- libclang1-9
- libllvm9

```bash
  $ sudo apt install bpfcc-tools python3-bpfcc libbpfcc libbpfcc-dev libclang1-9 libllvm9
```
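
A quick smoke test of the installed BCC toolchain could look as follows. This
is a sketch only: the C source is assumed to be the `code:` block from the
configuration below saved to a file, and the sampling period is an arbitrary
illustrative value.

```python
# Sketch: attach one of the handlers from the BPF bundle below to a hardware
# perf event and dump the resulting table; not part of the CSIT code base.
import time
from bcc import BPF, PerfType, PerfHWConfig

with open("bundle_bpf.c") as source:        # hypothetical file holding the C code below
    bpf = BPF(text=source.read())

bpf.attach_perf_event(ev_type=PerfType.HARDWARE,
                      ev_config=PerfHWConfig.CPU_CYCLES,
                      fn_name="on_cpu_cycle",
                      sample_period=1000000, cpu=-1)

time.sleep(1)                               # collect for about one scheduler duration

for key, value in bpf.get_table("cpu_cycle").items():
    print(f"cpu={key.cpu} pid={key.pid} comm={key.name.decode()} cycles={value.value}")
```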


Configuration
-------------

```yaml
  logging:
    version: 1
    formatters:
      console:
        format: '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
      prom:
        format: '%(message)s'
    handlers:
      console:
        class: logging.StreamHandler
        level: INFO
        formatter: console
        stream: ext://sys.stdout
      prom:
        class: logging.handlers.RotatingFileHandler
        level: INFO
        formatter: prom
        filename: /tmp/metric.prom
        mode: w
    loggers:
      prom:
        handlers: [prom]
        level: INFO
        propagate: False
    root:
      level: INFO
      handlers: [console]
  scheduler:
    duration: 1
  programs:
    - name: bundle_bpf
      metrics:
        counter:
          - name: cpu_cycle
            documentation: Cycles processed by CPUs
            namespace: bpf
            labelnames:
              - name
              - cpu
              - pid
          - name: cpu_instruction
            documentation: Instructions retired by CPUs
            namespace: bpf
            labelnames:
              - name
              - cpu
              - pid
          - name: llc_reference
            documentation: Last level cache operations by type
            namespace: bpf
            labelnames:
              - name
              - cpu
              - pid
          - name: llc_miss
            documentation: Last level cache operations by type
            namespace: bpf
            labelnames:
              - name
              - cpu
              - pid
      events:
        - type: 0x0 # HARDWARE
          name: 0x0 # PERF_COUNT_HW_CPU_CYCLES
          target: on_cpu_cycle
          table: cpu_cycle
        - type: 0x0 # HARDWARE
          name: 0x1 # PERF_COUNT_HW_INSTRUCTIONS
          target: on_cpu_instruction
          table: cpu_instruction
        - type: 0x0 # HARDWARE
          name: 0x2 # PERF_COUNT_HW_CACHE_REFERENCES
          target: on_cache_reference
          table: llc_reference
        - type: 0x0 # HARDWARE
          name: 0x3 # PERF_COUNT_HW_CACHE_MISSES
          target: on_cache_miss
          table: llc_miss
      code: |
        #include <linux/ptrace.h>
        #include <uapi/linux/bpf_perf_event.h>

        const int max_cpus = 256;

        struct key_t {
            int cpu;
            int pid;
            char name[TASK_COMM_LEN];
        };

        BPF_HASH(llc_miss, struct key_t);
        BPF_HASH(llc_reference, struct key_t);
        BPF_HASH(cpu_instruction, struct key_t);
        BPF_HASH(cpu_cycle, struct key_t);

        static inline __attribute__((always_inline)) void get_key(struct key_t* key) {
            key->cpu = bpf_get_smp_processor_id();
            key->pid = bpf_get_current_pid_tgid();
            bpf_get_current_comm(&(key->name), sizeof(key->name));
        }

        int on_cpu_cycle(struct bpf_perf_event_data *ctx) {
            struct key_t key = {};
            get_key(&key);

            cpu_cycle.increment(key, ctx->sample_period);
            return 0;
        }
        int on_cpu_instruction(struct bpf_perf_event_data *ctx) {
            struct key_t key = {};
            get_key(&key);

            cpu_instruction.increment(key, ctx->sample_period);
            return 0;
        }
        int on_cache_reference(struct bpf_perf_event_data *ctx) {
            struct key_t key = {};
            get_key(&key);

            llc_reference.increment(key, ctx->sample_period);
            return 0;
        }
        int on_cache_miss(struct bpf_perf_event_data *ctx) {
            struct key_t key = {};
            get_key(&key);

            llc_miss.increment(key, ctx->sample_period);
            return 0;
        }
```
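
One possible way to consume a configuration like the above is sketched below.
It assumes PyYAML and `prometheus_client`, and the configuration file name is
hypothetical; the real CSIT module may differ.

```python
# Sketch: load the configuration, build one Counter per configured metric and
# emit the rendered registry through the "prom" logger into /tmp/metric.prom.
import logging
import logging.config

import yaml
from prometheus_client import CollectorRegistry, Counter, generate_latest

with open("telemetry.yaml") as config_file:     # hypothetical config file name
    config = yaml.safe_load(config_file)

logging.config.dictConfig(config["logging"])
registry = CollectorRegistry()

counters = {}
for program in config["programs"]:
    for spec in program["metrics"]["counter"]:
        counters[spec["name"]] = Counter(
            spec["name"], spec["documentation"],
            labelnames=spec["labelnames"], namespace=spec["namespace"],
            registry=registry)

# ... here the BPF tables defined by program["events"] would be read every
# scheduler "duration" seconds and fed into the counters via .labels().inc() ...

logging.getLogger("prom").info(generate_latest(registry).decode())
```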

CSIT captured metrics
---------------------

SUT
~~~

Compute resource
________________

- BPF /process
  - BPF_HASH(llc_miss, struct key_t);
  - BPF_HASH(llc_reference, struct key_t);
  - BPF_HASH(cpu_instruction, struct key_t);
  - BPF_HASH(cpu_cycle, struct key_t);

Memory resource
_______________

- BPF /process
  - tbd

Network resource
________________

- BPF /process
  - tbd

DUT VPP metrics
~~~~~~~~~~~~~~~

Compute resource
________________

- runtime /node `show runtime`
  - calls
  - vectors
  - suspends
  - clocks
  - vectors_calls
- perfmon /bundle
  - inst-and-clock      node      intel-core          instructions/packet, cycles/packet and IPC
  - cache-hierarchy     node      intel-core          cache hits and misses
  - context-switches    thread    linux               per-thread context switches
  - branch-mispred      node      intel-core          Branches, branches taken and mis-predictions
  - page-faults         thread    linux               per-thread page faults
  - load-blocks         node      intel-core          load operations blocked due to various uarch reasons
  - power-licensing     node      intel-core          Thread power licensing
  - memory-bandwidth    system    intel-uncore        memory reads and writes per memory controller channel

Memory resource - tbd
_____________________

- memory /segment `show memory verbose api-segment stats-segment main-heap`
  - total
  - used
  - free
  - trimmable
  - free-chunks
  - free-fastbin-blks
  - max-total-allocated
- physmem `show physmem`
  - pages
  - subpage-size

Network resource
________________

- counters /node `show node counters`
  - count
  - severity
- hardware /interface `show hardware`
  - rx_stats
  - tx_stats
- packets /interface `show interface`
  - rx_packets
  - rx_bytes
  - rx_errors
  - tx_packets
  - tx_bytes
  - tx_errors
  - drops
  - punt
  - ip4
  - ip6
  - rx_no_buf
  - rx_miss


DUT DPDK metrics - tbd
~~~~~~~~~~~~~~~~~~~~~~

Compute resource
________________

- BPF /process
  - BPF_HASH(llc_miss, struct key_t);
  - BPF_HASH(llc_reference, struct key_t);
  - BPF_HASH(cpu_instruction, struct key_t);
  - BPF_HASH(cpu_cycle, struct key_t);

Memory resource
_______________

- BPF /process
  - tbd

Network resource
________________

- packets /interface
  - inPackets
  - outPackets
  - inBytes
  - outBytes
  - outErrorPackets
  - dropPackets