vmagent: updated dashboard and alert for stream aggregation (#6427)

### Describe Your Changes

Added streaming aggregation section to vmagent dashboards
Added alert for streaming aggregation and deduplication flush timeouts
Removed deprecated compose versions from compose files

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 2da45a8368)
Andrii Chubatiuk 2024-06-10 12:49:00 +03:00 committed by hagen1778
parent d6b56a1460
commit 6fd314d8ba
9 changed files with 1456 additions and 205 deletions

File diff suppressed because it is too large

File diff suppressed because it is too large

@@ -135,3 +135,23 @@ groups:
           summary: "Configuration reload failed for vmagent instance {{ $labels.instance }}"
           description: "Configuration hot-reload failed for vmagent on instance {{ $labels.instance }}.
             Check vmagent's logs for detailed error message."
+
+      - alert: StreamAggrFlushTimeout
+        expr: |
+          increase(vm_streamaggr_flush_timeouts_total[5m]) > 0
+        labels:
+          severity: warning
+        annotations:
+          summary: "Streaming aggregation at \"{{ $labels.job }}\" (instance {{ $labels.instance }}) can't be finished within the configured aggregation interval."
+          description: "Stream aggregation process can't keep up with the load and might produce incorrect aggregation results. Check logs for more details.
+            Possible solutions: increase aggregation interval; aggregate smaller number of series; reduce samples' ingestion rate to stream aggregation."
+
+      - alert: StreamAggrDedupFlushTimeout
+        expr: |
+          increase(vm_streamaggr_dedup_flush_timeouts_total[5m]) > 0
+        labels:
+          severity: warning
+        annotations:
+          summary: "Deduplication \"{{ $labels.job }}\" (instance {{ $labels.instance }}) can't be finished within configured deduplication interval."
+          description: "Deduplication process can't keep up with the load and might produce incorrect results. Check docs https://docs.victoriametrics.com/stream-aggregation/#deduplication and logs for more details.
+            Possible solutions: increase deduplication interval; deduplicate smaller number of series; reduce samples' ingestion rate."
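
Both new rules share one shape: fire when a timeout counter grew at all during the last 5 minutes. A minimal Python sketch of that condition follows. It is an illustration only, not VictoriaMetrics code: the names `counter_increase` and `should_fire` are hypothetical, and the arithmetic is a simplification that ignores the boundary handling and extrapolation a real `increase()` implementation may apply.

```python
from typing import Sequence, Tuple

# One scraped sample: (unix timestamp in seconds, counter value).
Sample = Tuple[float, float]


def counter_increase(samples: Sequence[Sample], now: float, window: float = 300.0) -> float:
    """Approximate how much a counter grew during the last `window` seconds.

    Simplified model (an assumption, not the real increase() semantics):
    sum the deltas between consecutive samples inside the window and treat
    a drop in value as a counter reset, counting the post-reset value.
    """
    in_window = [value for ts, value in samples if now - window <= ts <= now]
    total = 0.0
    for prev, cur in zip(in_window, in_window[1:]):
        total += (cur - prev) if cur >= prev else cur
    return total


def should_fire(samples: Sequence[Sample], now: float, window: float = 300.0) -> bool:
    """Mirror the rule shape `increase(counter[5m]) > 0`."""
    return counter_increase(samples, now, window) > 0


if __name__ == "__main__":
    quiet = [(0.0, 5.0), (60.0, 5.0), (120.0, 5.0)]
    noisy = [(0.0, 5.0), (60.0, 5.0), (120.0, 6.0)]
    print(should_fire(quiet, now=120.0))
    print(should_fire(noisy, now=120.0))
```

Under this model a single flush timeout keeps the alert active until that increment falls out of the 5-minute window, which is why the rules need no `for:` clause to catch short bursts.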

@@ -1,4 +1,3 @@
-version: '3.5'
 services:
   # Metrics collector.
   # It scrapes targets defined in --promscrape.config

@@ -1,4 +1,3 @@
-version: "3.5"
 services:
   # Grafana instance configured with VictoriaLogs as datasource
   grafana:

@@ -1,4 +1,3 @@
-version: "3.5"
 services:
   # Metrics collector.
   # It scrapes targets defined in --promscrape.config

@@ -1,4 +1,3 @@
-version: "3.5"
 services:
   grafana:
     container_name: grafana

@@ -1,4 +1,3 @@
-version: "3.5"
 services:
   grafana:
     container_name: grafana

@@ -30,6 +30,10 @@ See also [LTS releases](https://docs.victoriametrics.com/lts-releases/).
 ## tip
+* FEATURE: [alerts-vmagent](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/alerts-vmagent.yml): add new alerting rules `StreamAggrFlushTimeout` and `StreamAggrDedupFlushTimeout` to notify about issues during stream aggregation.
+* FEATURE: [dashboards/vmagent](https://grafana.com/grafana/dashboards/12683): add row `Streaming aggregation` with panels related to [streaming aggregation](https://docs.victoriametrics.com/stream-aggregation/) process.
 ## [v1.102.0-rc1](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.102.0-rc1)
 Released at 2024-06-07