deployment/alerts: add RemoteWriteDroppingData to vmalert rules (#7393)

### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
This commit is contained in:
Roman Khavronenko 2024-10-31 14:03:08 +01:00 committed by GitHub
parent bfb55d5f2f
commit 3f0e2ab3b2
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
2 changed files with 12 additions and 0 deletions

View file

@ -74,6 +74,17 @@ groups:
description: "vmalert instance {{ $labels.instance }} is failing to push metrics generated via alerting description: "vmalert instance {{ $labels.instance }} is failing to push metrics generated via alerting
or recording rules to the configured remote write URL. Check vmalert's logs for detailed error message." or recording rules to the configured remote write URL. Check vmalert's logs for detailed error message."
- alert: RemoteWriteDroppingData
expr: increase(vmalert_remotewrite_dropped_rows_total[5m]) > 0
for: 5m
labels:
severity: critical
annotations:
summary: "vmalert instance {{ $labels.instance }} is dropping data sent to remote write URL"
description: "vmalert instance {{ $labels.instance }} is failing to send results of alerting or recording rules
to the configured remote write URL. This may result into gaps in recording rules or alerts state.
Check vmalert's logs for detailed error message."
- alert: AlertmanagerErrors - alert: AlertmanagerErrors
expr: increase(vmalert_alerts_send_errors_total[5m]) > 0 expr: increase(vmalert_alerts_send_errors_total[5m]) > 0
for: 15m for: 15m

View file

@ -23,6 +23,7 @@ See also [LTS releases](https://docs.victoriametrics.com/lts-releases/).
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent/): support scraping from Kubernetes Native Sidecars. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7287). * FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent/): support scraping from Kubernetes Native Sidecars. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7287).
* FEATURE: [Single-node VictoriaMetrics](https://docs.victoriametrics.com/) and `vmstorage` in [VictoriaMetrics cluster](https://docs.victoriametrics.com/cluster-victoriametrics/): add a separate cache type for storing sparse entries when performing large index scans. This significantly reduces memory usage when applying [downsampling filters](https://docs.victoriametrics.com/#downsampling) and [retention filters](https://docs.victoriametrics.com/#retention-filters) during background merge. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7182) for the details. * FEATURE: [Single-node VictoriaMetrics](https://docs.victoriametrics.com/) and `vmstorage` in [VictoriaMetrics cluster](https://docs.victoriametrics.com/cluster-victoriametrics/): add a separate cache type for storing sparse entries when performing large index scans. This significantly reduces memory usage when applying [downsampling filters](https://docs.victoriametrics.com/#downsampling) and [retention filters](https://docs.victoriametrics.com/#retention-filters) during background merge. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7182) for the details.
* FEATURE: [dashboards](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/dashboards) for VM single-node, cluster, vmalert, vmagent, VictoriaLogs: add `Restarts` panel to show the events of process restarts. This panel should help correlate events of restart with unexpected behavior of processes. * FEATURE: [dashboards](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/dashboards) for VM single-node, cluster, vmalert, vmagent, VictoriaLogs: add `Restarts` panel to show the events of process restarts. This panel should help correlate events of restart with unexpected behavior of processes.
* FEATURE: [alerts](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/rules/alerts-vmalert.yml): add alerting rule `RemoteWriteDroppingData` to track number of dropped samples that weren't sent to remote write URL.
* BUGFIX: [dashboards](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/dashboards) for Single-node VictoriaMetrics, cluster: The free disk space calculation now will subtract the size of the `-storage.minFreeDiskSpaceBytes` flag to correctly display the remaining available space of Single-node VictoriaMetrics/vmstorage rather than the actual available disk space, as well as the full ETA. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7334) for the details. * BUGFIX: [dashboards](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/dashboards) for Single-node VictoriaMetrics, cluster: The free disk space calculation now will subtract the size of the `-storage.minFreeDiskSpaceBytes` flag to correctly display the remaining available space of Single-node VictoriaMetrics/vmstorage rather than the actual available disk space, as well as the full ETA. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7334) for the details.
* BUGFIX: [vmalert](https://docs.victoriametrics.com/vmalert): properly set `group_name` and `file` fields for recording rules in `/api/v1/rules`. * BUGFIX: [vmalert](https://docs.victoriametrics.com/vmalert): properly set `group_name` and `file` fields for recording rules in `/api/v1/rules`.