Commit graph

12 commits

Author SHA1 Message Date
hagen1778
383dce7201
alerts: simplify aggregation of alerting rules
This is follow-up after
75196d7234

It updates some of the alerting rules to remove unnecessary aggregations.
It keeps aggregations for expressions which are using multiple time series
filters to make sure their label will match.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 8fb68152e6)
2023-12-11 15:38:16 +01:00
hagen1778
d9118cdaab
deployment/alerts: update TooHighMemoryUsage annotation
The memory usage isn't measured on 5m interval anymore.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 4e0a779efe)
2023-10-25 14:39:48 +02:00
hagen1778
d349d6a9ce
deployment/alerts: make TooHighMemoryUsage more tolerable to spikes
Using `min_over_time` should reduce the amount of false positives when
component is running in near-the-threshold state. Now it should trigger
only if all collected samples were above the threshold on 10m interval.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 003ef3a518)
2023-10-25 14:39:48 +02:00
hagen1778
297f63a01e
alerting: account for vmauth component for alerts ServiceDown and TooManyRestarts
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-10-03 17:52:43 +02:00
hagen1778
f2b06484f2
alerts: move ConcurrentFlushesHitTheLimit alert to health alerts
The `ConcurrentFlushesHitTheLimit` could be related to components like
vminsert, vmstorage, vm-single-node and vmagent. Moving this alert
to the `health` section of alerts will be benefitial for all components
and will remove the duplicates from single/cluster alerts.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-08-11 04:39:28 -07:00
Haleygo
1794a97ebe
add vmalertmanager filter for health alerts (#4665) 2023-07-19 14:50:06 -07:00
Roman Khavronenko
3049754575
alerts: update TooHighMemoryUsage threshold (#4256)
It appears that 90% usage for anonymous mem usage
is already concerning. So we lowering the threshold to 80%.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-09 21:41:40 -07:00
Max Golionko
ef11e49d7b
add vmsingle filter for health alerts (#4238) 2023-05-08 22:00:43 -07:00
Max Golionko
0f14beff58
alerts: relax job filter to support job names created by VMOperator (#4203) 2023-05-08 15:52:31 -07:00
Roman Khavronenko
66c5ddf2ad
alerts: add TooManyTSIDMisses alerting rule (#3959)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3502#issuecomment-1358374954

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-17 16:14:50 -07:00
Zakhar Bessarab
434b00cee8
docker-compose: move TooManyLogs into vm-health alerts set (#3199) 2022-10-05 22:42:31 +03:00
Roman Khavronenko
f772ee8326
deployment/docker: move cluster compose env to master branch (#3130)
* deployment/docker: move cluster compose env to master branch

The change supposed to simplify the process of maintaining for
single/cluster docker-compose envs, alerts, dashboards. It also
supposes to reduce confusion for users when looking for cluster
related alerts/configs.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* deployment/docker: move cluster compose env to master branch

Review updates.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-21 12:03:10 +03:00