Commit graph

65 commits

Author SHA1 Message Date
Haleygo
45c0e4bb31
vmalert: add eval_offset for group (#4693)
Adds `eval_offset` attribute for Groups. 
If specified, Group will be evaluated at the exact time offset on the range of [0...evaluationInterval]. 
The setting might be useful for cron-like rules which must be evaluated at specific moments of time. 

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3409

Signed-off-by: Haley Wang <pipilong.25@gmail.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2023-09-06 16:29:59 +02:00
Haleygo
ae0e4a8c90
vmalert: add keep_firing_for field for alerting rule (#4669)
vmalert: support `keep_firing_for` field for alerting rule

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4529

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2023-07-27 15:13:13 +02:00
Roman Khavronenko
51196739af
Vmalert UI updates (#4276)
* vmalert: expand rule groups on anchor click

before, anchor click was only updating the URL.
To expand the group, user had to click on rule's block.
Now, group will toggle automatically.

* vmalert: allow filtering group in web UI

The new filter allows to filter groups and rules within
groups by: errors only or noMatch only.

The filtering supposed to help navigating big numbers of groups/rules.
Filtering is reflected in URL, so can be shared as a link.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-10 14:38:13 +02:00
Roman Khavronenko
fa3a17938e
vmalert: follow-up after cae87da (#4269)
* vmalert: follow-up after cae87da

cae87da4bb
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: update struct comments

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: rm typo

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-08 13:31:54 +02:00
Haleygo
cae87da4bb
vmalert: support reading rule from http url (#4212)
vmalert: support reading rule's config from HTTP URL
2023-05-08 09:52:57 +02:00
Roman Khavronenko
29e059e49c
app/vmalert: follow-up after 6c322b4a00 (#4214)
6c322b4a00

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-04-27 13:02:21 +02:00
Haleygo
6c322b4a00
vmalert: allow configuring custom notifier headers per group (#4088)
vmalert: allow configuring custom notifier headers per group
2023-04-27 12:17:26 +02:00
Zakhar Bessarab
b21a55febf
app/vmalert: add support of recursive path globs for rules and templates (#4148)
Supports using `**` for `-rule` and `-rule.templates`: `dir/**/*.tpl` loads contents of dir and all subdirectories recursively.

See: #4041

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Artem Navoiev <tenmozes@gmail.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
2023-04-26 19:20:22 +02:00
Roman Khavronenko
3283f0dae4
vmalert: support logs suppressing during config reloads (#3973)
* vmalert: support logs suppressing during config reloads

The change is mostly required for ENT version of vmalert,
since it supports object-storage for config files.
Reading data from object storage could be time-consuming,
so vmalert emits logs to track the progress.

However, these logs are mostly needed on start or on
manual config reload. Printing these logs each time
`rule.configCheckInterval` is triggered would too verbose.
So the change allows to control logs emitting during
config reloads.

Now, logs are emitted during start up or when SIGHUP is receieved.
For periodicall config checks logs emitted by config pkg are suppressed.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: review fixes

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-20 16:08:30 +01:00
Roman Khavronenko
d66bae212b
app/vmalert: log number of configration files found for each specified -rule (#3936)
The change also introduces `List` method to `FS` interface.
The `List` method can be used for wildcard support in object storage FS.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
2023-03-09 14:46:19 +01:00
Roman Khavronenko
14c20c1843
vmalert: support object storage for rules (#519)
* vmalert: support object storage for rules

Support loading of alerting and recording rules from object
storages `gcs://`, `gs://`, `s3://`.

* review fixes
2023-02-09 18:50:48 -08:00
Roman Khavronenko
6588fcbfca
vmalert: allow configuring the default number of stored rule's update states (#3556)
Allow configuring the default number of stored rule's update states in memory
 via global `-rule.updateEntriesLimit` command-line flag or per-rule via rule's
 `update_entries_limit` configuration param.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-29 12:36:44 +01:00
Aliaksandr Valialkin
518c340ae3
lib/envtemplate: allow referring env vars from other env vars via %{ENV_VAR} syntax
This is a follow-up for 02096e06d0
2022-10-26 14:49:33 +03:00
Aliaksandr Valialkin
069401a304
all: log error when environment variables referred from -promscrape.config are missing
This should prevent from using incorrect config files
2022-10-18 10:47:16 +03:00
Roman Khavronenko
39f559d22b
vmalert: allow using extra labels in annotations (#3181)
According to Ruler specification, only labels returned within time series
should be available for use in annotations.

For long time, vmalert didn't respect this rule. And in PR
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2403
this was fixed for the sake of compatibility. However, this resulted
into users confusion, as they expected all configured and extra labels
to be available - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3013

This fix allows to use extra labels in Annotations. But in the case of conflicts
the original labels (extracted from time series) are preferred.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-29 18:22:50 +02:00
Roman Khavronenko
d61cce06fd
vmalert: prodvide more details on duplicates (#3136)
Now vmalert will print the following messages on dupliсates:
```
"recording rule \"record\"; expr: \"up == 1\"; labels: summary={{ value|query }}" is a duplicate within the group "test"
"alerting rule \"alert\"; expr: \"up == 1\"; labels: description={{ value|query }}" is a duplicate within the group "test"
```

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3127
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-20 12:52:46 +02:00
Roman Khavronenko
8441375da2
vmalert: add debug mode for alerting rules (#3055)
* vmalert: add `debug` mode for alerting rules

Debug information includes alerts state changes and requests
sent to the datasource. Debug can be enabled only on rule's
level. It might be useful for debugging unexpected
behaviour of alerting rule.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3025

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: review fixes

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Update app/vmalert/alerting.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>

* vmalert: go fmt

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-13 16:25:43 +03:00
Aliaksandr Valialkin
9f94c295ab
all: use os.{Read|Write}File instead of ioutil.{Read|Write}File
The ioutil.{Read|Write}File is deprecated since Go1.16 -
see https://tip.golang.org/doc/go1.16#ioutil

VictoriaMetrics needs at least Go1.18, so it is safe to remove ioutil usage
from source code.

This is a follow-up for 02ca2342ab
2022-08-21 23:52:35 +03:00
Aliaksandr Valialkin
2e0b6d680e
app/vmalert/config: add missing docs for ValidateTplFn 2022-07-25 09:23:13 +03:00
Roman Khavronenko
1aa5112771
vmalert: remove notifier dependency from config (#2906)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-22 13:50:41 +02:00
Roman Khavronenko
2914ce5ca5
vmalert: remove dependency on datasource pkg from config (#2905)
* vmalert: remove dependency on datasource pkg from config

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-22 10:44:55 +02:00
Roman Khavronenko
88edb3f6cf
vmalert: allow configuring custom headers per group (#2901)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2860

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-21 15:59:55 +02:00
Roman Khavronenko
f07bfcf0c9
vmalert: drop support of deprecated extra_filter_labels param (#2870)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-14 15:45:08 +03:00
Roman Khavronenko
48a60eb593
vmalert: followup for 76f05f8670 (#2706)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-06-09 08:58:25 +02:00
Howie
76f05f8670
feat: rule limit (#2676)
vmalert: support `limit` param in groups definition

`limit` param limits number of time series samples produced by a single rule
during execution.
On reaching the limit rule will return an err.

Signed-off-by: lihaowei <haoweili35@gmail.com>
2022-06-09 08:21:30 +02:00
Roman Khavronenko
34116882b4
vmalert: support scalar type in response (#2610)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2607

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-18 09:50:46 +02:00
Andrii Chubatiuk
a531a96193
added reusable templates support (#2532)
Signed-off-by: Andrii Chubatiuk <andrew.chubatiuk@gmail.com>
2022-05-14 11:38:44 +02:00
Aliaksandr Valialkin
ebaa1c7ad5
lib/promscrape: follow-up after baa1c24b36 2022-04-16 14:25:54 +03:00
Dmytro Kozlov
11ae1ae924
Added resendDelay for alerts (#2296)
* vmalert: add support of `resendDelay` flag for alerts

Co-authored-by: dmitryk-dk <dmitry.kozlov@brightlocal.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2022-03-16 15:26:33 +00:00
hagen1778
2efa46a11c vmalert: support $externalLabels and $externalURL in templates
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2193
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-02-15 17:33:52 +03:00
Roman Khavronenko
e3adcbec6e
lib/promscrape: support prometheus-like duration in scrape configs (#2169)
* lib/promscrape: support prometheus-like duration in scrape configs

The change allows to specify duration values like `1d`, `1w`
for fields `scrape_interval`, `scrape_timeout`, etc.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/817#issuecomment-1033384766
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/blockcache: make linter happy

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/promscrape: support prometheus-like duration in scrape configs

* add support for extra fields `scrape_align_interval` and `scrape_offset`;
* support Prometheus duration parsing for `__scrape_interval__`
and `__scrape_duration__` labels;

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* wip

* wip

* docs/CHANGELOG.md: document the feature

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-11 16:17:00 +02:00
Aliaksandr Valialkin
896fa9bb7c
app/vmalert/config: sort extra_filter labels before passing them to query args in order to get consistent order of query args across runs
This fixes TestGroupParams test - see https://github.com/VictoriaMetrics/VictoriaMetrics/runs/4432510244?check_suite_focus=true#step:5:288
2021-12-08 13:02:49 +02:00
Roman Khavronenko
0afd14a14a
vmalert: introduce additional HTTP URL params per-group configuration (#1892)
* vmalert: introduce additional HTTP URL params per-group configuration

The new group field `params` allows to configure custom HTTP URL params
per each group. These params will be applied to every request before
executing rule's expression. Hot config reload is also supported.

Field `extra_filter_labels` was deprecated in favour of `params` field.
vmalert will print deprecation log message if config file contains
the deprecated field.

`params` fields are supported by both Prometheus and Graphite datasource types.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: provide more examples for `params` field

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: set higher priority for `params` setting

If there would be a conflict between URL params set in `datasource.url` flag
and params in group definition the latter will have higher priority.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-12-02 14:45:08 +02:00
Aliaksandr Valialkin
bf814320b0
app/vmalert: remove rule.type config, since it doesnt play well with the upcoming default tenants for -clusterMode
It is better from the consistency point of view to set up rule types at group level where tenant config is set up.
2021-11-05 19:52:32 +02:00
Roman Khavronenko
3dbdf1632e
vmalert: allow groups with empty rules for compatibility reasons (#1742)
Prometheus allows to have groups with no rules, so we should support
it in vmalert as well for compatibility reasons.
It is also allowed to hot-reload empty groups by adding or removing rules.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-10-25 12:15:02 +03:00
Roman Khavronenko
0e35fc9538
app/vmalert: remove unnecessary omitempty tag for interval param (#1649)
`omitempty` tag resulted into skipping this param on marshaling,
which was used as a checksum for groups configuration. Since on
config reload checksums are compared before applying changes,
any change to `interval` only didn't trigger config reload.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1641
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-09-23 17:55:59 +03:00
Nikolay
7c70dcbe3b
adds external_labels per group for vmalert (#1485)
* adds external_label per group for vmalert
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1471
2021-08-31 14:52:34 +03:00
Aliaksandr Valialkin
bfba4c28a4 app/vmalert: accept Prometheus-like durations in interval config option inside group section 2021-07-12 12:35:17 +03:00
Roman Khavronenko
2a259ef5e7
vmalert: support rules backfilling (aka replay) (#1358)
* vmalert: support rules backfilling (aka `replay`)

vmalert can `replay` configured rules in the past
and backfill results via remote write protocol.
It supports MetricsQL/PromQL storage as data source,
and can backfill data to remote write compatible
storage.

Supports recording and alerting rules `replay`. See more
details in README.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/836

* vmalert: review fixes

* vmalert: readme fixes
2021-06-09 12:20:38 +03:00
Roman Khavronenko
84cc0513e1
vmalert: support extra_filter_labels setting per-group (#1319)
The new setting `extra_filter_labels` may be assigned to group.
If it is, then all rules within a group will automatically filter
for configured labels. The feature is well-described here
https://docs.victoriametrics.com#prometheus-querying-api-enhancements

New setting is compatible only with VM datasource.
2021-05-23 00:26:01 +03:00
Nikolay
d626c5c2a9
changes vmalert query function (#1307)
* changes vmalert query function
for prometheus rules compatibility its better to use labels as map.
it simplifies template evaluation and allow to ignore can't evaluate field error
because map will return default value.
fixes https://github.com/VictoriaMetrics/operator/issues/243
2021-05-21 13:55:43 +03:00
Nikolay
195341a7cf Graphite vmalert wip (#112)
* init implementation for graphite alerts

* adds graphite support for vmalert

* small fix

* changes vmalert graphite api with type

* updates tests

* small fix

* fixes graphite parse

* Fixes graphite from time
2021-02-01 15:05:32 +02:00
Roman Khavronenko
2e2e4f7e21
vmalert-989: return non-empty result in template func query stub to pass validation (#1002)
On templates validation stage vmalert does not acutally send queries, so for complex
chained expression validation may fail. To avoid this, we add a blank sample in response
so validation can pass successfully. Later, during the rule execution, stub will be replaced
with real `query` function.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/989
2021-01-10 02:56:11 +03:00
Roman Khavronenko
404cbd1522
vmalert-974: fix order for labels templating (#975)
The change fixes bug caused by 3adf8c5a6f.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/974
2020-12-19 14:10:59 +02:00
Roman Khavronenko
6247884057
vmalert: add function "query", "first" and "value" to alert templates functions (#960)
The commit adds a support for template function `query`,
`first` and `value`. The function `query` executes
a MetricsQL query for active alerts. In vmalert we
update templates on every evaluation for active alerts
to keep them up to date. With `query` func it may become
a perf issue since it will fire a query on every execution.
We should keep it in mind for now.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/539
2020-12-14 20:11:45 +02:00
Nikolay
0463cb5550
fixes checksum calculation (#928)
* fixes checksum calculation,
'for' rule param wasnt marshal properly during checksum calculation

* fixes error
2020-11-29 09:48:42 +02:00
kreedom
ef77120170
vmalert - add dryRun (#842)
vmalert: add `dryRun` flag for rules validation without running the service
2020-10-20 08:15:21 +01:00
Aliaksandr Valialkin
f4e8687c88 app/vmalert: accept days, weeks and years in for: part of config like Prometheus does
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/817
2020-10-08 20:13:15 +03:00
Aliaksandr Valialkin
2985077c35 all: consistently use "%w" formatting in fmt.Errorf for wrapped errors 2020-09-23 22:46:34 +03:00
Roman Khavronenko
4cdffb04a4
vmalert: update groups on config reload only if changes detected (#759)
On config reload event `vmalert` reloads configuration for every group. While
it works for simple configurations, the more complex and heavy installations may
suffer from frequent config reloads.
The change introduces the `checksum` field for every group and is set to md5 hash
of yaml configuration. The checksum will change if on any change to group
definition like rules order or annotation change. Comparing the `checksum` field
on config reload event helps to detect if group should be updated.
The groups update is now done concurrently, so reload duration will be limited by
the slowest group now.

Partially solves #691 by improving config reload speed.
2020-09-11 20:14:30 +01:00