Commit graph

193 commits

Author SHA1 Message Date
Roman Khavronenko
e92c987dc3
vmalert: check if remoteWrite is configured for replay mode (#1990)
* vmalert: check if remoteWrite is configured for replay mode

The purpose of `replay` mode is to backfill results of recording
or alerting rules. So `remoteWrite.url` should be required.
Otherwise, process can fail on attempt to send data.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Update app/vmalert/main.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-12-21 20:31:10 +02:00
Roman Khavronenko
1f0301c809
vmalert: always convert step value to seconds for better compatibility (#1955)
When using `vmalert` with older Prometheus versions, the passed
`step=2m` may be parsed by Prometheus with an err: "cannot parse \"2m0s\" to a valid duration".
In order to improve compatibility vmalert will always convert step duration to seconds.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1943
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-12-17 20:17:09 +02:00
Roman Khavronenko
e7d7f6c94e
vmalert: update the order of service labels attaching (#1922)
Service labels like `alertname` or `alertgroup` were attached
after template expanding for `labels` section. Because of this,
labels `alertname` or `alertgroup` weren't available for templating
in `labels` section of alert's definition.
This commit changes the order of labels attaching and adds a test
for verifying these labels availability.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1921
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-12-10 12:14:16 +02:00
Aliaksandr Valialkin
7d8c481960
app/vmalert/config: sort extra_filter labels before passing them to query args in order to get consistent order of query args across runs
This fixes TestGroupParams test - see https://github.com/VictoriaMetrics/VictoriaMetrics/runs/4432510244?check_suite_focus=true#step:5:288
2021-12-08 13:04:08 +02:00
Roman Khavronenko
582c063698
vmalert: introduce additional HTTP URL params per-group configuration (#1892)
* vmalert: introduce additional HTTP URL params per-group configuration

The new group field `params` allows to configure custom HTTP URL params
per each group. These params will be applied to every request before
executing rule's expression. Hot config reload is also supported.

Field `extra_filter_labels` was deprecated in favour of `params` field.
vmalert will print deprecation log message if config file contains
the deprecated field.

`params` fields are supported by both Prometheus and Graphite datasource types.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: provide more examples for `params` field

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: set higher priority for `params` setting

If there would be a conflict between URL params set in `datasource.url` flag
and params in group definition the latter will have higher priority.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-12-02 14:51:54 +02:00
Roman Khavronenko
8b67168609
ci: bump go version to 1.17 (#1895)
The bump was required for `vmalert` package.
`vmalert` docs now also contain an updated description.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-12-02 14:51:45 +02:00
Roman Khavronenko
fc4e59ec1d
vmalert: adjust topologies docs in README (#1893)
Commit changes images width and order in topologies section
for better readability.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-12-02 10:33:47 +02:00
Roman Khavronenko
6823818843
Vmalert docs upd (#1890)
* vmalert: add topology examples in docs

* vmalert: docs typo fix
2021-12-02 10:33:47 +02:00
Roman Khavronenko
83c46eaf1a
vmalert: continue to print errors for bad config during hot reload (#1871)
Previously, vmalert would print an err message and set vmalert_config_last_reload_successful=0
only once during a hot reload of a bad config. Such behaviour may result into non noticed
event of a bad config reload attempt

Now, it continues to print error messages and keep vmalert_config_last_reload_successful state
until successful attempt will be made or config state will be rolled back to prev state.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-11-30 01:24:20 +02:00
Roman Khavronenko
48fb972ddc
vmalert: make notifier.Addr optional (#1870)
For a long time notifier.Addr flag was required. The assumption was that vmalert will
be always used for alerting. However, practice shows that some users need only
recording rules. In this case, requirement of notifier.Addr is ambigious.

The change verifies if loaded config contains recording or alerting rules and
if there are corresponding flags set. This is true for initial config load
and hot reload.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-11-30 01:23:00 +02:00
Aliaksandr Valialkin
7b88daf2ec
docs/CHANGELOG.md: document 852a895b70 2021-11-30 01:22:59 +02:00
Aliaksandr Valialkin
aa01bb9349
app/vmalert/README.md: sync with docs/vmalert.md
This is a follow-up after d8c70903ec
2021-11-17 00:56:53 +02:00
Aliaksandr Valialkin
4fb19fe34b
all: consistently return application/json content-type without charset=utf-8
The `application/json` content-type has utf-8 encoding by default.
See https://stackoverflow.com/questions/9254891/what-does-content-type-application-json-charset-utf-8-really-mean

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/897
2021-11-09 18:07:22 +02:00
Aliaksandr Valialkin
bc357d7132
docs/vmalert.md: improve wording in Multitenancy chapter 2021-11-09 14:20:30 +02:00
Aliaksandr Valialkin
57aaf914ac
docs: mention that graphs on the official dashboards contain useful hints 2021-11-08 19:54:25 +02:00
Aliaksandr Valialkin
e58630401e
app/{vmalert,vmbackup}/README.md: sync with docs after the commit 47d1612bf8 2021-11-05 20:46:07 +02:00
Aliaksandr Valialkin
9aa40fc0c3
docs/vmalert.md: document the addition of -defaultTenant.prometheus and -defaultTenant.graphite command-line options to enterprise version of vmalert 2021-11-05 20:04:53 +02:00
Aliaksandr Valialkin
56a25146cb
app/vmalert/datasource: use plain string literals instead of constants
This removes the unneeded level of indirection and improves code readability.

The "prometheus" and "graphite" constants aren't going to change in the future, so there is no sense in hiding them behind constants.
2021-11-05 19:58:15 +02:00
Aliaksandr Valialkin
31ac507bee
app/vmalert: remove rule.type config, since it doesnt play well with the upcoming default tenants for -clusterMode
It is better from the consistency point of view to set up rule types at group level where tenant config is set up.
2021-11-05 19:52:41 +02:00
Aliaksandr Valialkin
847004fa77
app/{vminsert,vmagent}: hide passwords and auth tokens by default at /config page
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1764
2021-11-05 14:42:13 +02:00
Aliaksandr Valialkin
2ebee4e741
app/{vmalert,vmagent}: improve the distribution of scrape offsets among targets / rules
Previously only the lower part of 64-bit hash was used for calculating the offset.
This may give uneven distribution in some cases. So let's use all the available 64 bits from the hash
for calculating the offset.
2021-10-27 20:04:02 +03:00
Roman Khavronenko
5321127add
vmalert: allow groups with empty rules for compatibility reasons (#1742)
Prometheus allows to have groups with no rules, so we should support
it in vmalert as well for compatibility reasons.
It is also allowed to hot-reload empty groups by adding or removing rules.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-10-25 14:46:51 +03:00
Roman Khavronenko
e0f21d6000
vmalert: correctly calculate alert ID including extra labels (#1734)
Previously, ID for alert entity was generated without alertname or groupname.
This led to collision, when multiple alerting rules within the same group
producing same labelsets. E.g. expr: `sum(metric1) by (job) > 0` and
expr: `sum(metric2) by (job) > 0` could result into same labelset `job: "job"`.

The issue affects only UI and Web API parts of vmalert, because alert ID is used
only for displaying and finding active alerts. It does not affect state restore
procedure, since this label was added right before pushing to remote storage.

The change now adds all extra labels right after receiving response from the datasource.
And removes adding extra labels before pushing to remote storage.

Additionally, change introduces a new flag `Restored` which will be displayed in UI
for alerts which have been restored from remote storage on restart.
2021-10-22 12:31:35 +03:00
Aliaksandr Valialkin
5705f4b6d1
lib/httpserver: expose command-line flags at /flags page
This should simplify debugging.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1695
2021-10-20 00:46:54 +03:00
Roman Khavronenko
ca49853664
vmalert: make group.ID() thread-safe (#1726)
Commit fixes potential race condition when group update
and generating of ID() happens simultaneously.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-10-19 16:45:52 +03:00
Roman Khavronenko
627224d493
vmalert: properly init SIGHUP listener before starting group manager (#1725)
Regression was introduced during code refactoring. It potentially
could lead to situation when SIGHUP signals were ignored while
vmalert was still busy with initing group manager.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-10-19 16:36:08 +03:00
Roman Khavronenko
44a88b7458
vmalert: remove extra / from path in WEB interface (#1717)
The extra `/` may cause issues when additional path prefixes
are configured. Also, removing it makes it consistent
with the rest of declarations.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-10-18 15:19:01 +03:00
Aliaksandr Valialkin
9273b83cd0
app/vmalert: follow-up after 0e2486df56 2021-10-18 15:08:42 +03:00
Alexander Rickardsson
0e1dbcd039
vmalert: add disablePathAppend to remote read (#1712)
* vmalert: add disablePathAppend to remoteRead

* docs: add docs for remoteRead.disablePathAppend
2021-10-18 14:59:17 +03:00
Alexander Rickardsson
63571e1334
vmalert: Redact passwords from error messages (#1713) 2021-10-18 14:59:17 +03:00
Roman Khavronenko
f393145843
Adjust http.Transport.MaxIdleConns setting for vmauth/vmalert services (#1704)
* vmalert: adjust `http.Transport.MaxIdleConns` value accordingly to `http.Transport.MaxIdleConnsPerHost`

`http.Transport.MaxIdleConnsPerHost` setting is controlled by `datasource.maxIdleConnections` flag,
while `http.Transport.MaxIdleConns` is inherited from DefaultTransport and is equal to `100`.
The fix adjusts `http.Transport.MaxIdleConns` value if it is lower than `http.Transport.MaxIdleConnsPerHost`.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmauth: adjust `http.Transport.MaxIdleConns` value accordingly to `http.Transport.MaxIdleConnsPerHost`

`http.Transport.MaxIdleConnsPerHost` setting is controlled by `maxIdleConnsPerBackend` flag,
while `http.Transport.MaxIdleConns` is inherited from DefaultTransport and is equal to `100`.
The fix adjusts `http.Transport.MaxIdleConns` value if it is lower than `http.Transport.MaxIdleConnsPerHost`.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-10-13 19:23:52 +03:00
Roman Khavronenko
14995eece6
vmalert: add Source link to alerts UI (#1701)
The source link is controlled by `external.url` and `external.alert.source`
flags, in the same way as for alertmanager notifications.
The source link is added to Alerts list view, and specific Alert view.
2021-10-13 15:26:57 +03:00
Roman Khavronenko
1bdccfc2bf
app/vmalert: remove unnecessary omitempty tag for interval param (#1649)
`omitempty` tag resulted into skipping this param on marshaling,
which was used as a checksum for groups configuration. Since on
config reload checksums are compared before applying changes,
any change to `interval` only didn't trigger config reload.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1641
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-09-23 20:47:56 +03:00
Roman Khavronenko
aa052097cf
app/vmalert: support http.pathPrefix flag in UI (#1636)
The change makes UI to respect `http.pathPrefix` flag
for API or navigation items links.
2021-09-21 18:15:08 +03:00
Roman Khavronenko
a07fa92ef2 vmalert: add new metric vmalert_remotewrite_flush_duration_seconds (#1622) 2021-09-16 14:01:24 +03:00
Roman Khavronenko
286b79dc48 vmalert: create basic auth config only if args aren't empty (#1618)
* vmalert: create basic auth config only if args aren't empty

follow-up after 68721f6

* vmalert: make lint happy
2021-09-15 09:35:38 +03:00
Aliaksandr Valialkin
63fb5a5ca5 docs/vmalert.md: follow-up after 68721f6e7d
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1608
2021-09-14 14:48:45 +03:00
Roman Khavronenko
eea6317d40 vmalert: support bearer token for datasource, remotewrite and remoteread (#1614)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1608
2021-09-14 14:48:43 +03:00
Aliaksandr Valialkin
2932f495f3 docs/CHANGELOG.md: document 5494bc02a6 2021-09-13 17:14:45 +03:00
Roman Khavronenko
7df731ea91 vmalert: add flag to limit the max value for auto-resovle duration for alerts (#1609)
* vmalert: add flag to limit the max value for auto-resovle duration for alerts

The new flag `rule.maxResolveDuration` suppose to limit max value for
alert.End param, which is used by notifiers like Alertmanager for alerts auto resolve.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1586
2021-09-13 17:14:45 +03:00
Roman Khavronenko
9dd121db36 vmalert: display extra filter labels in UI (#1613) 2021-09-13 14:17:51 +03:00
Aliaksandr Valialkin
e1632dc0df docs/vmalert.md: typo fix in Multitenancy chapter 2021-09-10 17:57:43 +03:00
Aliaksandr Valialkin
d72593cca2 app/vmalert: document GroupAlerts
This makes golint happy
2021-09-07 22:50:48 +03:00
Aliaksandr Valialkin
608ed19655 app/vmalert: follow-up after 21f022e5f0 2021-09-07 22:45:47 +03:00
Roman Khavronenko
bb26aa9b1f vmalert: add initial UI implementation (#1602)
New UI pages:
/ - welcome page with API handlers list;
/groups - list of all rules per group;
/alerts - list of all active alerts;
/groupID/alertID/status - status of the active alert;
2021-09-07 22:45:45 +03:00
Roman Khavronenko
1cf4f5a715 Vmalert extra params (#1587)
* vmalert: allow extra GET params in datasource package

ExtraParams will be added as GET params to every HTTP request made by datasource.
The `roundDigits` param, for example, was substituted by corresponding extra param.

* vmalert: add nocache=1 param for replay process

The `nocache=1` param is VictoriaMetrics specific parameter which prevents it
from caching and boundaries aligning for queries. We set it to avoid cache
pollution in `replay` mode and also to avoid unnecessary time range boundaries
alignment.

* vmalert: mention nocache=1 in replay description

* vmalert: fix bug with unused param
2021-09-01 12:20:01 +03:00
Nikolay
210c76b51e adds external_labels per group for vmalert (#1485)
* adds external_label per group for vmalert
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1471
2021-09-01 12:19:49 +03:00
Roman Khavronenko
1cb7037fc8 Vmalert metrics update (#1580)
* vmalert: remove `vmalert_execution_duration_seconds` metric

The summary for `vmalert_execution_duration_seconds` metric gives no additional
value comparing to `vmalert_iteration_duration_seconds` metric.

* vmalert: update config reload success metric properly

Previously, if there was unsuccessfull attempt to reload config and then
rollback to previous version - the metric remained set to 0.

* vmalert: add Grafana dashboard to overview application metrics

* docker: include vmalert target into list for scraping

* vmalert: extend notifier metrics with addr label

The change adds an `addr` label to metrics for alerts_sent and alerts_send_errors
to identify which exact address is having issues.
The according change was made to vmalert dashboard.

* vmalert: update documentation and docker environment for vmalert's dashboard

Mention Grafana's dashboard in vmalert's README in a new section #Monitoring.

Update docker-compose env to automatically add vmalert's dashboard.
Update docker-compose README with additional info about services.
2021-09-01 12:19:34 +03:00
Aliaksandr Valialkin
ff4c7c1a3d docs/vmalert.md: run make docs-sync after 9ee3d0378f 2021-08-21 20:25:26 +03:00
Roman Khavronenko
0c2284b95f vmalert: add flag disableAlertgroupLabel for disabling extra label added to series (#1534)
The new label added in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/611
may negatively impact deduplication in Alertmanager. The new flag supposed to give
an option to disable adding this label.

To enable flag just add `-disableAlertgroupLabel` to binary execution command.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1532
2021-08-21 20:23:22 +03:00