Commit graph

399 commits

Author SHA1 Message Date
hagen1778
978faf571d
app/all: follow-up after 84d710beab
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5548
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-16 23:43:40 +02:00
Roman Khavronenko
f0099c4db7
app/vmalert: sanitize label names before sending to Alertmanager (#5442)
Before, vmalert would send notifications with labels containing characters
  not supported by Alertmanager validator, resulting into validation errors
  like `msg="Failed to validate alerts" err="invalid label set: invalid name "foo.bar"`

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-08 18:12:42 +02:00
Aliaksandr Valialkin
7227dca1df
app/vmalert/datasource: test fix after 8a1b93a49d 2023-11-14 22:06:51 +01:00
Roman Khavronenko
ada9afc74c
vmalert: correctly add duplicated params to the query (#4955)
Fix the bug when Group's `params` fields with multiple values were
overriding each other instead of adding up.
The bug was introduced in this commit eccecdf177
 starting from v1.91.1 https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.91.1

 https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4908

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 6351d07da8)
2023-09-08 09:37:54 +02:00
Roman Khavronenko
9b83737a75
vmalert: check for negative offset for missed rounds (#4628)
It could happen for low evaluation intervals and irregular
delays during execution that evaluation time would get
a negative offset. This could result into cumulative
discrepancy between the actual time and evaluation time for rules.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-08-12 13:36:13 -07:00
Haleygo
22954607ba
vmalert: fix evalTS after modify group interval (#4629) 2023-08-12 13:34:33 -07:00
Roman Khavronenko
3d820c0da8
vmalert: do not return nil rules for /api/v1/rules (#4344)
The fix addresses a case when vmalert is configured with a group
which has `name`, but doesn't have `rules` configured. In this
case it still returns a `nil` instead of `[]` slice.

Fixing this via current commit.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4221

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 66ed6fe62f)
2023-06-05 11:45:52 +02:00
Roman Khavronenko
ea920edd32
vmalert: properly form assets address if httpPrefix set (#4351)
Properly form path to static assets in WEB UI
 if `http.pathPrefix` set.

 https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4349

Signed-off-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 51cea6cad4)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-06-05 11:45:51 +02:00
Roman Khavronenko
aeb386c98a
vmalert: fix nil map assignment (#4392)
* vmalert: fix nil map assignment

The storage instance with nil map params was created for remote-read purposes.
And before change 7a9ae9de0d this map was ignored in ApplyParams.
Now, it started to be used and vmalert panics in runtime.

The fix properly inits map for at `NewVMStorage` and verifies it is not nil
on assignment in `ApplyParams`.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: add to changelog

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: properly clone Storage params

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: properly clone Storage params

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: properly clone Storage params

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit de94812088)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-06-05 11:45:50 +02:00
Roman Khavronenko
1e8562fbd2
app/vmalert: follow-up after 7a9ae9de0d (#4381)
7a9ae9de0d

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit eccecdf177)
2023-06-05 11:45:50 +02:00
gsakun
283e2873ed
app/vmalert: fix datasource.roundDigits Parameter (#4341)
app/vmalert: fix querybuild clone and extraParams merge logic

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4340

(cherry picked from commit 20dc3db71e)
2023-06-05 11:45:50 +02:00
Roman Khavronenko
07e6b74dfe
vmalert: fix API to return non-nil values (#4222)
Properly return empty slices instead of nil for `/api/v1/rules` and `/api/v1/alerts` API handlers.
This improves compatibility with Grafana.

 https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4221

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-08 21:49:25 -07:00
Aliaksandr Valialkin
bc144e2b05
all: follow-up for 7a3e16e774
- Sync the description for -httpListenAddr.useProxyProtocol command-line flag at vmagent and vmauth,
  so it is consistent with the description at vmauth and victoria-metrics
- Add a sample of panic text to docs/CHANGELOG.md, so it could be googled
- Mention the -httpListenAddr.useProxyProtocol command-line flag in the description for the bugfix

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3335
2023-03-12 01:19:55 -08:00
Aliaksandr Valialkin
34379d4cf1
all: run apk update && apk upgrade in base Alpine Docker image in order to get all the recent security fixes 2023-02-09 14:03:02 -08:00
Roman Khavronenko
4e922eb93b
Vmalert fixes (#3788)
* vmalert: use group's ID in UI to avoid collisions

Identical group names are allowed. So we should used IDs
for various groupings and aggregations in UI.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: prevent disabling state updates tracking

The minimum number of update states to track is now set to 1.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: properly update `debug` and `update_entries_limit` params on hot-reload

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: display `debug` field for rule in UI

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: exclude `updates` field from json marhsaling

This field isn't correctly marshaled right now.
And implementing the correct marshaling for it doesn't
seem right, since json representation is mostly used
by systems like Grafana. And Grafana doesn't expect this
field to be present.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* fix test for disabled state

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* fix test for disabled state

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-02-08 08:45:25 -08:00
Roman Khavronenko
80bf0bcf8c
vmalert: update docs (#3770)
vmalert: update flags description

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-02-07 09:28:59 -08:00
Roman Khavronenko
96db7ac52c
vmalert: speed up state restore procedure on start (#3758)
* vmalert: speed up state restore procedure on start

Alerts state restore procedure has been changed to become asynchronous.
It doesn't block groups start anymore which significantly improves vmalert's startup time.
Instead, state restore is called by each group in their goroutines after the first rules
evaluation.

While previously state restore attempt was made for all loaded alerting rules,
now it is called only for alerts which became active after the first evaluation.
This reduces the amount of API calls to the configured remote read URL.

This also means that `remoteRead.ignoreRestoreErrors` command-line flag becomes deprecated now
and will have no effect if configured.

See relevant issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2608

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* make lint happy

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Apply suggestions from code review

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-02-03 19:46:41 -08:00
Roman Khavronenko
d93ac2b1ea
docs: mention -vmalert.proxyURL in vmalert docs (#3730)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-01-31 10:49:49 -08:00
Aliaksandr Valialkin
4cf4c307ea
docs: update command-line descriptions after 73256fe438 2023-01-27 00:01:14 -08:00
Nikolay
ebebaecd94
lib/netutil: init implimentation of proxy protocol (#3687)
* lib/netutil: init implimentation of proxy protocol
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3335

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-01-26 23:25:22 -08:00
Aliaksandr Valialkin
bd809db4d9
docs: update the list of command-line flags according to the latest changes 2023-01-25 09:22:23 -08:00
Aliaksandr Valialkin
ef7683f2e0
app/vmalert: use consistent randomizer in tests
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3683
2023-01-23 19:25:32 -08:00
Aliaksandr Valialkin
ac890b3081
docs: update -help outputs for vm* tools 2023-01-03 23:27:31 -08:00
Aliaksandr Valialkin
3369371636
app/{vmagent,vminsert}: add support for streaming aggregation
See https://docs.victoriametrics.com/stream-aggregation.html

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3460
2023-01-03 22:22:07 -08:00
Roman Khavronenko
dde750e7c1
vmalert: mention specifics of Alertmanager HA mode (#3573)
Stress the importance of specifying of all Alertmanager
URLs in vmalert's `-notifier.url` or `notifier.config`
if it runs in cluster mode.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3547

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-01-03 21:46:00 -08:00
Roman Khavronenko
5cf2998af8
vmalert: allow configuring the default number of stored rule's update states (#3556)
Allow configuring the default number of stored rule's update states in memory
 via global `-rule.updateEntriesLimit` command-line flag or per-rule via rule's
 `update_entries_limit` configuration param.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-29 10:41:51 -08:00
Artem Navoiev
393f4ab86f
update links to grafana dashboards (#3534)
docs: update links to grafana dashboards

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2022-12-28 11:22:02 -08:00
Zakhar Bessarab
decf46d72b
app/vmbackupmanager: add metrics for better observability (#488)
* app/vmbackupmanager: add metrics for better observability, include more information to `/api/v1/backups` API call response

* app/vmbackupmanager: drop old metrics before creating new ones

* app/vmbackupmanager: use `_total` postfix for counter metrics

* app/vmbackupmanager: remove `_total` postfix for gauge-like metrics

* app/vmbackupmanager: add `_last_run_failed` metrics for backups and retention

* app/vmbackupmanager: address review feedback

* app/vmbackupmanager: fix metric name

* app/vmbackupmanager: address review feedback, remove background updates of metrics, add restoring state of `_last_run_failed` metric from remote storage

* app/vmbackupmanager: improve performance for backup size calculation

* app/vmbackupmanager: refactor backup and retention runs to deduplicate each run logic

* {app/vmbackupmanager,lib/formatutil}: move HumanizeBytes into lib package

* app/vmbackupmanager: fix creating new metrics instead of reusing existing ones

* lit/formatutil: add comment to make linter happy

* app/vmbackupmanager: address review feedback
2022-12-20 14:18:43 -08:00
Aliaksandr Valialkin
2a229a319e
docs/vmalert.md: mention latency_offset query arg, which has been added in 86dae56bd0
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3481
2022-12-16 17:20:51 -08:00
Aliaksandr Valialkin
3a28a52667
lib/flagutil: support for TB and TiB suffixes for command-line flags, which accept byte sizes 2022-12-14 17:53:18 -08:00
Roman Khavronenko
a44af871d3
vmalert: support $for or .For template variables (#3474)
support `$for` or `.For` template variables  in alert's annotations.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3246

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-12 14:42:16 -08:00
Aliaksandr Valialkin
97b41e727c
lib/promscrape: implement target-level and metric-level relabel debugging
Target-level debugging is performed by clicking the 'debug' link at the corresponding target
on either http://vmagent:8429/targets page or on http://vmagent:8428/service-discovery page.

Metric-level debugging is perfromed at http://vmagent:8429/metric-relabel-debug page.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3407

See https://docs.victoriametrics.com/vmagent.html#relabel-debug
2022-12-10 02:25:56 -08:00
Aliaksandr Valialkin
8fce069c7b
app/vmalert: properly handle nil req passed to requestToCurl()
This fixes a panic in the TestAlertingRule_Exec_Negative test.
The panic has been introduced in the commit b97bd01605
2022-12-10 02:05:20 -08:00
Aliaksandr Valialkin
650d1d1ae5
app/vmalert: do not show system links at http://vmalert:8880/ page when it is requested via proxy
The system links are absolute, e.g. they start from `/`, so there are high chances
they won't work as expected when requested via proxy such as vmselect with -vmalert.proxyURL
command-line flag.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3424
2022-12-09 11:49:53 -08:00
Roman Khavronenko
385d082bca
vmalert: do not hold pointer to http.Request (#3467)
http.Request was used as a part of state struct
for generating the curl command when viewing the rule's
state changes.
It appears, that holding a referencing is far more expensive
than generating the curl command immediately.
On the test with 40k rules, this change reduces memory
and CPU usage by 50%.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-09 11:49:53 -08:00
Aliaksandr Valialkin
676de127aa
all: update Go builder from v1.19.3 to v1.19.4
See https://github.com/golang/go/issues?q=milestone%3AGo1.19.4+label%3ACherryPickApproved
2022-12-08 17:04:41 -08:00
Roman Khavronenko
5bbb88902e
vmalert: correctly return error for RW failures (#3452)
* vmalert: correctly return error for RW failures

By mistake, in 0989649ad0 the error
for remote write failures weren't return to user.
This change fixes it.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-06 16:31:12 -08:00
Roman Khavronenko
a922308438
vmalert: reduce allocations for Prometheus resp parse (#3435)
Method `metrics()` now pre-allocates slices for labels
and results from query responses. This reduces the number 
of allocations on the hot path for instant requests.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-05 00:18:11 -08:00
Roman Khavronenko
31ca22109e
vmalert: fix replay step param (#3428)
The recent change in modifying default value
of `datasource.queryStep` flag resulted in situation
where replay mode was always running queries with
step=`datasource.queryStep`. When it should always
use rule's evaluation interval.

The fix is related not to replay mode only, but
for all Range requests. Now step param is set
individually for each mode.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-02 19:09:30 -08:00
Zakhar Bessarab
59f889cd3f
app/vmalert: add remoteWrite.sendTimeout command-line flag to configure timeout for sending data to remoteWrite.url (#3423)
* app/vmalert: add `remoteWrite.sendTimeout` command-line flag to configure timeout for sending data to `remoteWrite.url`

* vmalert: remove WriteTimeout from clients Cfg
No need to have it as a part of configuration struct:
* the client isn't used by other packages;
* there are no internal tests to check the WriteTimeout.

* vmalert: remove DisablePathAppend from clients Cfg
No need to have it as a part of configuration struct:
* the client isn't used by other packages;
* there are no internal tests to check the DisablePathAppend.

Co-authored-by: hagen1778 <roman@victoriametrics.com>
2022-12-02 19:03:34 -08:00
Roman Khavronenko
435f6f3add
vmalert: properly pass headers during the restore procedure (#3420)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3418

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-02 18:53:44 -08:00
Aliaksandr Valialkin
be6da5053f
lib/promscrape: optimize service discovery speed
- Return meta-labels for the discovered targets via promutils.Labels
  instead of map[string]string. This improves the speed of generating
  meta-labels for discovered targets by up to 5x.

- Remove memory allocations in hot paths during ScrapeWork generation.
  The ScrapeWork contains scrape settings for a single discovered target.
  This improves the service discovery speed by up to 2x.
2022-11-29 21:26:23 -08:00
Aliaksandr Valialkin
2a107cc8a7
app/vmalert: substitute -datasource.disablePathAppend with -remoteRead.disablePathAppend in the description for -datasource.url command-line flag
This is a follow-up for 959f06d175
2022-11-29 21:11:18 -08:00
Max Golionko
d272a8270b
vmalert: flag reference update (#3415)
* flag reference update

there is no flag `-datasource.disablePathAppend` and datasource actually checking for `-remoteRead.disablePathAppend`

* update source for doc as well
2022-11-29 20:38:02 -08:00
Roman Khavronenko
0475f8a38e
vmalert: add default list of alerting rules (#3373)
The default list of alerting rules contains the basic
rules for checking vmalert's health state and is recommended
to use for monitoring vmalert deployments.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-21 16:09:47 +02:00
Aliaksandr Valialkin
6fe8eec745
all: add a link to https://docs.victoriametrics.com/enterprise.html into description for enterprise flags 2022-11-21 15:44:54 +02:00
Roman Khavronenko
8ee464b22b
bump go version to 1.19.3 (#3327)
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-09 11:56:38 +02:00
Aliaksandr Valialkin
7ae038766c
app/vmalert/templates: properly escape all the special chars in quotesEscape function
Previously the `quotesEscape` function was escaping only double quotes.
This wasn't enough, since the input string could contain other special chars,
which must be escaped when put inside JSON string. For example, carriage return and line feed chars (\n\r),
backslash char, etc. This led to the following issues, which were improperly fixed:

- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/890 - this issue
  was "fixed" by introducing the `crlfEscape` function, which led to unnecessary
  complications in user templates, while not fixing various corner cases
  such as backslash chars in the input string.
  See 1de15ad490

- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3139 - this issue
  was "fixed" by urlencoding the whole string passed to -external.alert.source
  command-line flag. This led to invalid urls, which couldn't be parsed by Grafana.
  See 00c838353d
  and 4bd0244599

This commit properly encodes the input string passed to `quotesEscape`, so it can be safely embedded inside JSON strings.

This commit deprecates crlfEscape template function and adds the following new template functions:

- strvalue and stripDomain - these functions are supported by Prometheus, so they were added
  for compatibility purposes.
- jsonEscape and htmlEscape for converting the input string to valid quoted JSON string
  and for html-escaping the input string, so it could be safely embedded as a plaintext
  into html.

This commit also documents all supported template functions at https://docs.victoriametrics.com/vmalert.html#template-functions
The deprecated crlfEscape function isn't documented on purpose, since its usefulness is negative in general case.
2022-10-28 00:08:50 +03:00
Aliaksandr Valialkin
8a6898b625
Revert "vmalert: escape query params if external alert source defined (#3267)"
This reverts commit 00c838353d.

Reason for revert: it incorrectly fixes the issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3139 .
Now `-external.alert.source=explore?orgId=1&left=...` is converted to the following invalid url, which cannot be handled by Grafana:

https://grafana.example.com/explore%3ForgId%3D1%26left%3D...

The next commit will contain the correct fix of the issue - the `quotesEscape` function must
properly escape the string, so it could be embedded into JSON string. This function must
properly escape \n\r chars too. In this case the `crlfEscape` function becomes unnecessary.
Actually, the next commit makes the `crlfEscape` function deprecated.
2022-10-28 00:08:50 +03:00
Dmytro Kozlov
3123059407
vmalert: escape query params if external alert source defined (#3267)
vmalert: escape query args if external alert source defined
2022-10-28 00:08:50 +03:00