github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-11-21 14:44:00 +00:00

Author	SHA1	Message	Date
Hui Wang	18afeff742	app/vmalert: fix flaky ut `TestRecordingRule_Exec` The order of stale metrics can't be controlled in recording rule, only use two time series then.	2024-11-14 15:30:39 +01:00
Hui Wang	b09272ccac	app/vmalert: improve performances when rules produce large volumes of results 1. Avoid storing the last evaluation results outside of rules, check for stale time series as soon as possible; 2. remove duplicated template `Clone()`. This pull request is primarily reducing memory usage when rules produce large volumes of results, as seen in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6894. The CPU time spent on garbage collection remains high and may be addressed in a separate PR.	2024-11-14 12:23:39 +01:00
Aliaksandr Valialkin	e5537bc64d	lib/logstorage: properly take into account the `end` query arg when calculating time range for _time:duration filters	2024-11-08 16:43:54 +01:00
Hui Wang	68bad22fd2	vmalert: integrate with victorialogs (#7255 ) address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6706. See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/vmalert-support-vlog-ds/docs/VictoriaLogs/vmalert.md. Related fix https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7254. Note: in this pull request, vmalert doesn't support [backfilling](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/vmalert-support-vlog-ds/docs/VictoriaLogs/vmalert.md#rules-backfilling) for rules with a customized time filter. It might be added in the future, see [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7289) for details. Feature can be tested with image `victoriametrics/vmalert:heads-vmalert-support-vlog-ds-0-g420629c-scratch`. --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-10-29 16:30:39 +01:00
Hui Wang	c4fe23794a	vmalert: fix blocking hot-reload process if the old rule group hasn't started yet (#7258 ) Group [sleeps](`daa7183749/app/vmalert/rule/group.go (L320)`) for a random duration before starting the evaluation, and during the sleep, `g.updateCh <- new` will be blocked since there is no `<-g.updateCh` waiting. --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-10-18 11:18:24 +02:00
Hui Wang	d6d02d7aeb	vmalert: fix variable `$activeAt` value when templating rule annotation in replay mode Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2024-09-20 11:07:40 +02:00
Dima Lazerka	8207879fa3	docs: fixes misspelled typos Also tried to make it catch "Authorisation" in the future, fixed a lot of other misspells along the way, but didn't make it catch "Authorisation" anyway. - Fix misspelled "Authorization" header name - Fix misspelled "organization" - Fix more misspells	2024-09-13 12:14:24 +02:00
dufucun	95bafc8caf	tests: fix slice init length (#6897 ) Signed-off-by: dufucun <dufuchun@sohu.com>	2024-08-30 10:55:25 +02:00
hagen1778	9726e6c1a2	app/vmalert: rm unnecessary err check The error check was needed before `a84491324d` It was kept by mistake and makes no sense to have rn. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-08-07 09:09:24 +02:00
Aliaksandr Valialkin	0078399788	app/vmalert: switch from table-driven tests to f-tests This makes test code more clear and reduces the number of code lines by 500. This also simplifies debugging tests. See https://itnext.io/f-tests-as-a-replacement-for-table-driven-tests-in-go-8814a8b19e9e While at it, consistently use t.Fatal* instead of t.Error* across tests, since t.Error* requires more boilerplate code, which can result in additional bugs inside tests. While t.Error* allows writing logging errors for the same, this doesn't simplify fixing broken tests most of the time. This is a follow-up for `a9525da8a4`	2024-07-12 22:41:11 +02:00
Aliaksandr Valialkin	3c02937a34	all: consistently use 'any' instead of 'interface{}' 'any' type is supported starting from Go1.18. Let's consistently use it instead of 'interface{}' type across the code base, since `any` is easier to read than 'interface{}'.	2024-07-10 00:20:37 +02:00
Roman Khavronenko	b0c1f3d819	app/vmalert/rule: reduce number of allocations for getStaleSeries fn (#6269 ) Allocations are reduced by re-using the byte buffer when converting labels to string keys. ``` name old allocs/op new allocs/op delta GetStaleSeries-10 703 ± 0% 203 ± 0% ~ (p=1.000 n=1+1) ``` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-05-14 14:43:39 +02:00
Hui Wang	e3c226cf92	docs: update vmalert and vmagent docs (#6207 ) * restore and actualize doc section explaining duplicated labels error * rm misleading comment about post-aggregation in stream aggregation	2024-04-30 10:27:06 +02:00
Hui Wang	a84491324d	vmalert: avoid blocking APIs when alerting rule uses template functio… (#6129 ) * vmalert: avoid blocking APIs when alerting rule uses template function `query` * app/vmalert: small refactoring * simplify labels and templates expanding * simplify `newAlert` interface * fix `TestGroupStart` which mistakenly skipped annotations and response labels check Signed-off-by: hagen1778 <roman@victoriametrics.com> * reduce alerts lock time when restore --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-04-19 09:16:26 +02:00
Roman Khavronenko	316b19a5d1	app/vmalert: make `TestGroupStart` more reliable (#6130 ) There was a sleep statement in the test, waiting for Group to perform a couple of evaluations. But it looks like it worked unreliably for some CI tests like the one below https://github.com/VictoriaMetrics/VictoriaMetrics/actions/runs/8718213844/job/23915007958?pr=6115 This commit replaces the sleep statement with a function that waits for a specific number of evaluations. It should make this test faster in the general case, and more reliable for slow environments.	2024-04-19 09:06:40 +02:00
Aliaksandr Valialkin	b4fac26360	all: replace old https://docs.victoriametrics.com/vmalert.html url with the new one - https://docs.victoriametrics.com/vmalert/	2024-04-18 01:44:12 +02:00
wanshuangcheng	83216e956c	chore: fix function names in comment (#6076 ) Signed-off-by: wanshuangcheng <wanshuangcheng@outlook.com>	2024-04-08 01:11:12 -07:00
Aliaksandr Valialkin	918cccaddf	all: fix golangci-lint(revive) warnings after `0c0ed61ce7` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6001	2024-04-02 23:16:29 +03:00
Hui Wang	d7224b2d1c	vmalert: fix sending alert messages (#6028 ) * vmalert: fix sending alert messages 1. fix `endsAt` field in messages that send to alertmanager, previously rule with small interval could never be triggered; 2. fix behavior of `-rule.resendDelay`, before it could prevent sending firing message when rule state is volatile. * docs: update changelog notes Signed-off-by: hagen1778 <roman@victoriametrics.com> --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-03-28 08:55:10 +01:00
Roman Khavronenko	24eb1ad0c8	vmalert: set `ActiveAt` to evaluation timestamp in `newAlert` fn (#5657 ) The change fixes flaky test `TestAlertingRule_Exec` which has dependency on the actual timestamps, which resulted into inaccurate test states: https://github.com/VictoriaMetrics/VictoriaMetrics/actions/runs/7608452967/job/20717699688 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-29 12:02:02 +01:00
Roman Khavronenko	b11f4ef5ea	app/vmalert: autogenerate `ALERTS_FOR_STATE` time series for alerting rules with `for: 0` (#5680 ) * app/vmalert: autogenerate `ALERTS_FOR_STATE` time series for alerting rules with `for: 0` Previously, `ALERTS_FOR_STATE` was generated only for alerts with `for > 0`. This behavior differs from Prometheus behavior - it generates ALERTS_FOR_STATE time series for alerting rules with `for: 0` as well. Such time series can be useful for tracking the moment when alerting rule became active. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5648 https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3056 Signed-off-by: hagen1778 <roman@victoriametrics.com> * app/vmalert: support ALERTS_FOR_STATE in `replay` mode Signed-off-by: hagen1778 <roman@victoriametrics.com> --------- Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-25 15:42:57 +01:00
Hui Wang	1f477aba41	vmalert: automatically add `exported_` prefix for original evaluation… (#5398 ) automatically add `exported_` prefix for original evaluation result label if it's conflicted with external or reserved one, previously it was overridden. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5161 Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2023-12-22 16:07:47 +01:00
Dmytro Kozlov	935bec447b	app/vmalert: replace error metrics for gauges with counter metrics (#5217 ) See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5160 Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2023-12-06 19:39:35 +01:00
Dmytro Kozlov	a28cc6ebec	app/vmalert: expose `/vmalert/api/v1/rule` and `/api/v1/rule` API which returns rule status in JSON format (#5397 ) * app/vmalert: expose `/vmalert/api/v1/rule` and `/api/v1/rule` API which returns rule status in JSON format * app/vmalert: hide updates if query param not set * app/vmalert: fix panic (recursion call) * app/vmalert: add needed group name and file name * app/vmalert: fix comment, update behavior * app/vmalert: fix description * app/vmalert: simplify API for /api/v1/rule Signed-off-by: hagen1778 <roman@victoriametrics.com> --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2023-12-04 18:40:33 +03:00
Roman Khavronenko	bffd30b57a	app/vmalert: update remote-write process (#5284 ) * app/vmalert: update remote-write process * automatically retry remote-write requests on closed connections. The change should reduce the amount of logs produced in environments with short-living connections or environments without support of keep-alive on network balancers. * increment `vmalert_remotewrite_errors_total` metric if all retries to send remote-write request failed. Before, this metric was incremented only if remote-write client's buffer is overloaded. * increment `vmalert_remotewrite_dropped_rows_total` and `vmalert_remotewrite_dropped_bytes_total` metrics if remote-write client's buffer is overloaded. Before, these metrics were incremented only after unsuccessful HTTP calls. Signed-off-by: hagen1778 <roman@victoriametrics.com> * Update docs/CHANGELOG.md --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Hui Wang <haley@victoriametrics.com>	2023-11-08 14:53:07 +08:00
Aliaksandr Valialkin	815fda8995	docs: update -help output after recent changes to VictoriaMetrics components	2023-11-02 20:27:10 +01:00
Roman Khavronenko	b5254199c6	app/vmalert: add label `file` pointing to the group's filename to metrics (#5281 ) The filename should help identifying alerting rules belonging to specific groups with identical names but different filenames. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5267 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-11-02 16:01:31 +01:00
hagen1778	6eb205f8b0	app/vmalert: verify alert name correctness in restore test Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-11-02 15:28:39 +01:00
Hui Wang	90d45574bf	vmalert: reduce restore query request for each alerting rule (#5265 ) reduce the number of queries for restoring alerts state on start-up. The change should speed up the restore process and reduce pressure on `remoteRead.url`.	2023-11-02 15:22:13 +01:00
Hui Wang	abcb21aa5e	vmalert: fix alert firing state in replay mode (#5192 ) fix possible missing firing states for alerting rules in replay mode Before, if one firing stage was longer than a single query request range, as with a rule with a big `for`, the alerting rule couldn't be detected as firing. Co-authored-by: hagen1778 <roman@victoriametrics.com>	2023-10-30 13:54:18 +01:00
hagen1778	3aec7eb44f	app/vmalert: remove unclear comment The timestamp alignment should be applied as a last step to keep the timestamp consistent. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-26 15:41:35 +02:00
Aliaksandr Valialkin	42dd71bb63	all: consistently use %w instead of %s when an error is passed to fmt.Errorf() This allows consistently using errors.Is() for verifying whether the given error wraps some other known error.	2023-10-25 21:24:03 +02:00
hagen1778	c07909a20b	app/vmalert: fix typo in tests Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-25 16:28:27 +02:00
hagen1778	eed0c3c6b0	app/vmalert: fix tests after `a216fe6728` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-25 16:25:26 +02:00
hagen1778	a216fe6728	app/vmalert: follow-up after `c9375cac5e` Descriptions were updated in an attempt to make them clearer for readers, re-phrasing and linking missing docs. `eval_delay` was added to tests to verify it can be unmarshalled. `eval_delay` is now applied before timestamp alignment to make it more predictable. Before, if delay < interval the timestamp wouldn't be aligned. `eval_delay` and `eval_offset` were added to API output. `PreviouslySentSeriesToRW` converted to private `previouslySentSeriesToRW`. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-25 13:07:13 +02:00
Hui Wang	c9375cac5e	vmalert: add `-rule.evalDelay` flag and `eval_delay` as group attribute (#5185 ) Also mark `-datasource.lookback` as will be deprecated, see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5155.	2023-10-25 11:54:18 +02:00
Haleygo	dc28196237	vmalert-tool: implement unittest (#4789 ) 1. split package rule under /app/vmalert, expose needed objects 2. add vmalert-tool with unittest subcmd https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2945	2023-10-13 13:54:33 +02:00

37 commits