github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-11-21 14:44:00 +00:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	0078399788	app/vmalert: switch from table-driven tests to f-tests This makes test code more clear and reduces the number of code lines by 500. This also simplifies debugging tests. See https://itnext.io/f-tests-as-a-replacement-for-table-driven-tests-in-go-8814a8b19e9e While at it, consistently use t.Fatal* instead of t.Error* across tests, since t.Error* requires more boilerplate code, which can result in additional bugs inside tests. While t.Error* allows writing logging errors for the same, this doesn't simplify fixing broken tests most of the time. This is a follow-up for `a9525da8a4`	2024-07-12 22:41:11 +02:00
Zhu Jiekun	cadf1eb5ab	vmalert: [bug] fixed System hyperlink 404 redirect (#6620) ### Describe Your Changes As mentioned in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6603, some hyperlinks under the `vmalert` -> `System` section are not working as expected. Pages and redirection: - For page `http://127.0.0.1:8880/`: `flags` button will redirect to `http://127.0.0.1:8880/flags` - For page `http://127.0.0.1:8880/vmalert`: `http://127.0.0.1:8880/flags` - For page `http://127.0.0.1:8880/vmalert/`: `http://127.0.0.1:8880/vmalert/flags` (page does not exist) - Similar redirection can be observed with `-http.pathPrefix` Two potential ways to avoid the 404 redirection: 1. avoid visiting `/vmalert/` (I'm trying to do this). 2. provide support for `/vmalert/flags`. `/vmalert/` can be visited only when the user clicks another navigation item (e.g. Group) and then clicks vmalert again: ![Peek 2024-07-10 10-07](https://github.com/VictoriaMetrics/VictoriaMetrics/assets/30280396/13d7b147-a1b6-4e93-9ee0-26f881a16bef) Because: `http://127.0.0.1:8880/vmalert/groups?search=` + `<a class="nav-link" href=".">` = `http://127.0.0.1:8880/vmalert/` So I'm trying to change the `href="."` to `href="../vmalert"`. ### Checklist The following checks are mandatory: - [X] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).	2024-07-11 11:43:00 +02:00
Aliaksandr Valialkin	3c02937a34	all: consistently use 'any' instead of 'interface{}' 'any' type is supported starting from Go1.18. Let's consistently use it instead of 'interface{}' type across the code base, since `any` is easier to read than 'interface{}'.	2024-07-10 00:20:37 +02:00
Hui Wang	3169524fb7	vmalert: allow omitting `-replay.timeTo` in replay mode, default value is the current timestamp (#6575) Addresses https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6492 --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-07-05 09:27:34 +02:00
Roman Khavronenko	c429bbf889	app/vmalert: add examples for `source` override (#6561 ) The change adds a new docs section with examples on how source can be overridden. It should address questions like https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6536 While there, fix the example in `external.alert.source` cmd-line flag and docker-compose examples. ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-07-05 08:47:59 +02:00
Andrii Chubatiuk	6b128da811	deployment: build image for vmagent streamaggr benchmark (#6515 ) ### Describe Your Changes optionally build vmagent image for benchmark needed for https://github.com/VictoriaMetrics/ops/pull/1297 ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).	2024-06-24 16:28:50 +02:00
hagen1778	279815818c	app/vmalert: fix typo in replay error handling Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-06-20 15:15:34 +02:00
hagen1778	4ef76eed7b	app/vmalert: follow-up `bc37b279aa` * rm extra interface method for rw Client, as it has low applicability and doesn't fit multitenancy well * add `GetDroppedRows` method instead Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-06-20 15:12:53 +02:00
Hui Wang	bc37b279aa	vmalert: exit replay mode with non-zero code if generated samples are not successfully written into remoteWrite url (#6513) Addresses https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6512	2024-06-20 13:20:40 +02:00
Hui Wang	3b8970802e	vmalert-tool: support file path with hierarchical patterns and regexps, and http url in unittest cmd-line flag `-files` (#6501)	2024-06-18 14:14:30 +02:00
jackyin	5223981fed	app/vmalert: fix VMAlert oauth2 error (#6478 ) Properly set ClientSecret param for notifier. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6471 --------- Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-06-14 15:06:14 +02:00
Andrii Chubatiuk	eea361defb	app/vmalert: fixed path prefixes for system routes (#6435 ) Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6433 --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2024-06-14 13:34:23 +02:00
Hui Wang	61dce6f2a1	lib/httpserver: allow reloadAuthKey and configAuthKey to override httpAuth.* (#6338) Addresses https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6329: makes `reloadAuthKey`, `configAuthKey`, `flagsAuthKey`, `pprofAuthKey` behave the same way, but keys like `-snapshotAuthKey`, `-forceMergeAuthKey` are still protected by httpAuth.*. All the available keys are listed in https://docs.victoriametrics.com/single-server-victoriametrics/#security. --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-06-10 12:09:47 +02:00
hagen1778	a5f81f67fd	app/vmalert: rm extra response for unsupported path Unsupported path is already handled by `lib/httpserver`. This prevents from misleading errors in logs caused by double-writing response headers. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-06-03 12:52:02 +02:00
hagen1778	6d8e02f278	chore: follow-up after `c740a8042e` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-06-03 10:26:57 +02:00
Nikolay	b97916276f	app/vmalert: adds idleConnTimeout flags and retry trivial network errors (#6382) * ".idleConnTimeout" flags must reduce probability of `write: broken pipe` and `read: connection reset by peer` errors. Those errors may occur if the remote server closes the TCP socket for a connection while it still exists at the client. Single-time retries for `write: broken pipe` and `read: connection reset by peer` must handle the case of incorrectly configured timeouts at middleware proxies and mitigate minor network issues. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5661 --------- Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2024-05-30 17:54:42 +02:00
Hui Wang	d7b5062917	app/vmalert: support DNS SRV record in `-remoteWrite.url` (#6299 ) part of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6053, supports [DNS SRV](https://en.wikipedia.org/wiki/SRV_record) address in `-remoteWrite.url` command-line option.	2024-05-22 10:52:51 +02:00
Roman Khavronenko	4f0525852f	app/vmalert/datasource: reduce number of allocations when parsing instant responses (#6272 ) Allocations are reduced by implementing custom json parser via fastjson lib. The change also re-uses `promInstant` object in attempt to reduce number of allocations when parsing big responses, as usually happens with heavy recording rules. ``` name old allocs/op new allocs/op delta ParsePrometheusResponse/Instant-10 9.65k ± 0% 5.60k ± 0% ~ (p=1.000 n=1+1) ``` Signed-off-by: hagen1778 <roman@victoriametrics.com> --------- Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-05-15 15:18:33 +02:00
Roman Khavronenko	b0c1f3d819	app/vmalert/rule: reduce number of allocations for getStaleSeries fn (#6269 ) Allocations are reduced by re-using the byte buffer when converting labels to string keys. ``` name old allocs/op new allocs/op delta GetStaleSeries-10 703 ± 0% 203 ± 0% ~ (p=1.000 n=1+1) ``` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-05-14 14:43:39 +02:00
qiangxuhui	80f3644ee3	Add build support for loong64 (#6222 ) ### Describe Your Changes Added makefile rule for `GOARCH=loong64` to support building all VictoriaMetrics components on the `loongarch64` platform. ### Checklist The following checks are mandatory: * [X] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). Signed-off-by: qiangxuhui <qiangxuhui@loongson.cn>	2024-05-09 14:22:03 +02:00
Hui Wang	e3c226cf92	docs: update vmalert and vmagent docs (#6207 ) * restore and actualize doc section explaining duplicated labels error * rm misleading comment about post-aggregation in stream aggregation	2024-04-30 10:27:06 +02:00
Roman Khavronenko	5f487c7090	app/vmalert: fix links with anchors in vmalert's UI (#6146 ) Starting from v1.99.0 vmalert could ignore anchors pointing to specific rule groups if `search` param was present in URL. This change makes anchors compatible with `search` param in UI. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-04-22 15:02:10 +02:00
Hui Wang	a84491324d	vmalert: avoid blocking APIs when alerting rule uses template functio… (#6129 ) * vmalert: avoid blocking APIs when alerting rule uses template function `query` * app/vmalert: small refactoring * simplify labels and templates expanding * simplify `newAlert` interface * fix `TestGroupStart` which mistakenly skipped annotations and response labels check Signed-off-by: hagen1778 <roman@victoriametrics.com> * reduce alerts lock time when restore --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-04-19 09:16:26 +02:00
Roman Khavronenko	316b19a5d1	app/vmalert: make `TestGroupStart` more reliable (#6130) There was a sleep statement in the test, waiting for Group to perform a couple of evaluations. But it looks like it worked unreliably for some CI tests like the one below https://github.com/VictoriaMetrics/VictoriaMetrics/actions/runs/8718213844/job/23915007958?pr=6115 This commit replaces the sleep statement with a function that waits for a specific number of evaluations. It should make this test faster in the general case, and more reliable for slow environments.	2024-04-19 09:06:40 +02:00
Aliaksandr Valialkin	b4fac26360	all: replace old https://docs.victoriametrics.com/vmalert.html url with the new one - https://docs.victoriametrics.com/vmalert/	2024-04-18 01:44:12 +02:00
Alexander Marshalov	7308bad777	vmalert: support any status code from the range 200-299 from alertmanager as successful (#6111 ) * any status code from the range 200-299 from alertmanager to vmalert is not considered an error from now on (#6110) * add changelog	2024-04-16 09:33:11 +02:00
wanshuangcheng	83216e956c	chore: fix function names in comment (#6076 ) Signed-off-by: wanshuangcheng <wanshuangcheng@outlook.com>	2024-04-08 01:11:12 -07:00
Aliaksandr Valialkin	918cccaddf	all: fix golangci-lint(revive) warnings after `0c0ed61ce7` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6001	2024-04-02 23:16:29 +03:00
Jiekun	623d257faf	app/vmalert: respect batch size limit for remote write on shutdown (#6039) During the shutdown period of vmalert, the remotewrite client retrieves all pending time series from the buffer queue, composes them into 1 batch and executes remote write. This final batch may exceed the limit of -remoteWrite.maxBatchSize, and be rejected by the receiver (gateway, vmcluster or others). This change ensures that even during shutdown vmalert won't exceed the max batch size limit for remote write destination. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6025	2024-03-29 14:27:50 +01:00
Hui Wang	d7224b2d1c	vmalert: fix sending alert messages (#6028) * vmalert: fix sending alert messages 1. fix `endsAt` field in messages sent to alertmanager, previously a rule with a small interval could never be triggered; 2. fix behavior of `-rule.resendDelay`, before it could prevent sending firing message when rule state is volatile. * docs: update changelog notes Signed-off-by: hagen1778 <roman@victoriametrics.com> --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-03-28 08:55:10 +01:00
Hui Wang	e80b44f19d	vmalert: deprecate cmd-line flag `-datasource.lookback` (#5877 ) * vmalert: deprecate cmd-line flag `-datasource.lookback` * fix lint * review fixes Signed-off-by: hagen1778 <roman@victoriametrics.com> --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-03-12 16:16:50 +01:00
Tien M. Nguyen	f5115c8f1b	feat: include cluster info in alert CPUThrottlingHigh (#5956 )	2024-03-12 14:51:32 +04:00
Aliaksandr Valialkin	e22836c636	app/{vmalert,vmctl}: consistently use http.NewRequestWithContext() instead of http.NewRequest() + req.WithContext()	2024-02-29 15:25:43 +02:00
Aliaksandr Valialkin	6697da73e5	app: consistently use atomic.* types instead of atomic.* functions See `ea9e2b19a5`	2024-02-24 02:44:24 +02:00
hagen1778	e2dad3a2ac	app/vmalert: consistently sort groups by name and filename on `/groups` page This should prevent non-deterministic sorting for groups with identical names. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-02-20 13:50:57 +01:00
hagen1778	11b03d9fc8	app/vmalert: follow-up after `b60dcbe11f` * support case-insensitive search * reflect search condition in URL, so link can be sharable * support filtering on /alerts page * fix collapseAll/expandAll logic to respect only shown entries * add changelog `b60dcbe11f` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-02-20 13:07:05 +01:00
Victor Amorim dos Santos	b60dcbe11f	vmalert: add filter by group or rule name to UI (#5791 ) Co-authored-by: Yury Molodov <yurymolodov@gmail.com>	2024-02-20 12:31:41 +01:00
Roman Khavronenko	8850c7431d	app/vmalert: support filtering for /api/v1/rule like Prometheus does (#5787 ) Follow-up after `62e5e2a4c8` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-02-09 14:35:31 +01:00
Victor Amorim dos Santos	62e5e2a4c8	app/vmalert: support `type` param for filtering /api/v1/rules response by rule type (#5749 ) Co-authored-by: Hui Wang <haley@victoriametrics.com>	2024-02-09 09:02:35 +01:00
Aliaksandr Valialkin	ae8a867924	all: add support for specifying multiple -httpListenAddr options	2024-02-09 03:15:04 +02:00
Khushi Jain	83e55456e2	app/vmbackup: support client-side TLS configuration for create/delete snapshot API (#5738 )	2024-02-08 15:52:00 +01:00
Roman Khavronenko	24eb1ad0c8	vmalert: set `ActiveAt` to evaluation timestamp in `newAlert` fn (#5657 ) The change fixes flaky test `TestAlertingRule_Exec` which has dependency on the actual timestamps, which resulted into inaccurate test states: https://github.com/VictoriaMetrics/VictoriaMetrics/actions/runs/7608452967/job/20717699688 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-29 12:02:02 +01:00
Roman Khavronenko	df59ac7f0e	app/vmalert: fix data race during hot-config reload (#5698) * app/vmalert: fix data race during hot-config reload During hot-reload, the logic invokes the group update and the rules evaluation interruption simultaneously, falsely assuming that the interruption happens before the update. However, it could happen that the group is updated first and only afterwards the rules evaluation is cancelled, which results in permanent interruption for all rules within the group. The fix caches the cancel context function into a local variable first, and only afterwards performs the group update. With the cached cancel function we can safely call it without worrying that we cancel the evaluation for an already updated group. Signed-off-by: hagen1778 <roman@victoriametrics.com> * Revert "app/vmalert: fix data race during hot-config reload" This reverts commit `a4bb7e8932`. * app/vmalert: fix data race during hot-config reload During hot-reload, the logic invokes the group update and the rules evaluation interruption simultaneously, falsely assuming that the interruption happens before the update. However, it could happen that the group is updated first and only afterwards the rules evaluation is cancelled, which results in permanent interruption for all rules within the group. The fix cancels the evaluation context before applying the update, making sure that the context is always cancelled for the old group. Signed-off-by: hagen1778 <roman@victoriametrics.com> * wip Signed-off-by: hagen1778 <roman@victoriametrics.com> --------- Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-26 22:42:21 +01:00
Roman Khavronenko	b11f4ef5ea	app/vmalert: autogenerate `ALERTS_FOR_STATE` time series for alerting rules with `for: 0` (#5680 ) * app/vmalert: autogenerate `ALERTS_FOR_STATE` time series for alerting rules with `for: 0` Previously, `ALERTS_FOR_STATE` was generated only for alerts with `for > 0`. This behavior differs from Prometheus behavior - it generates ALERTS_FOR_STATE time series for alerting rules with `for: 0` as well. Such time series can be useful for tracking the moment when alerting rule became active. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5648 https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3056 Signed-off-by: hagen1778 <roman@victoriametrics.com> * app/vmalert: support ALERTS_FOR_STATE in `replay` mode Signed-off-by: hagen1778 <roman@victoriametrics.com> --------- Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-25 15:42:57 +01:00
hagen1778	da556cc329	docs: fix Grafana link example for vmalert Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-22 09:35:18 +01:00
Aliaksandr Valialkin	1f105dde98	all: allow dynamically reading *AuthKey flag values from files and urls Examples: 1) -metricsAuthKey=file:///abs/path/to/file - reads flag value from the given absolute filepath 2) -metricsAuthKey=file://./relative/path/to/file - reads flag value from the given relative filepath 3) -metricsAuthKey=http://some-host/some/path?query_arg=abc - reads flag value from the given url The flag value is automatically updated when the file contents changes.	2024-01-21 22:03:38 +02:00
Aliaksandr Valialkin	be509b3995	lib/pushmetrics: wait until the background goroutines, which push metrics, are stopped at pushmetrics.Stop() Previously there was a race condition when the background goroutine could still try collecting metrics from already stopped resources after returning from pushmetrics.Stop(). Now pushmetrics.Stop() waits until the background goroutine is stopped before returning. This is a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5549 and the commit `fe2d9f6646`. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5548	2024-01-15 13:50:36 +02:00
Aliaksandr Valialkin	d2c94a0663	lib/prompbmarshal: switch to github.com/VictoriaMetrics/easyproto	2024-01-14 23:04:45 +02:00
Aliaksandr Valialkin	a47127c1a6	app/vmalert/remotewrite: properly calculate vmalert_remotewrite_dropped_rows_total It was calculating the number of dropped time series instead of the number of dropped samples. While at it, drop vmalert_remotewrite_dropped_bytes_total metric, since it was inconsistently calculated - at one place it was calculating raw protobuf-encoded sample sizes, while at another place it was calculating the size of snappy-compressed prompbmarshal.WriteRequest protobuf message. Additionally, this metric has zero practical sense, so just drop it in order to reduce the level of confusion.	2024-01-14 22:55:11 +02:00
Aliaksandr Valialkin	c005245741	lib/prompb: switch to github.com/VictoriaMetrics/easyproto	2024-01-14 22:46:06 +02:00
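
Commit `623d257faf` in the list above describes a final remote write batch on shutdown that could exceed `-remoteWrite.maxBatchSize`. The idea of capping every flushed batch can be sketched as follows. This is only an illustration under stated assumptions: `splitIntoBatches` is a hypothetical helper, plain strings stand in for time series, and none of it is taken from vmalert's actual code.

```go
package main

import "fmt"

// splitIntoBatches is a hypothetical helper (not vmalert's implementation).
// It splits pending items into consecutive batches, each holding at most
// maxBatchSize items, so that no single flush exceeds the configured limit.
func splitIntoBatches(pending []string, maxBatchSize int) [][]string {
	if len(pending) == 0 {
		return nil
	}
	if maxBatchSize <= 0 {
		// Assumption for this sketch: a non-positive limit means "no limit".
		return [][]string{pending}
	}
	var batches [][]string
	for len(pending) > 0 {
		n := maxBatchSize
		if len(pending) < n {
			n = len(pending)
		}
		batches = append(batches, pending[:n])
		pending = pending[n:]
	}
	return batches
}

func main() {
	// Five pending series with a limit of two per batch yield three batches.
	pending := []string{"ts1", "ts2", "ts3", "ts4", "ts5"}
	for i, b := range splitIntoBatches(pending, 2) {
		fmt.Printf("batch %d: %d series\n", i, len(b))
	}
}
```

In this sketch each batch would be sent as its own write request during shutdown, instead of all pending series being combined into one oversized request.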


593 commits