vmalert-tool: fix alert_rule_test case when eval_time is not multiple of evaluation_interval (#5387)

Co-authored-by: hagen1778 <roman@victoriametrics.com>
This commit is contained in:
Hui Wang 2023-12-01 19:17:24 +08:00 committed by GitHub
parent 8eddccfbb4
commit 1911320c86
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
4 changed files with 34 additions and 3 deletions

View file

@ -21,6 +21,11 @@ tests:
groupname: group2
alertname: SameAlertNameWithDifferentGroup
exp_alerts: []
- eval_time: 150s
groupname: group1
alertname: SameAlertNameWithDifferentGroup
exp_alerts:
- {}
- eval_time: 6m
groupname: group1
alertname: SameAlertNameWithDifferentGroup

View file

@ -336,7 +336,7 @@ func (tg *testGroup) test(evalInterval time.Duration, groupOrderMap map[string]i
if disableAlertgroupLabel {
g.Name = ""
}
if _, ok := alertExpResultMap[time.Duration(ts.UnixNano())][g.Name]; !ok {
if _, ok := alertExpResultMap[alertEvalTimes[evalIndex]][g.Name]; !ok {
continue
}
if _, ok := gotAlertsMap[g.Name]; !ok {
@ -347,7 +347,7 @@ func (tg *testGroup) test(evalInterval time.Duration, groupOrderMap map[string]i
if !isAlertRule {
continue
}
if _, ok := alertExpResultMap[time.Duration(ts.UnixNano())][g.Name][ar.Name]; ok {
if _, ok := alertExpResultMap[alertEvalTimes[evalIndex]][g.Name][ar.Name]; ok {
for _, got := range ar.GetAlerts() {
if got.State != notifier.StateFiring {
continue

View file

@ -45,7 +45,8 @@ The sandbox cluster installation is running under the constant load generated by
* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): prevent from `FATAL: cannot flush metainfo` panic when [`-remoteWrite.multitenantURL`](https://docs.victoriametrics.com/vmagent.html#multitenancy) command-line flag is set. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5357).
* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): properly decode zstd-encoded data blocks received via [VictoriaMetrics remote_write protocol](https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol). See [this issue comment](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5301#issuecomment-1815871992).
* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): properly add new labels at `output_relabel_configs` during [stream aggregation](https://docs.victoriametrics.com/stream-aggregation.html). Previously this could lead to corrupted labels in output samples. Thanks to @ChengChung for providing [detailed report for this bug](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5402).
* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): properly add new labels at `output_relabel_configs` during [stream aggregation](https://docs.victoriametrics.com/stream-aggregation.html). Previously this could lead to corrupted labels in output samples. Thanks to @ChengChung for providing [detailed report for this bug](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5402).
* BUGFIX: [vmalert-tool](https://docs.victoriametrics.com/#vmalert-tool): allow using arbitrary `eval_time` in [alert_rule_test](https://docs.victoriametrics.com/vmalert-tool.html#alert_test_case) case. Previously, test cases with `eval_time` not being a multiple of `evaluation_interval` would fail.
## [v1.95.1](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.95.1)

View file

@ -33,6 +33,31 @@ except `promql_expr_test` field. Use `metricsql_expr_test` field name instead. T
validates and executes [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) expressions,
which aren't always backward compatible with [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/).
### Limitations
* vmalert-tool evaluates all the groups defined in `rule_files` using `evaluation_interval`(default `1m`) instead of `interval` under each rule group.
* vmalert-tool shares the same limitation with [vmalert](https://docs.victoriametrics.com/vmalert.html#limitations) on chaining rules under one group:
>by default, rules execution is sequential within one group, but persistence of execution results to remote storage is asynchronous. Hence, user shouldnt rely on chaining of recording rules when result of previous recording rule is reused in the next one;
For example, you have recording rule A and alerting rule B in the same group, and rule B's expression is based on A's results.
Rule B won't get the latest data of A, since data didn't persist to remote storage yet.
The workaround is to divide them in two groups and put groupA in front of groupB (or use `group_eval_order` to define the evaluation order).
In this way, vmalert-tool makes sure that the results of groupA must be written to storage before evaluating groupB:
```yaml
groups:
- name: groupA
rules:
- record: A
expr: sum(xxx)
- name: groupB
rules:
- alert: B
expr: A >= 0.75
for: 1m
```
### Test file format
The configuration format for files specified in `--files` cmd-line flag is the following: