diff --git a/app/vmalert/README.md b/app/vmalert/README.md
index 7d1c4a150..f268cf959 100644
--- a/app/vmalert/README.md
+++ b/app/vmalert/README.md
@@ -638,6 +638,61 @@ Use the official [Grafana dashboard](https://grafana.com/grafana/dashboards/1495
If you have suggestions for improvements or have found a bug - please open an issue on github or add
a review to the dashboard.
+## Troubleshooting
+
+vmalert executes configured rules at regular intervals. It expects that at the moment a rule is executed,
+the data is already present in the configured `-datasource.url`:
+
+
+
+Problems usually appear when data in `-datasource.url` is delayed or absent. In such cases, evaluations
+may get an empty response from the datasource and produce empty recording rules or reset alerts' state:
+
+
+
+Try the following recommendations in such cases:
+
+* Always configure the group's `evaluationInterval` to be greater than or equal to the `scrape_interval` at which
+metrics are delivered to the datasource;
+* If you know in advance that data in the datasource is delayed, try changing vmalert's `-datasource.lookback`
+command-line flag to add a time shift for evaluations;
+* If time intervals between datapoints in the datasource are irregular, try changing vmalert's `-datasource.queryStep`
+command-line flag to specify how far back a search query can look for the most recent datapoint. By default, this value
+is equal to the group's `evaluationInterval`.
+
+Sometimes it is not clear why a specific alert fired or didn't fire. It is important to remember that
+alerts with `for: 0` fire immediately when their expression becomes true, while alerts with `for > 0` fire only
+after multiple consecutive evaluations, and their expression must be true at each of those evaluations. If at least
+one evaluation returns false, the alert's state resets to the initial state.
+
+If the `-remoteWrite.url` command-line flag is configured, vmalert will persist alerts' state in the form of time series
+`ALERTS` and `ALERTS_FOR_STATE` to the specified destination. These time series can then be queried via
+[vmui](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#vmui) or Grafana to track how alerts' state
+changed over time.
+
+vmalert also stores the last N state updates for each rule. To check the updates, click the `Details` link next to the rule's name
+on the `/vmalert/groups` page and check the `Last updates` section:
+
+
+
+Rows in the section represent ordered rule evaluations and their results. The `curl` column contains an example of
+the HTTP request sent by vmalert to the `-datasource.url` during evaluation. If a specific state shows that
+no samples were returned but the curl command returns data, then it is very likely that there was no data in the datasource at the
+moment when the rule was evaluated.
+
+vmalert also allows configuring more detailed logging for a specific rule. Just set `debug: true` in the rule's configuration
+and vmalert will start printing additional log messages:
+```terminal
+2022-09-15T13:35:41.155Z DEBUG rule "TestGroup":"Conns" (2601299393013563564) at 2022-09-15T15:35:41+02:00: query returned 0 samples (elapsed: 5.896041ms)
+2022-09-15T13:35:56.149Z DEBUG datasource request: executing POST request with params "denyPartialResponse=true&query=sum%28vm_tcplistener_conns%7Binstance%3D%22localhost%3A8429%22%7D%29+by%28instance%29+%3E+0&step=15s&time=1663248945"
+2022-09-15T13:35:56.178Z DEBUG rule "TestGroup":"Conns" (2601299393013563564) at 2022-09-15T15:35:56+02:00: query returned 1 samples (elapsed: 28.368208ms)
+2022-09-15T13:35:56.178Z DEBUG datasource request: executing POST request with params "denyPartialResponse=true&query=sum%28vm_tcplistener_conns%7Binstance%3D%22localhost%3A8429%22%7D%29&step=15s&time=1663248945"
+2022-09-15T13:35:56.179Z DEBUG rule "TestGroup":"Conns" (2601299393013563564) at 2022-09-15T15:35:56+02:00: alert 10705778000901301787 {alertgroup="TestGroup",alertname="Conns",cluster="east-1",instance="localhost:8429",replica="a"} created in state PENDING
+...
+2022-09-15T13:36:56.153Z DEBUG rule "TestGroup":"Conns" (2601299393013563564) at 2022-09-15T15:36:56+02:00: alert 10705778000901301787 {alertgroup="TestGroup",alertname="Conns",cluster="east-1",instance="localhost:8429",replica="a"} PENDING => FIRING: 1m0s since becoming active at 2022-09-15 15:35:56.126006 +0200 CEST m=+39.384575417
+```
+
+
## Profiling
`vmalert` provides handlers for collecting the following [Go profiles](https://blog.golang.org/profiling-go-programs):
diff --git a/app/vmalert/vmalert_state.png b/app/vmalert/vmalert_state.png
new file mode 100644
index 000000000..5bf656b01
Binary files /dev/null and b/app/vmalert/vmalert_state.png differ
diff --git a/app/vmalert/vmalert_ts_data_delay.gif b/app/vmalert/vmalert_ts_data_delay.gif
new file mode 100644
index 000000000..2da024b46
Binary files /dev/null and b/app/vmalert/vmalert_ts_data_delay.gif differ
diff --git a/app/vmalert/vmalert_ts_normal.gif b/app/vmalert/vmalert_ts_normal.gif
new file mode 100644
index 000000000..a05c74061
Binary files /dev/null and b/app/vmalert/vmalert_ts_normal.gif differ
diff --git a/app/vmalert/web.qtpl b/app/vmalert/web.qtpl
index 2d35e5880..7346a2705 100644
--- a/app/vmalert/web.qtpl
+++ b/app/vmalert/web.qtpl
@@ -384,6 +384,7 @@
+ {% if rule.Type == "alerting" %}
+ {% endif %}
+ {% if rule.Type == "alerting" %}
+ {% endif %}
diff --git a/app/vmalert/web.qtpl.go b/app/vmalert/web.qtpl.go
index b4648bbbd..08baf5333 100644
--- a/app/vmalert/web.qtpl.go
+++ b/app/vmalert/web.qtpl.go
@@ -1187,6 +1187,11 @@ func StreamRuleDetails(qw422016 *qt422016.Writer, r *http.Request, rule APIRule)
+ `)
+//line app/vmalert/web.qtpl:387
+ if rule.Type == "alerting" {
+//line app/vmalert/web.qtpl:387
+ qw422016.N().S(`
@@ -1194,13 +1199,18 @@ func StreamRuleDetails(qw422016 *qt422016.Writer, r *http.Request, rule APIRule)
`)
-//line app/vmalert/web.qtpl:393
- qw422016.E().V(rule.Duration)
-//line app/vmalert/web.qtpl:393
- qw422016.N().S(` seconds
+//line app/vmalert/web.qtpl:394
+ qw422016.E().V(rule.Duration)
+//line app/vmalert/web.qtpl:394
+ qw422016.N().S(` seconds
+ `)
+//line app/vmalert/web.qtpl:398
+ }
+//line app/vmalert/web.qtpl:398
+ qw422016.N().S(`
@@ -1208,27 +1218,32 @@ func StreamRuleDetails(qw422016 *qt422016.Writer, r *http.Request, rule APIRule)
`)
-//line app/vmalert/web.qtpl:403
+//line app/vmalert/web.qtpl:405
for _, k := range labelKeys {
-//line app/vmalert/web.qtpl:403
+//line app/vmalert/web.qtpl:405
qw422016.N().S(`
`)
-//line app/vmalert/web.qtpl:404
+//line app/vmalert/web.qtpl:406
qw422016.E().S(k)
-//line app/vmalert/web.qtpl:404
+//line app/vmalert/web.qtpl:406
qw422016.N().S(`=`)
-//line app/vmalert/web.qtpl:404
+//line app/vmalert/web.qtpl:406
qw422016.E().S(rule.Labels[k])
-//line app/vmalert/web.qtpl:404
+//line app/vmalert/web.qtpl:406
qw422016.N().S(`
`)
-//line app/vmalert/web.qtpl:405
+//line app/vmalert/web.qtpl:407
}
-//line app/vmalert/web.qtpl:405
+//line app/vmalert/web.qtpl:407
qw422016.N().S(`
+ `)
+//line app/vmalert/web.qtpl:411
+ if rule.Type == "alerting" {
+//line app/vmalert/web.qtpl:411
+ qw422016.N().S(`
@@ -1236,28 +1251,33 @@ func StreamRuleDetails(qw422016 *qt422016.Writer, r *http.Request, rule APIRule)
`)
-//line app/vmalert/web.qtpl:415
- for _, k := range annotationKeys {
-//line app/vmalert/web.qtpl:415
- qw422016.N().S(`
+//line app/vmalert/web.qtpl:418
+ for _, k := range annotationKeys {
+//line app/vmalert/web.qtpl:418
+ qw422016.N().S(`
`)
-//line app/vmalert/web.qtpl:416
- qw422016.E().S(k)
-//line app/vmalert/web.qtpl:416
- qw422016.N().S(`:
+//line app/vmalert/web.qtpl:419
+ qw422016.E().S(k)
+//line app/vmalert/web.qtpl:419
+ qw422016.N().S(`:
`)
-//line app/vmalert/web.qtpl:417
- qw422016.E().S(rule.Annotations[k])
-//line app/vmalert/web.qtpl:417
- qw422016.N().S(`
+//line app/vmalert/web.qtpl:420
+ qw422016.E().S(rule.Annotations[k])
+//line app/vmalert/web.qtpl:420
+ qw422016.N().S(`
`)
-//line app/vmalert/web.qtpl:418
- }
-//line app/vmalert/web.qtpl:418
- qw422016.N().S(`
+//line app/vmalert/web.qtpl:421
+ }
+//line app/vmalert/web.qtpl:421
+ qw422016.N().S(`
+ `)
+//line app/vmalert/web.qtpl:425
+ }
+//line app/vmalert/web.qtpl:425
+ qw422016.N().S(`
@@ -1265,17 +1285,17 @@ func StreamRuleDetails(qw422016 *qt422016.Writer, r *http.Request, rule APIRule)
@@ -1283,9 +1303,9 @@ func StreamRuleDetails(qw422016 *qt422016.Writer, r *http.Request, rule APIRule)
Last `)
-//line app/vmalert/web.qtpl:434
+//line app/vmalert/web.qtpl:438
qw422016.N().D(len(rule.Updates))
-//line app/vmalert/web.qtpl:434
+//line app/vmalert/web.qtpl:438
qw422016.N().S(` updates:
@@ -1300,201 +1320,201 @@ func StreamRuleDetails(qw422016 *qt422016.Writer, r *http.Request, rule APIRule)
`)
-//line app/vmalert/web.qtpl:447
+//line app/vmalert/web.qtpl:451
for _, u := range rule.Updates {
-//line app/vmalert/web.qtpl:447
+//line app/vmalert/web.qtpl:451
qw422016.N().S(`
`)
-//line app/vmalert/web.qtpl:450
+//line app/vmalert/web.qtpl:454
qw422016.E().S(u.time.Format(time.RFC3339))
-//line app/vmalert/web.qtpl:450
+//line app/vmalert/web.qtpl:454
qw422016.N().S(`
|
`)
-//line app/vmalert/web.qtpl:452
+//line app/vmalert/web.qtpl:456
qw422016.N().D(u.samples)
-//line app/vmalert/web.qtpl:452
+//line app/vmalert/web.qtpl:456
qw422016.N().S(` |
`)
-//line app/vmalert/web.qtpl:453
+//line app/vmalert/web.qtpl:457
qw422016.N().FPrec(u.duration.Seconds(), 3)
-//line app/vmalert/web.qtpl:453
+//line app/vmalert/web.qtpl:457
qw422016.N().S(`s |
`)
-//line app/vmalert/web.qtpl:454
+//line app/vmalert/web.qtpl:458
qw422016.E().S(u.at.Format(time.RFC3339))
-//line app/vmalert/web.qtpl:454
+//line app/vmalert/web.qtpl:458
qw422016.N().S(` |
|
`)
-//line app/vmalert/web.qtpl:460
+//line app/vmalert/web.qtpl:464
if u.err != nil {
-//line app/vmalert/web.qtpl:460
+//line app/vmalert/web.qtpl:464
qw422016.N().S(`
-
+ |
`)
-//line app/vmalert/web.qtpl:463
+//line app/vmalert/web.qtpl:467
qw422016.E().V(u.err)
-//line app/vmalert/web.qtpl:463
+//line app/vmalert/web.qtpl:467
qw422016.N().S(`
|
`)
-//line app/vmalert/web.qtpl:466
+//line app/vmalert/web.qtpl:470
}
-//line app/vmalert/web.qtpl:466
+//line app/vmalert/web.qtpl:470
qw422016.N().S(`
`)
-//line app/vmalert/web.qtpl:467
+//line app/vmalert/web.qtpl:471
}
-//line app/vmalert/web.qtpl:467
+//line app/vmalert/web.qtpl:471
qw422016.N().S(`
`)
-//line app/vmalert/web.qtpl:469
+//line app/vmalert/web.qtpl:473
tpl.StreamFooter(qw422016, r)
-//line app/vmalert/web.qtpl:469
+//line app/vmalert/web.qtpl:473
qw422016.N().S(`
`)
-//line app/vmalert/web.qtpl:470
+//line app/vmalert/web.qtpl:474
}
-//line app/vmalert/web.qtpl:470
+//line app/vmalert/web.qtpl:474
func WriteRuleDetails(qq422016 qtio422016.Writer, r *http.Request, rule APIRule) {
-//line app/vmalert/web.qtpl:470
+//line app/vmalert/web.qtpl:474
qw422016 := qt422016.AcquireWriter(qq422016)
-//line app/vmalert/web.qtpl:470
+//line app/vmalert/web.qtpl:474
StreamRuleDetails(qw422016, r, rule)
-//line app/vmalert/web.qtpl:470
+//line app/vmalert/web.qtpl:474
qt422016.ReleaseWriter(qw422016)
-//line app/vmalert/web.qtpl:470
+//line app/vmalert/web.qtpl:474
}
-//line app/vmalert/web.qtpl:470
+//line app/vmalert/web.qtpl:474
func RuleDetails(r *http.Request, rule APIRule) string {
-//line app/vmalert/web.qtpl:470
+//line app/vmalert/web.qtpl:474
qb422016 := qt422016.AcquireByteBuffer()
-//line app/vmalert/web.qtpl:470
+//line app/vmalert/web.qtpl:474
WriteRuleDetails(qb422016, r, rule)
-//line app/vmalert/web.qtpl:470
+//line app/vmalert/web.qtpl:474
qs422016 := string(qb422016.B)
-//line app/vmalert/web.qtpl:470
+//line app/vmalert/web.qtpl:474
qt422016.ReleaseByteBuffer(qb422016)
-//line app/vmalert/web.qtpl:470
+//line app/vmalert/web.qtpl:474
return qs422016
-//line app/vmalert/web.qtpl:470
+//line app/vmalert/web.qtpl:474
}
-//line app/vmalert/web.qtpl:474
+//line app/vmalert/web.qtpl:478
func streambadgeState(qw422016 *qt422016.Writer, state string) {
-//line app/vmalert/web.qtpl:474
+//line app/vmalert/web.qtpl:478
qw422016.N().S(`
`)
-//line app/vmalert/web.qtpl:476
+//line app/vmalert/web.qtpl:480
badgeClass := "bg-warning text-dark"
if state == "firing" {
badgeClass = "bg-danger"
}
-//line app/vmalert/web.qtpl:480
+//line app/vmalert/web.qtpl:484
qw422016.N().S(`
`)
-//line app/vmalert/web.qtpl:481
+//line app/vmalert/web.qtpl:485
qw422016.E().S(state)
-//line app/vmalert/web.qtpl:481
+//line app/vmalert/web.qtpl:485
qw422016.N().S(`
`)
-//line app/vmalert/web.qtpl:482
+//line app/vmalert/web.qtpl:486
}
-//line app/vmalert/web.qtpl:482
+//line app/vmalert/web.qtpl:486
func writebadgeState(qq422016 qtio422016.Writer, state string) {
-//line app/vmalert/web.qtpl:482
+//line app/vmalert/web.qtpl:486
qw422016 := qt422016.AcquireWriter(qq422016)
-//line app/vmalert/web.qtpl:482
+//line app/vmalert/web.qtpl:486
streambadgeState(qw422016, state)
-//line app/vmalert/web.qtpl:482
+//line app/vmalert/web.qtpl:486
qt422016.ReleaseWriter(qw422016)
-//line app/vmalert/web.qtpl:482
+//line app/vmalert/web.qtpl:486
}
-//line app/vmalert/web.qtpl:482
+//line app/vmalert/web.qtpl:486
func badgeState(state string) string {
-//line app/vmalert/web.qtpl:482
+//line app/vmalert/web.qtpl:486
qb422016 := qt422016.AcquireByteBuffer()
-//line app/vmalert/web.qtpl:482
+//line app/vmalert/web.qtpl:486
writebadgeState(qb422016, state)
-//line app/vmalert/web.qtpl:482
+//line app/vmalert/web.qtpl:486
qs422016 := string(qb422016.B)
-//line app/vmalert/web.qtpl:482
+//line app/vmalert/web.qtpl:486
qt422016.ReleaseByteBuffer(qb422016)
-//line app/vmalert/web.qtpl:482
+//line app/vmalert/web.qtpl:486
return qs422016
-//line app/vmalert/web.qtpl:482
+//line app/vmalert/web.qtpl:486
}
-//line app/vmalert/web.qtpl:484
+//line app/vmalert/web.qtpl:488
func streambadgeRestored(qw422016 *qt422016.Writer) {
-//line app/vmalert/web.qtpl:484
+//line app/vmalert/web.qtpl:488
qw422016.N().S(`
restored
`)
-//line app/vmalert/web.qtpl:486
+//line app/vmalert/web.qtpl:490
}
-//line app/vmalert/web.qtpl:486
+//line app/vmalert/web.qtpl:490
func writebadgeRestored(qq422016 qtio422016.Writer) {
-//line app/vmalert/web.qtpl:486
+//line app/vmalert/web.qtpl:490
qw422016 := qt422016.AcquireWriter(qq422016)
-//line app/vmalert/web.qtpl:486
+//line app/vmalert/web.qtpl:490
streambadgeRestored(qw422016)
-//line app/vmalert/web.qtpl:486
+//line app/vmalert/web.qtpl:490
qt422016.ReleaseWriter(qw422016)
-//line app/vmalert/web.qtpl:486
+//line app/vmalert/web.qtpl:490
}
-//line app/vmalert/web.qtpl:486
+//line app/vmalert/web.qtpl:490
func badgeRestored() string {
-//line app/vmalert/web.qtpl:486
+//line app/vmalert/web.qtpl:490
qb422016 := qt422016.AcquireByteBuffer()
-//line app/vmalert/web.qtpl:486
+//line app/vmalert/web.qtpl:490
writebadgeRestored(qb422016)
-//line app/vmalert/web.qtpl:486
+//line app/vmalert/web.qtpl:490
qs422016 := string(qb422016.B)
-//line app/vmalert/web.qtpl:486
+//line app/vmalert/web.qtpl:490
qt422016.ReleaseByteBuffer(qb422016)
-//line app/vmalert/web.qtpl:486
+//line app/vmalert/web.qtpl:490
return qs422016
-//line app/vmalert/web.qtpl:486
+//line app/vmalert/web.qtpl:490
}
diff --git a/docs/vmalert.md b/docs/vmalert.md
index 2f7e19fc5..8d6f18159 100644
--- a/docs/vmalert.md
+++ b/docs/vmalert.md
@@ -642,6 +642,61 @@ Use the official [Grafana dashboard](https://grafana.com/grafana/dashboards/1495
If you have suggestions for improvements or have found a bug - please open an issue on github or add
a review to the dashboard.
+## Troubleshooting
+
+vmalert executes configured rules at regular intervals. It expects that at the moment a rule is executed,
+the data is already present in the configured `-datasource.url`:
+
+
+
+Problems usually appear when data in `-datasource.url` is delayed or absent. In such cases, evaluations
+may get an empty response from the datasource and produce empty recording rules or reset alerts' state:
+
+
+
+Try the following recommendations in such cases:
+
+* Always configure the group's `evaluationInterval` to be greater than or equal to the `scrape_interval` at which
+metrics are delivered to the datasource;
+* If you know in advance that data in the datasource is delayed, try changing vmalert's `-datasource.lookback`
+command-line flag to add a time shift for evaluations;
+* If time intervals between datapoints in the datasource are irregular, try changing vmalert's `-datasource.queryStep`
+command-line flag to specify how far back a search query can look for the most recent datapoint. By default, this value
+is equal to the group's `evaluationInterval`.
+
+Sometimes it is not clear why a specific alert fired or didn't fire. It is important to remember that
+alerts with `for: 0` fire immediately when their expression becomes true, while alerts with `for > 0` fire only
+after multiple consecutive evaluations, and their expression must be true at each of those evaluations. If at least
+one evaluation returns false, the alert's state resets to the initial state.
+
+If the `-remoteWrite.url` command-line flag is configured, vmalert will persist alerts' state in the form of time series
+`ALERTS` and `ALERTS_FOR_STATE` to the specified destination. These time series can then be queried via
+[vmui](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#vmui) or Grafana to track how alerts' state
+changed over time.
+
+vmalert also stores the last N state updates for each rule. To check the updates, click the `Details` link next to the rule's name
+on the `/vmalert/groups` page and check the `Last updates` section:
+
+
+
+Rows in the section represent ordered rule evaluations and their results. The `curl` column contains an example of
+the HTTP request sent by vmalert to the `-datasource.url` during evaluation. If a specific state shows that
+no samples were returned but the curl command returns data, then it is very likely that there was no data in the datasource at the
+moment when the rule was evaluated.
+
+vmalert also allows configuring more detailed logging for a specific rule. Just set `debug: true` in the rule's configuration
+and vmalert will start printing additional log messages:
+```terminal
+2022-09-15T13:35:41.155Z DEBUG rule "TestGroup":"Conns" (2601299393013563564) at 2022-09-15T15:35:41+02:00: query returned 0 samples (elapsed: 5.896041ms)
+2022-09-15T13:35:56.149Z DEBUG datasource request: executing POST request with params "denyPartialResponse=true&query=sum%28vm_tcplistener_conns%7Binstance%3D%22localhost%3A8429%22%7D%29+by%28instance%29+%3E+0&step=15s&time=1663248945"
+2022-09-15T13:35:56.178Z DEBUG rule "TestGroup":"Conns" (2601299393013563564) at 2022-09-15T15:35:56+02:00: query returned 1 samples (elapsed: 28.368208ms)
+2022-09-15T13:35:56.178Z DEBUG datasource request: executing POST request with params "denyPartialResponse=true&query=sum%28vm_tcplistener_conns%7Binstance%3D%22localhost%3A8429%22%7D%29&step=15s&time=1663248945"
+2022-09-15T13:35:56.179Z DEBUG rule "TestGroup":"Conns" (2601299393013563564) at 2022-09-15T15:35:56+02:00: alert 10705778000901301787 {alertgroup="TestGroup",alertname="Conns",cluster="east-1",instance="localhost:8429",replica="a"} created in state PENDING
+...
+2022-09-15T13:36:56.153Z DEBUG rule "TestGroup":"Conns" (2601299393013563564) at 2022-09-15T15:36:56+02:00: alert 10705778000901301787 {alertgroup="TestGroup",alertname="Conns",cluster="east-1",instance="localhost:8429",replica="a"} PENDING => FIRING: 1m0s since becoming active at 2022-09-15 15:35:56.126006 +0200 CEST m=+39.384575417
+```
+
+
## Profiling
`vmalert` provides handlers for collecting the following [Go profiles](https://blog.golang.org/profiling-go-programs):
diff --git a/docs/vmalert_state.png b/docs/vmalert_state.png
new file mode 100644
index 000000000..5bf656b01
Binary files /dev/null and b/docs/vmalert_state.png differ
diff --git a/docs/vmalert_ts_data_delay.gif b/docs/vmalert_ts_data_delay.gif
new file mode 100644
index 000000000..2da024b46
Binary files /dev/null and b/docs/vmalert_ts_data_delay.gif differ
diff --git a/docs/vmalert_ts_normal.gif b/docs/vmalert_ts_normal.gif
new file mode 100644
index 000000000..a05c74061
Binary files /dev/null and b/docs/vmalert_ts_normal.gif differ