Uniqueness of rule is now defined by combination of its name, expression and
labels. The hash of the combination is now used as rule ID and identifies rule within the group.
Set of rules from coreos/kube-prometheus was added for testing purposes to
verify compatibility. The check also showed that `vmalert` doesn't support
`query` template function that was mentioned as limitation in README.
The feature allows to speed up group rules execution by
executing them concurrently.
Change also contains README changes to reflect configuration
details.
Previously it was possible that the connection is served after the server is closed if the following
steps are performed:
1) Server accepts new connection.
2) Server.MustClose() is called and successfully finished.
3) Server starts processing the connection accepted at step 1. There could be various crashes
like in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/534 since the storage may be already closed.
Now the server closes the connection at step 3 without processing it.
* feat(vmselect): add tmp block dir size metrics `vm_tmp_blocks_files_size_total`
* refactor(vmselect): use free space instead of used space in tmp block file metrics
* fix: add `bytes` suffix to tmp dir free space metric
* vmalert: Add recording rules support.
Recording rules support required additional service refactoring since
it wasn't planned to support them from the very beginning. The list
of changes is following:
* new entity RecordingRule was added for writing results of MetricsQL
expressions into remote storage;
* interface Rule now unites both recording and alerting rules;
* configuration parser was moved to separate package and now performs
more strict validation;
* new endpoint for listing all groups and rules in json format was added;
* evaluation interval may be set to every particular group;
* vmalert: uncomment tests
* vmalert: rm outdated TODO
* vmalert: fix typos in README
Previously such rows were distributed to the original storage node or to the next storage node.
This may result to uneven load among the remaining storage nodes.
Previously vminsert could write multiple copies of the data to a single vmstorage node when the ingestion rate
exceeds the maximum throughput for connections to vmstorage nodes.
Previously the ingestion performance has been limited by the slowest vmstorage node.
Now vminsert should re-route data from the slowest vmstorage node to the remaining nodes.
Before the change we were sending notifications to notifier
if following conditions are met:
* alert is in Fire state
* alert is in Inactive state
We were sending Inactive notifications to resolve alert ASAP.
Unfortunately, we were sending resolves for Pending alerts that become
Inactive, which is wrong.
In this change we delete alert from the active list if
it was Pending and become Inactive. In this way we now
have Inactive alerts only if they were in state Fire before.
See test change for example.