Stress the importance of specifying of all Alertmanager
URLs in vmalert's `-notifier.url` or `notifier.config`
if it runs in cluster mode.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3547
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Allow configuring the default number of stored rule's update states in memory
via global `-rule.updateEntriesLimit` command-line flag or per-rule via rule's
`update_entries_limit` configuration param.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* app/vmbackupmanager: add metrics for better observability, include more information to `/api/v1/backups` API call response
* app/vmbackupmanager: drop old metrics before creating new ones
* app/vmbackupmanager: use `_total` postfix for counter metrics
* app/vmbackupmanager: remove `_total` postfix for gauge-like metrics
* app/vmbackupmanager: add `_last_run_failed` metrics for backups and retention
* app/vmbackupmanager: address review feedback
* app/vmbackupmanager: fix metric name
* app/vmbackupmanager: address review feedback, remove background updates of metrics, add restoring state of `_last_run_failed` metric from remote storage
* app/vmbackupmanager: improve performance for backup size calculation
* app/vmbackupmanager: refactor backup and retention runs to deduplicate each run logic
* {app/vmbackupmanager,lib/formatutil}: move HumanizeBytes into lib package
* app/vmbackupmanager: fix creating new metrics instead of reusing existing ones
* lit/formatutil: add comment to make linter happy
* app/vmbackupmanager: address review feedback
The system links are absolute, e.g. they start from `/`, so there are high chances
they won't work as expected when requested via proxy such as vmselect with -vmalert.proxyURL
command-line flag.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3424
http.Request was used as a part of state struct
for generating the curl command when viewing the rule's
state changes.
It appears, that holding a referencing is far more expensive
than generating the curl command immediately.
On the test with 40k rules, this change reduces memory
and CPU usage by 50%.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* vmalert: correctly return error for RW failures
By mistake, in 0989649ad0 the error
for remote write failures weren't return to user.
This change fixes it.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Method `metrics()` now pre-allocates slices for labels
and results from query responses. This reduces the number
of allocations on the hot path for instant requests.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
The recent change in modifying default value
of `datasource.queryStep` flag resulted in situation
where replay mode was always running queries with
step=`datasource.queryStep`. When it should always
use rule's evaluation interval.
The fix is related not to replay mode only, but
for all Range requests. Now step param is set
individually for each mode.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* app/vmalert: add `remoteWrite.sendTimeout` command-line flag to configure timeout for sending data to `remoteWrite.url`
* vmalert: remove WriteTimeout from clients Cfg
No need to have it as a part of configuration struct:
* the client isn't used by other packages;
* there are no internal tests to check the WriteTimeout.
* vmalert: remove DisablePathAppend from clients Cfg
No need to have it as a part of configuration struct:
* the client isn't used by other packages;
* there are no internal tests to check the DisablePathAppend.
Co-authored-by: hagen1778 <roman@victoriametrics.com>
- Return meta-labels for the discovered targets via promutils.Labels
instead of map[string]string. This improves the speed of generating
meta-labels for discovered targets by up to 5x.
- Remove memory allocations in hot paths during ScrapeWork generation.
The ScrapeWork contains scrape settings for a single discovered target.
This improves the service discovery speed by up to 2x.
* flag reference update
there is no flag `-datasource.disablePathAppend` and datasource actually checking for `-remoteRead.disablePathAppend`
* update source for doc as well
The default list of alerting rules contains the basic
rules for checking vmalert's health state and is recommended
to use for monitoring vmalert deployments.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Previously the `quotesEscape` function was escaping only double quotes.
This wasn't enough, since the input string could contain other special chars,
which must be escaped when put inside JSON string. For example, carriage return and line feed chars (\n\r),
backslash char, etc. This led to the following issues, which were improperly fixed:
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/890 - this issue
was "fixed" by introducing the `crlfEscape` function, which led to unnecessary
complications in user templates, while not fixing various corner cases
such as backslash chars in the input string.
See 1de15ad490
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3139 - this issue
was "fixed" by urlencoding the whole string passed to -external.alert.source
command-line flag. This led to invalid urls, which couldn't be parsed by Grafana.
See 00c838353d
and 4bd0244599
This commit properly encodes the input string passed to `quotesEscape`, so it can be safely embedded inside JSON strings.
This commit deprecates crlfEscape template function and adds the following new template functions:
- strvalue and stripDomain - these functions are supported by Prometheus, so they were added
for compatibility purposes.
- jsonEscape and htmlEscape for converting the input string to valid quoted JSON string
and for html-escaping the input string, so it could be safely embedded as a plaintext
into html.
This commit also documents all supported template functions at https://docs.victoriametrics.com/vmalert.html#template-functions
The deprecated crlfEscape function isn't documented on purpose, since its usefulness is negative in general case.
This reverts commit 00c838353d.
Reason for revert: it incorrectly fixes the issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3139 .
Now `-external.alert.source=explore?orgId=1&left=...` is converted to the following invalid url, which cannot be handled by Grafana:
https://grafana.example.com/explore%3ForgId%3D1%26left%3D...
The next commit will contain the correct fix of the issue - the `quotesEscape` function must
properly escape the string, so it could be embedded into JSON string. This function must
properly escape \n\r chars too. In this case the `crlfEscape` function becomes unnecessary.
Actually, the next commit makes the `crlfEscape` function deprecated.
The message about dropped data still remains at `error` level.
The change supposed to make log message more clear about how
serious it is.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
The default value of `-datasource.queryStep` has changed, so we update
the troubleshooting docs accordingly.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Due to auto-refactoring, the filed `state` was automatically
renamed to `ruleState` when the entity with the same name
was renamed in other file. Reverting the change.
https://github.com/VictoriaMetrics/helm-charts/issues/391
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
vmalert: prevent duplicating label `alertname` for notifications
The issue has no impact on alerting procedure. But still needs to be fixed
for clarity.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3053
Signed-off-by: lihaowei <haoweili35@gmail.com>
Sort labels explicitly after calling the ParsedConfigs.Apply() when needed.
This reduces CPU usage when performing metric-level relabeling, where labels' sorting isn't needed.
* Use vm_account_id and vm_project_id labels to be consistent with https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy-via-labels
* Document the feature that vmalert now exposes vm_account_id and vm_project_id
labels if -clusterMode is set.
* Use literal strings instead of string constants for vm_account_id and vm_project_id.
This improves code readability.
The change is supposed to provide additional flexibility for generating alert's
source link based on label values.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Incorrect 301 redirects can be cached by user agents such as web browsers.
This can complicate recovery procedure after the incorrect redirect is fixed,
e.g. web browser cache must be reset.
The related issue - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1752
Allow configuring authorization params per list of targets
in vmalert's notifier config for `static_configs`.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2690
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
According to Ruler specification, only labels returned within time series
should be available for use in annotations.
For long time, vmalert didn't respect this rule. And in PR
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2403
this was fixed for the sake of compatibility. However, this resulted
into users confusion, as they expected all configured and extra labels
to be available - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3013
This fix allows to use extra labels in Annotations. But in the case of conflicts
the original labels (extracted from time series) are preferred.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* lib/{httpserver,netutil}: allow to define min and max TLS version of the http server
* lib/httpserver: added descriptions about tls supported versions
* lib/netutil: check minimal tls version, added supported tls versions to error
* wip
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
- Clarify the description for -datasource.queryStep command-line flag
- Consistently use a single dash in front of -datasource.queryStep command-line flag
- Update -help output at docs/vmalert.md
- Consistently use single dash in front of command-line flags instead of double dashes.
- Add a warning that too small -search.latencyOffset may lead to incomplete query results.
Change default value for command-line flag `datasource.queryStep` from `0s` to `5m`.
Param `step` is added by vmalert to every rule evaluation request sent to datasource.
Before this change, `step` was equal to group's evaluation interval by default.
Param `step` for instant queries defines how far VM can look back for the last written data point.
The change supposed to improve reliability of the rules evaluation when evaluation interval
is lower than scraping interval.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>