Commit graph

7198 commits

Author SHA1 Message Date
hagen1778
3773510e8f
app/vmalert: verify alert name correctness in restore test
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 6eb205f8b0)
2023-11-02 16:02:29 +01:00
Hui Wang
44fcdf0cf0
vmalert: reduce restore query request for each alerting rule (#5265)
reduce the number of queries for restoring alerts state on start-up.
The change should speed up the restore process and reduce pressure on `remoteRead.url`.

(cherry picked from commit 90d45574bf)
2023-11-02 16:02:28 +01:00
Aliaksandr Valialkin
7fc5178a4b
app/vmselect/promql: add missing trace message in rollupResultCache.GetSeries() 2023-11-02 09:17:13 +01:00
Aliaksandr Valialkin
44227c0287
docs/CHANGELOG.md: typo fix: tis -> this 2023-11-02 08:33:48 +01:00
Aliaksandr Valialkin
95f5984aae
docs/Single-server-VictoriaMetrics.md: document why data inside <-storageDataPath>/snapshots directory should be manipulated only via snapshot API 2023-11-02 08:31:08 +01:00
Aliaksandr Valialkin
c04e667f9d
docs/CHANGELOG.md: document v1.93.7 LTS release 2023-11-02 08:21:10 +01:00
Aliaksandr Valialkin
369d37749d
app/vmagent/remotewrite: add -remoteWrite.shardByURL.labels command-line flag
This command-line flag can be used for specifying a list of labels used for sharding
among -remoteWrite.url entries when -remoteWrite.shardByURL command-line flag is set.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4942
2023-11-01 23:09:08 +01:00
Alexander Marshalov
ffeec24811
vmauth: add browser authorization request for http requests without… (#5234)
* vmauth: add browser authorization request for http requests without credentials to a route that is not in the `unauthorized_user` section (when `unauthorized_user` is specified).

* add link to issue in CHANGELOG

* Extend vmauth docs

* wip

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-11-01 21:00:52 +01:00
Aliaksandr Valialkin
ece7024f11
app/vmselect/promql: reduce the minimum lookbehind window for enabling SLO/SLI optimizations from 24 hours to 6 hours
This reduction is based on production testing.

Also expose -search.minWindowForInstantRollupOptimization command-line flag, so users could fine-tune this arg for their needs
2023-11-01 20:19:19 +01:00
Aliaksandr Valialkin
e4365dbe3e
app/vmselect: run make quicktemplate-gen after b8739bc00b 2023-11-01 17:53:30 +01:00
Github Actions
86aae00a60
Automatic update operator docs from VictoriaMetrics/operator@49826be (#5270)
Co-authored-by: Alexander Marshalov <_@marshalov.org>
2023-11-01 17:50:50 +01:00
Artem Navoiev
23d09684a6
add Try new Docs button in the current docs
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2023-11-01 17:50:27 +01:00
Aliaksandr Valialkin
ae9b4c94bc
app/vmselect: return stats.seriesFetched as string instead of number
vmalert expects string value for stats.seriesFetched, so it is impossible
switching to number without breaking compatibility with old vmalert releases :(

It is still unclear why stats.seriesFetched has string type in the first place...
2023-11-01 17:49:28 +01:00
Aliaksandr Valialkin
6a98f9df54
app/vmui: show query execution duration in the header of query input field
This should simplify the process of query optimization
2023-11-01 16:46:42 +01:00
Hui Wang
4fafdda13e
vmalert: support specifying full http url in notifier static_configs target (#5261)
* vmalert: support specifying full http or https urls in notifier static_configs target address
* show right label results in ui
2023-11-01 16:44:54 +01:00
Aliaksandr Valialkin
c5e3b11762
app/vmselect/promql: apply SLO-like optimization to all the count_*_over_time() functions
This is a follow-up for 41a0fdaf39
2023-11-01 09:58:50 +01:00
Aliaksandr Valialkin
b96d55e1e4
app/vmselect/promql: typo fix, which could lead to panic during range query execution
The panic is:

  BUG: unexpected values after merging new values

This is a follow-up for 41a0fdaf39
2023-11-01 09:58:50 +01:00
Github Actions
51afab3c7f
Automatic update operator docs from VictoriaMetrics/operator@57c1bf6 (#5266) 2023-11-01 09:58:50 +01:00
Aliaksandr Valialkin
28f0610e14
app/vmui: fix non-working Disable cache checkbox at JSON and Table views 2023-10-31 22:58:15 +01:00
Aliaksandr Valialkin
7b7ad44e84
app/vmselect/promql: properly calculate rollup result if lookbehind window isn't set
This is a follow-up for 41a0fdaf39
2023-10-31 22:23:04 +01:00
Aliaksandr Valialkin
744f8c3fe7
app/vmselect/promql: add outliers_iqr(q) and outlier_iqr_over_time(m[d]) functions
These functions allow detecting anomalies in series and samples using Interquartile range method.
See Outliers section at https://en.wikipedia.org/wiki/Interquartile_range for more details.
2023-10-31 22:14:14 +01:00
Aliaksandr Valialkin
48b842d2ad
vendor: run make vendor-update 2023-10-31 20:20:07 +01:00
Aliaksandr Valialkin
9661918bb4
app/vmselect/promql: optimize repeated SLI-like instant queries with lookbehind windows >= 1d
Repeated instant queries with long lookbehind windows, which contain one of the following rollup functions,
are optimized via partial result caching:

- sum_over_time()
- count_over_time()
- avg_over_time()
- increase()
- rate()

The basic idea of optimization is to calculate

  rf(m[d] @ t)

as

  rf(m[offset] @ t) + rf(m[d] @ (t-offset)) - rf(m[offset] @ (t-d))

where rf(m[d] @ (t-offset)) is cached query result, which was calculated previously

The offset may be in the range of up to 1 hour.
2023-10-31 20:08:38 +01:00
Aliaksandr Valialkin
9ba007a636
app/vmselect/promql: wrap too long line after a950873fff 2023-10-31 19:11:05 +01:00
Aliaksandr Valialkin
5e7d495eb1
lib/httpserver: follow-up for 0638bbe69c
- Replace spaces with underscores in the `reason` label value for the vm_http_request_errors_total metric
  in order be consistent with Prometheus-like naming

- Clarify the description for the change at docs/CHANGELOG.md

Updates https://github.com/victoriaMetrics/victoriaMetrics/issues/4590
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5166
2023-10-31 19:10:48 +01:00
Aliaksandr Valialkin
2288f81c5b
lib/persistentqueue: properly re-create flock.lock file inside directory if persistent queue is broken.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5249

Thanks to @Sniper91 for the bugreport and initial fix at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5233
2023-10-31 19:10:26 +01:00
Aliaksandr Valialkin
09c5ac238a
lib/httpserver: call Request.Header() only once instead of calling it each time a new request header is set
This is a follow-up for ad839aa492
2023-10-31 19:10:09 +01:00
Artem Navoiev
fc7e2e887e
github actions: fix typo in hugo version
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2023-10-31 19:09:48 +01:00
Artem Navoiev
0293a7033c
github actions: use 0.119 hugo version as far latest contains bug
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2023-10-31 19:09:19 +01:00
Aliaksandr Valialkin
a7a73b9845
docs/Cluster-VictoriaMetrics.md: clarify the description on why -dosnwampling.period must be set at both vmstorage and vmselect
This is a follow-up for ca7457d906
2023-10-31 19:08:21 +01:00
Aliaksandr Valialkin
ffcd757533
docs/Single-server-VictoriaMetrics.md: cosmetic fixes after 23369321f1 2023-10-31 19:05:45 +01:00
Aliaksandr Valialkin
40a53b516d
docs/CHANGELOG.md: move the description for -http.header.* command-line flags from SECURITY to FEATURE
The SECURITY label should be applied only to changes, which fix security issues.
The change at ad839aa492 adds new command-line flags, which can be used
for improving security in some cases. They do not fix any security issues.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5111
2023-10-31 19:05:01 +01:00
Aliaksandr Valialkin
c22b63af04
lib/storage: follow-up for 29cebd82fb
Use atomic.CompareAndSwapUint32() instead of atomic.LoadUint32() followed by atomic.StoreUint32().
This makes the code more clear.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5159
2023-10-31 19:03:50 +01:00
hagen1778
8c3bac8f40
dashboards/cluster: fix description about max threshold for Concurrent selects panel.
Before, it was mistakenly implying that `max` is equal to the double of available CPUs.

Addresses https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5214

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-10-31 19:03:21 +01:00
Roman Khavronenko
9d8f93050c
app/vmselect: expose vm_memory_intensive_queries_total counter metric (#5208)
The new metric gets increased each time `-search.logQueryMemoryUsage` memory limit
is exceeded by a query. This metric should help to identify expensive and heavy queries
without inspecting the logs.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-10-31 19:02:22 +01:00
hagen1778
f9c7822588
docs: follow-up for 0638bbe69c
0638bbe69c
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit a8051d48c4)
2023-10-31 12:54:57 +01:00
venkatbvc
85fd4917b1
vmauth: add counter metrics for auth successes and failures (#5166)
New labels `reason="wrong basic auth creds"` and `reason="wrong auth key"` were
added to metric `vm_http_request_errors_total`  to help identify auth errors.

https://github.com/victoriaMetrics/victoriaMetrics/issues/4590

Co-authored-by: Rao, B V Chalapathi <b_v_chalapathi.rao@nokia.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
(cherry picked from commit 0638bbe69c)
2023-10-31 12:54:57 +01:00
hagen1778
9debdb497c
dashboards/vmalert: add new panel Missed evaluations
The new panel supposed to indicate alerting groups that miss their evaluations.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit aaf9e3d526)
2023-10-31 10:35:57 +01:00
hagen1778
659171686c
deployment/alerts: add TooManyMissedIterations alerting rule
The new rule for vmalert supposed to detect groups that miss their
evaulations due to slow queries.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 9866974a53)
2023-10-31 10:35:57 +01:00
hagen1778
497c708aaa
dashboards: fix Errors rate to Alertmanager filter
The panel `Errors rate to Alertmanager` had `group` label filter
applied to the expression, while the metric `vmalert_alerts_send_errors_total`
doesn't have that label. This resulted into always empty results.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 8874b525b7)
2023-10-31 10:35:57 +01:00
Roman Khavronenko
afd160f1dc
docs: mention information loss when downsampling gauges (#5204)
Signed-off-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 23369321f1)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-10-30 19:05:39 +01:00
Roman Khavronenko
c6d74108fb
docs: explain motivation behind having -downsampling.period on vmselect (#5205)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-10-30 15:29:35 +01:00
Hui Wang
8a786e5df4
vmalert: fix alert firing state in replay mode (#5192)
fix possible missing firing states for alerting rules in replay mode
Before if one firing stage is bigger than single query request range, like rule with a big `for`, alerting rule won't able to be detected as firing.

Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit abcb21aa5e)
2023-10-30 13:55:48 +01:00
hagen1778
f0d10e2004
docs/troubleshooting: mention issue with un-ordered labels
See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5219#issuecomment-1773441711

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit e964df8039)
2023-10-30 13:55:45 +01:00
hagen1778
14b1997659
docs: rm mention of default values for security HTTP headers
The headers, their corresponding flags are mentioned at
https://docs.victoriametrics.com/#security

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit a64b37cf24)
2023-10-30 11:46:49 +01:00
Dima Lazerka
ed8fc04898
lib/httpserver: add flags to specify HSTS / Frame-Options / CSP headers for httpserver (#5111)
support `Strict-Transport-Security`, `Content-Security-Policy` and `X-Frame-Options`
HTTP headers in all VictoriaMetrics components.
The values for headers can be specified by users via the following flags:
`-http.header.hsts`, `-http.header.csp` and `-http.header.frameOptions`.

Co-authored-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit ad839aa492)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-10-30 11:41:38 +01:00
Roman Khavronenko
733b73ffed
lib/storage: log warning about RO mode only on state change (#5191)
Before, vmstorage would log the same message each second producing excessive
amount of logs.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5159

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 29cebd82fb)
2023-10-30 11:29:49 +01:00
Aliaksandr Valialkin
a66c261b55
app/vmui: change the order of tables at Top queries tab
Move the most interesting table - queries with the most summary time to execute - to the top
2023-10-28 11:57:08 +02:00
Aliaksandr Valialkin
15dda54e79
lib/promscrape/discovery/kubernetes: propagate possible errors at newAPIWatcher() to the caller
This allows substituting FATAL panics with recoverable runtime errors such as missing or invalid TLS CA file
and/or missing/invalid /var/run/secrets/kubernetes.io/serviceaccount/namespace file.
Now these errors are logged instead of PANIC'ing, so they can be fixed by updating the corresponding files
without the need to restart vmagent.

This is a follow-up for 90427abc65
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5243
2023-10-27 20:27:58 +02:00
Hui Wang
a37125d043
lib/promscrape/discovery/kubernetes: avoid possible panic if given caFile under kubernetes.SDConfig.HTTPClientConfig is not exist (#5243)
follow up d5a599badc
2023-10-27 20:27:58 +02:00