github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-11-21 14:44:00 +00:00

Author	SHA1	Message	Date
hagen1778	d6ae082598	deployment/dashboards: respect `job` and `instance` filters for `alerts` annotation in cluster and single-node dashboards Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-11-14 09:38:15 +01:00
Aliaksandr Valialkin	43e3302803	docs/CHANGELOG.md: document `0e056ddb2d` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5203	2023-11-14 01:24:05 +01:00
Zakhar Bessarab	37997abd14	vmcluster: re-routing enhancement (#5293) * app/vmstorage: close vminsert connections gradually before stopping storage Implements graceful shutdown approach suggested here - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4922#issuecomment-1768146878 Test results for this can be found here - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4922#issuecomment-1790640274 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * app/vmstorage: update graceful shutdown logic - close connections from vminsert in deterministic order - update flag description - lower default timeout to 25 seconds. The 25 seconds value was chosen because the lowest default value used in default configuration deployments is 30s (default value in Kubernetes and ansible-playbooks). Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs/cluster: add information about re-routing enhancement during restart Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs/changelog: add entry for new command-line flag Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * {app/vmstorage,lib/ingestserver}: address review feedback Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs/cluster: add note to update workload scheduler timeout Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * wip --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-11-14 01:03:44 +01:00
Aliaksandr Valialkin	8eed04b2c6	app/vmauth: add ability to drop the specified number of `/`-delimited prefix parts from request path This can be done via `drop_src_path_prefix_parts` option at `url_map` and `user` levels. See https://docs.victoriametrics.com/vmauth.html#dropping-request-path-prefix	2023-11-13 22:32:22 +01:00
Aliaksandr Valialkin	0feaeca3c1	lib/protoparser/promremotewrite: fall back to Snappy decoding if zstd decoding fails This case is possible after the following steps: 1. vmagent tries to perform handshake with the -remoteWrite.url in order to determine whether the remote storage supports zstd-compressed data. 2. The remote storage is unavailable during the handshake. In this case vmagent falls back to Snappy compression for the data sent to the remote storage. 3. vmagent compresses the collected data into blocks with Snappy and puts these blocks to persistent queue on disk. 4. The remote storage becomes available. 5. vmagent restarts, performs the handshake with the remote storage and detects that it supports zstd-compressed data. 6. vmagent starts sending Snappy-compressed data from persistent queue to the remote storage, while falsely advertizing it sends zstd-compressed data. 7. The remote storage receives Snappy-compressed data and fails unpacking it with zstd. The solution is to just fall back to Snappy decompression if zstd decompression fails. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5301	2023-11-13 21:19:08 +01:00
Aliaksandr Valialkin	8af56ea2ed	lib/htmlcomponents: use relative links for the top page and for favicon.ico This allows hiding VictoriaMetrics components behind proxies with arbitrary path prefixes. For example, vmagent HTTP handlers can be served via /vmagent/ path prefix: - http://proxy/vmagent/targets - http://proxy/vmagent/service-discovery The path prefix can be arbitrary. For example, below are vmagent urls for /tenantID/vmagent/ path prefix: - http://proxy/tenantID/vmagent/targets - http://proxy/tenantID/vmagent/service-discovery While at it, consistently serve favicon.ico from any path directory. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5306 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5307	2023-11-13 20:29:05 +01:00
Aliaksandr Valialkin	3e93fa61ad	lib/regexutil: properly handle alternate regexps surrounded by .+ or .* Previously the following regexps were improperly handled: .+foo\|bar.+ .foo\|bar. This could lead to unexpected regexp match results. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5297 Thanks to @Haleygo for the initial attempt to fix the issue at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5308	2023-11-13 18:23:38 +01:00
Aliaksandr Valialkin	ba058a4514	docs/CHANGELOG.md: remove trailing whitespace after `bffd30b57a`	2023-11-13 09:24:29 +01:00
Aliaksandr Valialkin	eded218e8c	app/vmauth: properly pass `Host` header to backends Previously the `Host` header remained unchanged when passing requests to backends. This may work improperly if the backend uses host-based routing. While at it, allow http/2.0 requests to backends. While VictoriaMetrics components do not accept http/2.0 requests, other backends can require such requests. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5240	2023-11-13 09:05:39 +01:00
Aliaksandr Valialkin	61594d2bd8	app/vmauth: follow-up for `323f3720ed` - Re-use identically configured http.Transport across multiple users. This fixes handling of the limit on the number of connection, which can be established per each backend via -maxIdleConnsPerBackend command-line flag. This limit stopped working after `323f3720ed` - Add docs about backend TLS setup at https://docs.victoriametrics.com/vmauth.html#backend-tls-setup - Add ability to disable backend TLS verification for all the users via -backend.tlsInsecureSkipVerify command-line flag. This flag may be useful when -auth.config contains big number of users, and every user must disable backend TLS verification. - Add ability to specify TLS Root CA via tls_ca_file option at per-user basis and via -backend.tlsCAFile command-line flag across all the users. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5240	2023-11-13 08:33:10 +01:00
Aliaksandr Valialkin	bfec8a3751	app/vmauth: improve docs a bit after `323f3720ed` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5240	2023-11-11 12:49:28 +01:00
Aliaksandr Valialkin	230230cf0b	lib/logger: add `-loggerMaxArgLen` command-line flag for fine-tuning the maximum length of logged args	2023-11-11 12:30:08 +01:00
Aliaksandr Valialkin	80213f07fa	app/vmselect/promql: optimize instant queries with min_over_time() and max_over_time() rollup functions This is a follow-up for `41a0fdaf39`	2023-11-11 12:10:03 +01:00
Aliaksandr Valialkin	2db1a664e1	deployment: update Go builder from Go1.21.3 to Go1.21.4 See https://github.com/golang/go/issues?q=milestone%3AGo1.21.4+label%3ACherryPickApproved	2023-11-10 22:28:44 +01:00
Aliaksandr Valialkin	010dc15d16	lib/blockcache: do not cache entries, which were attempted to be accessed 1 or 2 times Previously entries which were accessed only 1 time weren't cached. It turned out that some rarely executed heavy queries may read an indexdb block twice in a row instead of once. There is no need to cache such a block then. This change should eliminate cache size spikes for indexdb/dataBlocks when such heavy queries are executed. Expose -blockcache.missesBeforeCaching command-line flag, which can be used for fine-tuning the number of cache misses needed before storing the block in the cache.	2023-11-10 22:28:03 +01:00
Zakhar Bessarab	73a1862182	docs/changelog: document vmbackupmanager bugfix (#5303 ) Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-11-08 18:51:14 +01:00
Roman Khavronenko	bffd30b57a	app/vmalert: update remote-write process (#5284) * app/vmalert: update remote-write process * automatically retry remote-write requests on closed connections. The change should reduce the amount of logs produced in environments with short-living connections or environments without support of keep-alive on network balancers. * increment `vmalert_remotewrite_errors_total` metric if all retries to send remote-write request failed. Before, this metric was incremented only if remote-write client's buffer is overloaded. * increment `vmalert_remotewrite_dropped_rows_total` and `vmalert_remotewrite_dropped_bytes_total` metrics if remote-write client's buffer is overloaded. Before, these metrics were incremented only after unsuccessful HTTP calls. Signed-off-by: hagen1778 <roman@victoriametrics.com> * Update docs/CHANGELOG.md --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Hui Wang <haley@victoriametrics.com>	2023-11-08 14:53:07 +08:00
Yury Molodov	f90d2ec843	vmui: display query error on Explore metrics page (#5272 ) https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5202	2023-11-03 16:23:19 +01:00
Zakhar Bessarab	323f3720ed	app/vmauth: add option to skip TLS verification (#5256 ) Add `tls_insecure_skip_verify` option on per-user basis which allows to disable TLS verification for all requests to backend on behalf of this user. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5240 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-11-03 12:04:17 +01:00
Aliaksandr Valialkin	65db6609eb	docs/CHANGELOG.md: update the description of the optimization for SLO/SLI-like queries according to latest changes See commits `4497a08e3d` and `92826b0b4a`	2023-11-02 20:05:05 +01:00
Roman Khavronenko	b5254199c6	app/vmalert: add label `file` pointing to the group's filename to metrics (#5281 ) The filename should help identifying alerting rules belonging to specific groups with identical names but different filenames. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5267 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-11-02 16:01:31 +01:00
Hui Wang	90d45574bf	vmalert: reduce restore query request for each alerting rule (#5265 ) reduce the number of queries for restoring alerts state on start-up. The change should speed up the restore process and reduce pressure on `remoteRead.url`.	2023-11-02 15:22:13 +01:00
Aliaksandr Valialkin	dd33fc0c76	docs/CHANGELOG.md: typo fix: tis -> this	2023-11-02 08:33:40 +01:00
Aliaksandr Valialkin	87a86ec9db	docs/CHANGELOG.md: document v1.93.7 LTS release	2023-11-02 08:21:00 +01:00
Aliaksandr Valialkin	ed70a40669	app/vmagent/remotewrite: add -remoteWrite.shardByURL.labels command-line flag This command-line flag can be used for specifying a list of labels used for sharding among -remoteWrite.url entries when -remoteWrite.shardByURL command-line flag is set. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4942	2023-11-01 23:08:54 +01:00
Alexander Marshalov	828ddd4e4f	vmauth: add browser authorization request for http requests without… (#5234 ) * vmauth: add browser authorization request for http requests without credentials to a route that is not in the `unauthorized_user` section (when `unauthorized_user` is specified). * add link to issue in CHANGELOG * Extend vmauth docs * wip --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-11-01 20:59:46 +01:00
Aliaksandr Valialkin	da887b49e7	app/vmui: show query execution duration in the header of query input field This should simplify the process of query optimization	2023-11-01 16:43:51 +01:00
Hui Wang	e482eeff58	vmalert: support specifying full http url in notifier static_configs target (#5261 ) * vmalert: support specifying full http or https urls in notifier static_configs target address * show right label results in ui	2023-11-01 19:53:50 +08:00
Aliaksandr Valialkin	c4c6ee9485	app/vmui: fix non-working `Disable cache` checkbox at `JSON` and `Table` views	2023-10-31 22:58:06 +01:00
Aliaksandr Valialkin	ea81f6fc36	app/vmselect/promql: add outliers_iqr(q) and outlier_iqr_over_time(m[d]) functions These functions allow detecting anomalies in series and samples using Interquartile range method. See Outliers section at https://en.wikipedia.org/wiki/Interquartile_range for more details.	2023-10-31 22:10:31 +01:00
Aliaksandr Valialkin	41a0fdaf39	app/vmselect/promql: optimize repeated SLI-like instant queries with lookbehind windows >= 1d Repeated instant queries with long lookbehind windows, which contain one of the following rollup functions, are optimized via partial result caching: - sum_over_time() - count_over_time() - avg_over_time() - increase() - rate() The basic idea of optimization is to calculate rf(m[d] @ t) as rf(m[offset] @ t) + rf(m[d] @ (t-offset)) - rf(m[offset] @ (t-d)) where rf(m[d] @ (t-offset)) is cached query result, which was calculated previously The offset may be in the range of up to 1 hour.	2023-10-31 19:25:23 +01:00
Aliaksandr Valialkin	714af89b13	lib/httpserver: follow-up for `0638bbe69c` - Replace spaces with underscores in the `reason` label value for the vm_http_request_errors_total metric in order to be consistent with Prometheus-like naming - Clarify the description for the change at docs/CHANGELOG.md Updates https://github.com/victoriaMetrics/victoriaMetrics/issues/4590 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5166	2023-10-31 18:52:39 +01:00
Aliaksandr Valialkin	4ac95b6f49	docs/CHANGELOG.md: move the description for -http.header.* command-line flags from SECURITY to FEATURE The SECURITY label should be applied only to changes, which fix security issues. The change at `ad839aa492` adds new command-line flags, which can be used for improving security in some cases. They do not fix any security issues. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5111	2023-10-31 16:23:08 +01:00
hagen1778	f6208965ce	dashboards/cluster: fix description about `max` threshold for `Concurrent selects` panel. Before, it was mistakenly implying that `max` is equal to the double of available CPUs. Addresses https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5214 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-31 16:05:33 +01:00
Roman Khavronenko	a950873fff	app/vmselect: expose `vm_memory_intensive_queries_total` counter metric (#5208 ) The new metric gets increased each time `-search.logQueryMemoryUsage` memory limit is exceeded by a query. This metric should help to identify expensive and heavy queries without inspecting the logs. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-31 13:31:09 +01:00
hagen1778	a8051d48c4	docs: follow-up for `0638bbe69c` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-31 12:54:30 +01:00
hagen1778	aaf9e3d526	dashboards/vmalert: add new panel `Missed evaluations` The new panel is supposed to indicate alerting groups that miss their evaluations. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-31 10:35:19 +01:00
hagen1778	9866974a53	deployment/alerts: add `TooManyMissedIterations` alerting rule The new rule for vmalert is supposed to detect groups that miss their evaluations due to slow queries. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-31 10:35:18 +01:00
hagen1778	8874b525b7	dashboards: fix `Errors rate to Alertmanager` filter The panel `Errors rate to Alertmanager` had `group` label filter applied to the expression, while the metric `vmalert_alerts_send_errors_total` doesn't have that label. This always resulted in empty results. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-31 10:16:45 +01:00
Hui Wang	abcb21aa5e	vmalert: fix alert firing state in replay mode (#5192) fix possible missing firing states for alerting rules in replay mode. Before, if one firing stage was longer than a single query request range, as for a rule with a big `for`, the alerting rule couldn't be detected as firing. Co-authored-by: hagen1778 <roman@victoriametrics.com>	2023-10-30 13:54:18 +01:00
Dima Lazerka	ad839aa492	lib/httpserver: add flags to specify HSTS / Frame-Options / CSP headers for httpserver (#5111 ) support `Strict-Transport-Security`, `Content-Security-Policy` and `X-Frame-Options` HTTP headers in all VictoriaMetrics components. The values for headers can be specified by users via the following flags: `-http.header.hsts`, `-http.header.csp` and `-http.header.frameOptions`. Co-authored-by: hagen1778 <roman@victoriametrics.com>	2023-10-30 11:33:38 +01:00
Roman Khavronenko	29cebd82fb	lib/storage: log warning about RO mode only on state change (#5191 ) Before, vmstorage would log the same message each second producing excessive amount of logs. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5159 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-30 10:52:57 +01:00
Aliaksandr Valialkin	632d788b63	lib/promscrape/discovery/kubernetes: stop all the url watchers, which belong to a particular groupWatcher, at once Previously url watchers for pod, service and node objects could be mistakenly closed when service discovery was set up only for endpoints and endpointslice roles, since watchers for these roles may start pod, service and node url watchers with nil apiWatcher passed to groupWatcher.startWatchersForRole(). Now all the url watchers, which belong to a particular groupWatcher, are stopped at once when this groupWatcher has no apiWatcher subscribers. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5216 The issue has been introduced in v1.93.5 when addressing https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4850	2023-10-27 13:51:35 +02:00
Hui Wang	7c90ce39cb	do not print redundant error logs when failed to scrape consul or no… (#5239 ) * do not print redundant error logs when failed to scrape consul or nomad target prometheus performs the same because it uses consul lib which just drops the error(`1806bcb38c/api/api.go (L1134)`)	2023-10-27 13:31:55 +08:00
Aliaksandr Valialkin	d5a599badc	lib/promauth: follow-up for `e16d3f5639` - Make sure that invalid/missing TLS CA file or TLS client certificate files at vmagent startup don't prevent from processing the corresponding scrape targets after the file becomes correct, without the need to restart vmagent. Previously scrape targets with invalid TLS CA file or TLS client certificate files were permanently dropped after the first attempt to initialize them, and they didn't appear until the next vmagent reload or the next change in other places of the loaded scrape configs. - Make sure that TLS CA is properly re-loaded from file after it changes without the need to restart vmagent. Previously the old TLS CA was used until vmagent restart. - Properly handle errors during http request creation for the second attempt to send data to remote system at vmagent and vmalert. Previously failed request creation could result in nil pointer dereferencing, since the returned request is nil on error. - Add more context to the logged error during AWS sigv4 request signing before sending the data to -remoteWrite.url at vmagent. Previously it could miss details on the source of the request. - Do not create a new HTTP client per second when generating OAuth2 token needed to put in Authorization header of every http request issued by vmagent during service discovery or target scraping. Re-use the HTTP client instead until the corresponding scrape config changes. - Cache error at lib/promauth.Config.GetAuthHeader() in the same way as the auth header is cached, e.g. the error is cached for a second now. This should reduce load on CPU and OAuth2 server when auth header cannot be obtained because of temporary error. - Share tls.Config.GetClientCertificate function among multiple scrape targets with the same tls_config. Cache the loaded certificate and the error for one second. This should significantly reduce CPU load when scraping big number of targets with the same tls_config. 
- Allow loading TLS certificates from HTTP and HTTPS urls by specifying these urls at `tls_config->cert_file` and `tls_config->key_file`. - Improve test coverage at lib/promauth - Skip unreachable or invalid files specified at `scrape_config_files` during vmagent startup, since these files may become valid later. Previously vmagent was exiting in this case. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4959	2023-10-25 23:19:37 +02:00
Aliaksandr Valialkin	eed5206376	lib/promauth: properly parse string contents for ca, cert and key fields at tls_config Previously yaml parser wasn't accepting string values for these fields, because it was mistakenly expecting a list of uint8 values instead.	2023-10-25 23:12:21 +02:00
hagen1778	a216fe6728	app/vmalert: follow-up after `c9375cac5e` Descriptions were updated in an attempt to make them clearer for readers, by re-phrasing and linking missing docs. `eval_delay` was added to tests to verify it can be unmarshalled. `eval_delay` is now applied before timestamp alignment to make it more predictable. Before, if delay < interval the timestamp wasn't aligned. `eval_delay` and `eval_offset` were added to API output. `PreviouslySentSeriesToRW` was converted to private `previouslySentSeriesToRW`. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-25 13:07:13 +02:00
Hui Wang	c9375cac5e	vmalert: add `-rule.evalDelay` flag and `eval_delay` as group attribute (#5185 ) Also mark `-datasource.lookback` as will be deprecated, see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5155.	2023-10-25 11:54:18 +02:00
hagen1778	003ef3a518	deployment/alerts: make `TooHighMemoryUsage` more tolerable to spikes Using `min_over_time` should reduce the amount of false positives when component is running in near-the-threshold state. Now it should trigger only if all collected samples were above the threshold on 10m interval. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-24 09:39:46 +02:00
Alexander Marshalov	33484d3365	lib/streamaggr: respect `streamAgg.dropInput` with empty stream aggr config (#5213 ) https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5207	2023-10-20 15:55:58 +02:00
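
The partial-result caching identity described in commit `41a0fdaf39`, rf(m[d] @ t) = rf(m[offset] @ t) + rf(m[d] @ (t-offset)) - rf(m[offset] @ (t-d)), can be checked with a small Go sketch. It assumes an additive rollup (a plain sum over samples with integer second timestamps) and half-open windows (t-d, t]. The names `sample`, `sumOverTime` and `sumViaCache` are hypothetical and are not taken from the VictoriaMetrics code base.

```go
package main

import "fmt"

// sample is a hypothetical (timestamp, value) pair; timestamps are in seconds.
type sample struct {
	ts  int64
	val float64
}

// sumOverTime sums the values whose timestamps fall into the window (t-d, t].
func sumOverTime(samples []sample, d, t int64) float64 {
	var sum float64
	for _, s := range samples {
		if s.ts > t-d && s.ts <= t {
			sum += s.val
		}
	}
	return sum
}

// sumViaCache derives sum(m[d] @ t) from the cached sum(m[d] @ (t-offset)):
// add the fresh head window (t-offset, t] and subtract the stale tail
// window (t-d-offset, t-d] that is no longer covered.
func sumViaCache(samples []sample, cached float64, d, offset, t int64) float64 {
	head := sumOverTime(samples, offset, t)
	tail := sumOverTime(samples, offset, t-d)
	return head + cached - tail
}

func main() {
	// Synthetic series: one sample per minute with small integer values,
	// so that the floating-point sums are exact.
	var samples []sample
	for ts := int64(0); ts <= 200000; ts += 60 {
		samples = append(samples, sample{ts: ts, val: float64(ts % 7)})
	}
	const d, offset, t = 86400, 600, 190000

	// Pretend this value was computed and cached by an earlier query.
	cached := sumOverTime(samples, d, t-offset)

	direct := sumOverTime(samples, d, t)
	viaCache := sumViaCache(samples, cached, d, offset, t)
	fmt.Println("direct:", direct, "via cache:", viaCache, "equal:", direct == viaCache)
}
```

Only the two short `offset`-sized windows are scanned on a repeated query in this sketch; the long `d`-sized window comes from the cached value. The identity holds as written for additive rollups such as sums and counts; how the real code handles `avg_over_time()`, `rate()` and `increase()` is not shown here.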


1700 commits
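
The admission policy described in commit `010dc15d16` (do not cache blocks that were requested only once or twice) can be illustrated with a minimal Go sketch. The type `admissionCache`, its methods and the exact threshold semantics (store a block only after it has missed more than `missesBeforeCaching` times) are assumptions made for illustration; they do not reproduce the actual lib/blockcache code or the precise meaning of the `-blockcache.missesBeforeCaching` flag.

```go
package main

import "fmt"

// admissionCache is a hypothetical cache that stores a block only after
// the block has missed more than missesBeforeCaching times.
type admissionCache struct {
	missesBeforeCaching int
	misses              map[string]int
	blocks              map[string][]byte
}

func newAdmissionCache(missesBeforeCaching int) *admissionCache {
	return &admissionCache{
		missesBeforeCaching: missesBeforeCaching,
		misses:              make(map[string]int),
		blocks:              make(map[string][]byte),
	}
}

// Get returns the cached block and records a miss when it is absent.
func (c *admissionCache) Get(key string) ([]byte, bool) {
	b, ok := c.blocks[key]
	if !ok {
		c.misses[key]++
	}
	return b, ok
}

// Put stores the block only if it has already missed often enough.
func (c *admissionCache) Put(key string, block []byte) {
	if c.misses[key] <= c.missesBeforeCaching {
		return // Requested too rarely so far; do not spend cache space on it.
	}
	delete(c.misses, key)
	c.blocks[key] = block
}

func main() {
	c := newAdmissionCache(2)
	for i := 1; i <= 4; i++ {
		_, hit := c.Get("block-1")
		if !hit {
			// Simulate reading the block from disk and offering it to the cache.
			c.Put("block-1", []byte("data"))
		}
		fmt.Printf("access %d: hit=%v\n", i, hit)
	}
}
```

With a threshold of 2, the first three accesses miss and the fourth one hits, so a block read only once or twice never occupies cache space. The sketch never evicts entries or expires miss counters, which a production cache would need to do.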