github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-11-21 14:44:00 +00:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	fba93dbe0b	vendor: run `make vendor-update`	2023-10-31 20:19:51 +01:00
Aliaksandr Valialkin	41a0fdaf39	app/vmselect/promql: optimize repeated SLI-like instant queries with lookbehind windows >= 1d Repeated instant queries with long lookbehind windows, which contain one of the following rollup functions, are optimized via partial result caching: - sum_over_time() - count_over_time() - avg_over_time() - increase() - rate() The basic idea of optimization is to calculate rf(m[d] @ t) as rf(m[offset] @ t) + rf(m[d] @ (t-offset)) - rf(m[offset] @ (t-d)) where rf(m[d] @ (t-offset)) is cached query result, which was calculated previously The offset may be in the range of up to 1 hour.	2023-10-31 19:25:23 +01:00
Aliaksandr Valialkin	c96fc05f3e	docs/Cluster-VictoriaMetrics.md: sync with cluster branch after `9d8f93050c`	2023-10-31 19:13:11 +01:00
Aliaksandr Valialkin	51aab7bb17	app/vmselect/promql: wrap too long line after `a950873fff`	2023-10-31 18:59:10 +01:00
Aliaksandr Valialkin	714af89b13	lib/httpserver: follow-up for `0638bbe69c` - Replace spaces with underscores in the `reason` label value for the vm_http_request_errors_total metric in order to be consistent with Prometheus-like naming - Clarify the description for the change at docs/CHANGELOG.md Updates https://github.com/victoriaMetrics/victoriaMetrics/issues/4590 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5166	2023-10-31 18:52:39 +01:00
Aliaksandr Valialkin	98699f203b	lib/persistentqueue: properly re-create flock.lock file inside directory if persistent queue is broken. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5249 Thanks to @Sniper91 for the bugreport and initial fix at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5233	2023-10-31 18:38:32 +01:00
Aliaksandr Valialkin	efb6ac27c2	lib/httpserver: call Request.Header() only once instead of calling it each time a new request header is set This is a follow-up for `ad839aa492`	2023-10-31 18:38:32 +01:00
Artem Navoiev	68f82b1c06	github actions: fix typo in hugo version Signed-off-by: Artem Navoiev <tenmozes@gmail.com>	2023-10-31 17:52:35 +01:00
Artem Navoiev	1020faa7b4	github actions: use 0.119 hugo version, since the latest version contains a bug Signed-off-by: Artem Navoiev <tenmozes@gmail.com>	2023-10-31 17:50:12 +01:00
Aliaksandr Valialkin	aade16f534	docs/Cluster-VictoriaMetrics.md: clarify the description on why -downsampling.period must be set at both vmstorage and vmselect This is a follow-up for `ca7457d906`	2023-10-31 16:53:46 +01:00
Aliaksandr Valialkin	a71efce784	docs/Single-server-VictoriaMetrics.md: cosmetic fixes after `23369321f1`	2023-10-31 16:42:29 +01:00
Aliaksandr Valialkin	4ac95b6f49	docs/CHANGELOG.md: move the description for -http.header.* command-line flags from SECURITY to FEATURE The SECURITY label should be applied only to changes, which fix security issues. The change at `ad839aa492` adds new command-line flags, which can be used for improving security in some cases. They do not fix any security issues. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5111	2023-10-31 16:23:08 +01:00
Aliaksandr Valialkin	7ac49162c6	lib/storage: follow-up for `29cebd82fb` Use atomic.CompareAndSwapUint32() instead of atomic.LoadUint32() followed by atomic.StoreUint32(). This makes the code more clear. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5159	2023-10-31 16:08:54 +01:00
hagen1778	f6208965ce	dashboards/cluster: fix description about `max` threshold for `Concurrent selects` panel. Before, it was mistakenly implying that `max` is equal to the double of available CPUs. Addresses https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5214 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-31 16:05:33 +01:00
Roman Khavronenko	a950873fff	app/vmselect: expose `vm_memory_intensive_queries_total` counter metric (#5208) The new metric gets increased each time `-search.logQueryMemoryUsage` memory limit is exceeded by a query. This metric should help to identify expensive and heavy queries without inspecting the logs. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-31 13:31:09 +01:00
hagen1778	a8051d48c4	docs: follow-up for `0638bbe69c` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-31 12:54:30 +01:00
venkatbvc	0638bbe69c	vmauth: add counter metrics for auth successes and failures (#5166) New labels `reason="wrong basic auth creds"` and `reason="wrong auth key"` were added to metric `vm_http_request_errors_total` to help identify auth errors. https://github.com/victoriaMetrics/victoriaMetrics/issues/4590 Co-authored-by: Rao, B V Chalapathi <b_v_chalapathi.rao@nokia.com> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2023-10-31 12:48:02 +01:00
hagen1778	aaf9e3d526	dashboards/vmalert: add new panel `Missed evaluations` The new panel is supposed to indicate alerting groups that miss their evaluations. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-31 10:35:19 +01:00
hagen1778	9866974a53	deployment/alerts: add `TooManyMissedIterations` alerting rule The new rule for vmalert is supposed to detect groups that miss their evaluations due to slow queries. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-31 10:35:18 +01:00
hagen1778	8874b525b7	dashboards: fix `Errors rate to Alertmanager` filter The panel `Errors rate to Alertmanager` had a `group` label filter applied to the expression, while the metric `vmalert_alerts_send_errors_total` doesn't have that label. This always resulted in empty results. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-31 10:16:45 +01:00
Roman Khavronenko	ca7457d906	docs: explain motivation behind having `-downsampling.period` on vmselect (#5205) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-30 19:03:36 +01:00
Roman Khavronenko	23369321f1	docs: mention information loss when downsampling gauges (#5204) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-30 15:29:06 +01:00
Hui Wang	abcb21aa5e	vmalert: fix alert firing state in replay mode (#5192) Fix possibly missing firing states for alerting rules in replay mode. Previously, if a firing stage was longer than a single query request range, e.g. for a rule with a big `for`, the alerting rule could not be detected as firing. Co-authored-by: hagen1778 <roman@victoriametrics.com>	2023-10-30 13:54:18 +01:00
hagen1778	e964df8039	docs/troubleshooting: mention issue with un-ordered labels See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5219#issuecomment-1773441711 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-30 13:53:14 +01:00
hagen1778	a64b37cf24	docs: rm mention of default values for security HTTP headers The headers, their corresponding flags are mentioned at https://docs.victoriametrics.com/#security Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-30 11:46:17 +01:00
Dima Lazerka	ad839aa492	lib/httpserver: add flags to specify HSTS / Frame-Options / CSP headers for httpserver (#5111) support `Strict-Transport-Security`, `Content-Security-Policy` and `X-Frame-Options` HTTP headers in all VictoriaMetrics components. The values for headers can be specified by users via the following flags: `-http.header.hsts`, `-http.header.csp` and `-http.header.frameOptions`. Co-authored-by: hagen1778 <roman@victoriametrics.com>	2023-10-30 11:33:38 +01:00
Roman Khavronenko	29cebd82fb	lib/storage: log warning about RO mode only on state change (#5191) Before, vmstorage would log the same message each second producing excessive amount of logs. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5159 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-30 10:52:57 +01:00
Aliaksandr Valialkin	9149353a36	app/vmui: change the order of tables at `Top queries` tab Move the most interesting table - queries with the most summary time to execute - to the top	2023-10-28 11:56:16 +02:00
Aliaksandr Valialkin	613b545dfd	lib/promscrape/discovery/kubernetes: propagate possible errors at newAPIWatcher() to the caller This allows substituting FATAL panics with recoverable runtime errors such as missing or invalid TLS CA file and/or missing/invalid /var/run/secrets/kubernetes.io/serviceaccount/namespace file. Now these errors are logged instead of PANIC'ing, so they can be fixed by updating the corresponding files without the need to restart vmagent. This is a follow-up for `90427abc65` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5243	2023-10-27 20:24:46 +02:00
Hui Wang	90427abc65	lib/promscrape/discovery/kubernetes: avoid possible panic if the given caFile under kubernetes.SDConfig.HTTPClientConfig does not exist (#5243) follow up `d5a599badc`	2023-10-27 20:20:22 +02:00
Aliaksandr Valialkin	632d788b63	lib/promscrape/discovery/kubernetes: stop all the url watchers, which belong to a particular groupWatcher, at once Previously url watchers for pod, service and node objects could be mistakenly closed when service discovery was set up only for endpoints and endpointslice roles, since watchers for these roles may start pod, service and node url watchers with nil apiWatcher passed to groupWatcher.startWatchersForRole(). Now all the url watchers, which belong to a particular groupWatcher, are stopped at once when this groupWatcher has no apiWatcher subscribers. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5216 The issue has been introduced in v1.93.5 when addressing https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4850	2023-10-27 13:51:35 +02:00
Hui Wang	7c90ce39cb	do not print redundant error logs when failed to scrape consul or no… (#5239 ) * do not print redundant error logs when failed to scrape consul or nomad target prometheus performs the same because it uses consul lib which just drops the error(`1806bcb38c/api/api.go (L1134)`)	2023-10-27 13:31:55 +08:00
hagen1778	3aec7eb44f	app/vmalert: remove unclear comment The timestamp alignment should be applied as a last step to keep the timestamp consistent. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-26 15:41:35 +02:00
Daria Karavaieva	b60bb1d98a	model list - isolation forest (#5235) * model list - isolation forest * curse of dimensionality * isol forest definition change, minor fixes * blank line fix	2023-10-26 12:25:54 +02:00
Aliaksandr Valialkin	68b1b3c4d4	Makefile: move build commands for vmalert-tool closer to vmalert This should simplify maintenance of Makefile commands related to vmalert. This is a follow-up for `dc28196237`	2023-10-26 08:07:50 +02:00
Aliaksandr Valialkin	cdbc06a639	lib/promscrape: do not add a suggestion for enabling TCP6 in error message when the dial address is TCPv4	2023-10-25 17:57:56 -07:00
Dima Lazerka	8b41b506c2	Revert "lib/promscrape: do not add a suggestion for enabling TCP6 in error message when the dial address is TCPv4" It broke CI (lint) This reverts commit `5464376d16`.	2023-10-25 16:24:31 -07:00
Aliaksandr Valialkin	5464376d16	lib/promscrape: do not add a suggestion for enabling TCP6 in error message when the dial address is TCPv4	2023-10-26 00:29:51 +02:00
Aliaksandr Valialkin	ac933cc423	lib/promscrape: properly track the number of updated service discovery routines inside Config.mustRestart() This is a follow-up for `d5a599badc`	2023-10-26 00:06:29 +02:00
Aliaksandr Valialkin	612dcf231a	lib/promauth: typo fix in the error message after `d5a599badc`: obtaine -> obtain	2023-10-25 23:38:00 +02:00
Aliaksandr Valialkin	d5a599badc	lib/promauth: follow-up for `e16d3f5639` - Make sure that invalid/missing TLS CA file or TLS client certificate files at vmagent startup don't prevent from processing the corresponding scrape targets after the file becomes correct, without the need to restart vmagent. Previously scrape targets with invalid TLS CA file or TLS client certificate files were permanently dropped after the first attempt to initialize them, and they didn't appear until the next vmagent reload or the next change in other places of the loaded scrape configs. - Make sure that TLS CA is properly re-loaded from file after it changes without the need to restart vmagent. Previously the old TLS CA was used until vmagent restart. - Properly handle errors during http request creation for the second attempt to send data to remote system at vmagent and vmalert. Previously failed request creation could result in nil pointer dereferencing, since the returned request is nil on error. - Add more context to the logged error during AWS sigv4 request signing before sending the data to -remoteWrite.url at vmagent. Previously it could miss details on the source of the request. - Do not create a new HTTP client per second when generating OAuth2 token needed to put in Authorization header of every http request issued by vmagent during service discovery or target scraping. Re-use the HTTP client instead until the corresponding scrape config changes. - Cache error at lib/promauth.Config.GetAuthHeader() in the same way as the auth header is cached, e.g. the error is cached for a second now. This should reduce load on CPU and OAuth2 server when auth header cannot be obtained because of temporary error. - Share tls.Config.GetClientCertificate function among multiple scrape targets with the same tls_config. Cache the loaded certificate and the error for one second. This should significantly reduce CPU load when scraping big number of targets with the same tls_config. 
- Allow loading TLS certificates from HTTP and HTTPS urls by specifying these urls at `tls_config->cert_file` and `tls_config->key_file`. - Improve test coverage at lib/promauth - Skip unreachable or invalid files specified at `scrape_config_files` during vmagent startup, since these files may become valid later. Previously vmagent was exiting in this case. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4959	2023-10-25 23:19:37 +02:00
Aliaksandr Valialkin	c22e3e7b1d	lib/promscrape/discovery/kubernetes/kubeconfig_test.go: make TestParseKubeConfigSuccess test code easier to follow	2023-10-25 23:17:18 +02:00
Aliaksandr Valialkin	eed5206376	lib/promauth: properly parse string contents for ca, cert and key fields at tls_config Previously yaml parser wasn't accepting string values for these fields, because it was mistakenly expecting a list of uint8 values instead.	2023-10-25 23:12:21 +02:00
Aliaksandr Valialkin	4afcb2a689	lib/promscrape: move duplicate code from functions, which collect ScrapeWork lists for distinct SD types into Config.getScrapeWorkGeneric() This removes more than 200 lines of duplicate code	2023-10-25 23:03:40 +02:00
Aliaksandr Valialkin	cb34d4440c	app/vmalert/config: fix flaky test TestParseBad It could return either `failed to read` or `failed to parse` errors depending on whether the given url can be loaded or not under the current environment	2023-10-25 21:30:55 +02:00
Aliaksandr Valialkin	42dd71bb63	all: consistently use %w instead of %s when an error is passed to fmt.Errorf() This allows consistently using errors.Is() for verifying whether the given error wraps some other known error.	2023-10-25 21:24:03 +02:00
Aliaksandr Valialkin	305c96e384	lib/workingsetcache: fix outdated comments for Load() and New() functions	2023-10-25 21:04:20 +02:00
hagen1778	c07909a20b	app/vmalert: fix typo in tests Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-25 16:28:27 +02:00
hagen1778	eed0c3c6b0	app/vmalert: fix tests after `a216fe6728` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-25 16:25:26 +02:00
hagen1778	a216fe6728	app/vmalert: follow-up after `c9375cac5e` Descriptions were updated in an attempt to make them clearer for readers, re-phrasing and linking missing docs. `eval_delay` was added to tests to verify it can be unmarshalled. `eval_delay` is now applied before timestamp alignment to make it more predictable. Before, if delay < interval the timestamp wouldn't be aligned. `eval_delay` and `eval_offset` were added to API output. `PreviouslySentSeriesToRW` converted to private `previouslySentSeriesToRW`. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-25 13:07:13 +02:00
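
Note on `41a0fdaf39`: the identity quoted in that commit message, rf(m[d] @ t) = rf(m[offset] @ t) + rf(m[d] @ (t-offset)) - rf(m[offset] @ (t-d)), can be checked on a toy series. The Go sketch below is not code from VictoriaMetrics. It assumes a sum_over_time-like function over one sample per integer timestamp, and the names `sumOverTime` and `sumOverTimeCached` are hypothetical, chosen only for illustration.

```go
package main

import "fmt"

// sumOverTime is a toy stand-in for sum_over_time(m[d] @ t).
// It sums the samples with timestamps in (t-d, t], where samples[i]
// is the value at integer timestamp i. Out-of-range timestamps are skipped.
func sumOverTime(samples []float64, t, d int) float64 {
	var sum float64
	for ts := t - d + 1; ts <= t; ts++ {
		if ts >= 0 && ts < len(samples) {
			sum += samples[ts]
		}
	}
	return sum
}

// sumOverTimeCached derives sum_over_time(m[d] @ t) from a cached result
// for sum_over_time(m[d] @ (t-offset)), touching only 2*offset samples.
// It assumes 0 <= offset <= d.
func sumOverTimeCached(samples []float64, cached float64, t, d, offset int) float64 {
	head := sumOverTime(samples, t, offset)   // rf(m[offset] @ t): new samples entering the window
	tail := sumOverTime(samples, t-d, offset) // rf(m[offset] @ (t-d)): old samples leaving the window
	return head + cached - tail
}

func main() {
	// Synthetic series with integer values, so float sums are exact.
	samples := make([]float64, 100)
	for i := range samples {
		samples[i] = float64(i % 7)
	}

	const t, d, offset = 90, 48, 5
	cached := sumOverTime(samples, t-offset, d) // previously calculated result

	fmt.Println("direct:", sumOverTime(samples, t, d))
	fmt.Println("cached:", sumOverTimeCached(samples, cached, t, d, offset))
}
```

Both printed values should agree, because the window (t-d, t] equals the cached window (t-offset-d, t-offset] with the oldest `offset` samples removed and the newest `offset` samples added. The sketch covers only the additive case; how the other listed rollup functions map onto the identity is not described in the commit message and is not modelled here.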


7131 commits