github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-11-21 14:44:00 +00:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	53d871d0b1	app/vmselect/netstorage: reduce tail latency during query processing Previously the selected time series were split evenly among available CPU cores for further processing - e.g. unpacking the data and applying the given rollup function to the unpacked data. Some time series could be processed slower than others. This could result in uneven work distribution among available CPU cores, e.g. some CPU cores could complete their work sooner than others. This could slow down query execution. The new algorithm allows stealing time series to process from other CPU cores when all the local work is done. This should reduce the maximum time needed for query execution (aka tail latency). The new algorithm should also scale better on systems with many CPU cores, since every CPU processes locally assigned time series without inter-CPU communications. The inter-CPU communications are used only when all the local work is finished and the pending work from other CPUs needs to be stolen.	2023-01-10 13:43:14 -08:00
Zakhar Bessarab	b2ccdaaa2f	lib/promscrape/discovery/gce: fix crash in case instance does not have any labels set (#3625 ) Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-01-10 11:07:11 +01:00
Denys Holius	b06e795a1e	docs/Release-Guide.md: added missed link to rpm repository (#3623 )	2023-01-10 10:14:56 +01:00
Aliaksandr Valialkin	e640ff72f1	app/vmselect/netstorage: reduce memory allocations when unpacking time series Unpack time series with less than 400K samples in the currently running goroutine. Previously a new goroutine was being started for unpacking the samples. This was requiring additional memory allocations.	2023-01-09 23:18:17 -08:00
Aliaksandr Valialkin	9a563a6aef	app/vmselect/promql: eliminate memory allocation when sorting values inside float64s	2023-01-09 23:06:46 -08:00
Aliaksandr Valialkin	30ed33fae0	app/vmselect/promql: pre-allocate memory for values to be merged in mergeTimeseries() This should reduce the number of memory re-allocations	2023-01-09 22:51:17 -08:00
Aliaksandr Valialkin	645c24dc5f	app/vmselect/promql: consistently intern series names obtained from marshalMetricNameSorted This reduces memory allocations when the returned series names are used as map keys later	2023-01-09 22:45:40 -08:00
Aliaksandr Valialkin	2f3ddd4884	app/vmselect/promql: avoid memory allocations and copying from source timeseries to the returned result at timeseriesToResult()	2023-01-09 22:38:59 -08:00
Aliaksandr Valialkin	26cf680468	app/vmselect/promql: remove memory allocations from sortMetricTags()	2023-01-09 22:22:15 -08:00
Aliaksandr Valialkin	4f0c11ee93	app/vmselect/promql: intern output series names inside timeseriesToResult() This reduces the number of memory allocations for repeated queries, which return (almost) the same set of time series.	2023-01-09 22:19:56 -08:00
Aliaksandr Valialkin	562d6bca08	app/vmselect/promql: intern output series names during normal aggregation	2023-01-09 22:15:24 -08:00
Aliaksandr Valialkin	21ee9a1fab	app/vmselect/promql: intern output series names during incremental aggregation This should reduce the number of memory allocations for repeated queries	2023-01-09 22:11:36 -08:00
Aliaksandr Valialkin	df2a494a7c	app/vmselect/netstorage: pre-allocate 4 block references per each time series during querying Usually the number of blocks returned per each time series during queries is around 4. So it is a good idea to pre-allocate 4 block references per time series in order to reduce the number of memory allocations.	2023-01-09 22:03:23 -08:00
Aliaksandr Valialkin	c5e0f527bc	app/vmselect/netstorage: cache canonical MetricName for time series returned from the storage This reduces memory allocations for repeated queries, which return (almost) the same set of time series.	2023-01-09 21:53:10 -08:00
Aliaksandr Valialkin	7afcca0c51	all: use metricsql.CompileRegexp instead of regexp.Compile for compiling regexps used in graphite queries This should speed up repeated queries, since metricsql.CompileRegexp returns regexps from the cache on subsequent calls for the same input regexp.	2023-01-09 21:43:08 -08:00
Aliaksandr Valialkin	67ab49baa9	vendor: `make vendor-update`	2023-01-09 21:34:34 -08:00
Aliaksandr Valialkin	e5eca54951	lib/promscrape/discovery/nomad: sync nomad_sd_configs fields with the Prometheus implementation See the list of configs supported by Prometheus at `f88a0a7d83/discovery/nomad/nomad.go (L76-L84)` - Removed "token" option. It can be set either via NOMAD_TOKEN env var or via `bearer_token` config option. - Removed "scheme" option. It is automatically detected depending on whether the `tls_config` is set. - Removed "services" and "tags" options, since they aren't supported by Prometheus. - Added "region" option. If it is missing, then the region is read from NOMAD_REGION env var. If this var is empty, then it is set to "global" in the same way as Nomad client does. See `865ee8d37c/api/api.go (L297)` and `865ee8d37c/api/api.go (L555-L556)` - If the "server" option is missing, then it is read from NOMAD_ADDR in the same way as Nomad client does - see `865ee8d37c/api/api.go (L294-L296)` This is a follow-up for `8aee209c53` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3367	2023-01-09 21:14:48 -08:00
Aliaksandr Valialkin	c38a10e143	app/vmselect/netstorage: eliminate memory allocation for sortBlocksHeap arg when calling mergeSortBlocks()	2023-01-09 21:08:51 -08:00
Aliaksandr Valialkin	1f9d605988	app/vmselect/netstorage: consistently select the sample with the biggest value out of samples with identical timestamps Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3333 This fix is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3620 , but doesn't slow down the common case with merging replicated data blocks so significantly. Benchmark results: Before the change: BenchmarkMergeSortBlocks/replicationFactor-1-4 13968 85643 ns/op 956.53 MB/s 1700 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-2-4 10806 109171 ns/op 1500.77 MB/s 2191 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-3-4 8887 130623 ns/op 1881.45 MB/s 2660 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-4-4 7440 157348 ns/op 2082.52 MB/s 3174 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-5-4 6534 184473 ns/op 2220.38 MB/s 3612 B/op 1 allocs/op BenchmarkMergeSortBlocks/overlapped-blocks-bestcase-4 13419 85205 ns/op 961.44 MB/s 2213 B/op 1 allocs/op BenchmarkMergeSortBlocks/overlapped-blocks-worstcase-4 579 1894900 ns/op 43.23 MB/s 46760 B/op 1 allocs/op After the change: BenchmarkMergeSortBlocks/replicationFactor-1-4 13832 85298 ns/op 960.40 MB/s 1716 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-2-4 8833 134222 ns/op 1220.66 MB/s 2675 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-3-4 6487 184830 ns/op 1329.65 MB/s 3636 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-4-4 4977 236318 ns/op 1386.61 MB/s 4733 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-5-4 4088 296734 ns/op 1380.36 MB/s 5761 B/op 1 allocs/op BenchmarkMergeSortBlocks/overlapped-blocks-bestcase-4 14083 84067 ns/op 974.47 MB/s 2110 B/op 1 allocs/op BenchmarkMergeSortBlocks/overlapped-blocks-worstcase-4 536 2043534 ns/op 40.09 MB/s 50511 B/op 1 allocs/op	2023-01-09 13:01:48 -08:00
Denys Holius	fe0e199859	deployment/docker: update Alertmanager tag from v0.24.0 to v0.25.0 in docker-compose files (#3619 ) deployment/docker: bump alertmanager to latest v0.25.0	2023-01-09 12:37:14 +01:00
Roman Khavronenko	8aee209c53	lib/promscrape: remove `datacenter` field from nomad_sd_config (#3612 ) Looks like `datacenter` field isn't part of `/v1/services` API. See https://developer.hashicorp.com/nomad/api-docs/services#list-services and https://developer.hashicorp.com/nomad/api-docs/services#read-service Related issues: https://github.com/traefik/traefik/issues/9109 https://github.com/prometheus/prometheus/issues/11776 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-01-09 09:07:40 +01:00
Aliaksandr Valialkin	28f8dc41b0	lib/promscrape/discoveryutils: cleanup after `5df9fddaf2` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3468	2023-01-07 01:26:54 -08:00
Zakhar Bessarab	5df9fddaf2	lib/promscrape/discoveryutils: use correct timeout for blocking requests (#3609 ) Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-01-07 01:13:03 -08:00
Aliaksandr Valialkin	41e00a0df7	lib/storage: simplify the fix from `488940502c` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3566	2023-01-07 01:04:43 -08:00
Dmytro Kozlov	488940502c	lib/storage: fix returning camelcase label names (#3608 ) * lib/storage: fix returning camelcase label names * doc: add change log * Update docs/CHANGELOG.md * Update docs/CHANGELOG.md Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-01-07 00:50:14 -08:00
Aliaksandr Valialkin	5fe7ff24c2	lib/streamaggr: limit the number of concurrent flushes of the aggregate data to the exact number of available CPUs This should reduce the maximum memory usage during concurrent flushes of the aggregate data	2023-01-07 00:18:51 -08:00
Aliaksandr Valialkin	ad5bfe3089	lib/promscrape: reduce the number of concurrently executed processScrapedData calls from 2x of the number of CPUs to the number of CPUs This should reduce the maximum memory usage for processScrapedData() function by 2x. The only part, which can be IO-bound in the processScrapedData() is pushData() call, when it buffers data to persistent queue if the remote storage cannot keep up with the data ingestion speed. In this case it is OK if the scrape pace will be limited.	2023-01-07 00:14:30 -08:00
Aliaksandr Valialkin	af263fe881	all: small improvements in error messages and command-line flag descriptions related to concurrency limiters	2023-01-07 00:11:44 -08:00
Aliaksandr Valialkin	45f39e291e	lib/writeconcurrencylimiter: moved the error generation from incConcurrency() to the caller place	2023-01-06 23:45:58 -08:00
Aliaksandr Valialkin	986a05e18d	lib/promscrape: limit the concurrency during parsing and relabeling the scraped samples This should reduce memory usage when scraping a big number of targets, since this limits the total memory usage during concurrent parsing and relabeling by the number of available CPU cores.	2023-01-06 22:59:17 -08:00
Aliaksandr Valialkin	293e4dc77b	app/{vminsert,vmstorage}: add comments on why storage.AddRows() is called without limiting the number of concurrent calls	2023-01-06 22:40:07 -08:00
Aliaksandr Valialkin	5c4bd4f7c1	lib/streamaggr: limit the number of concurrent flushes of aggregate metrics in order to limit memory usage	2023-01-06 22:39:13 -08:00
Aliaksandr Valialkin	c63755c316	lib/writeconcurrencylimiter: improve the logic behind -maxConcurrentInserts limit Previously the -maxConcurrentInserts was limiting the number of established client connections, which write data to VictoriaMetrics. Some of these connections could be idle. Such connections do not consume big amounts of CPU and RAM, so there is little sense in limiting the number of such connections. So now the -maxConcurrentInserts command-line option limits the number of concurrently executed insert requests, not including idle connections. It is recommended to remove the -maxConcurrentInserts command-line option, since the default value for this option should work well for most cases.	2023-01-06 22:20:19 -08:00
Aliaksandr Valialkin	f299d2ca1a	lib/vmselectapi: limit the number of concurrently executed requests This should prevent out of memory errors when a big number of vmselect nodes send many concurrent requests to vmstorage. The limit can be controlled at vmstorage via the following command-line flags: - search.maxConcurrentRequests - search.maxQueueDuration See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#resource-usage-limits	2023-01-06 22:11:34 -08:00
Aliaksandr Valialkin	e7637885a6	app/vmselect: improve error message when the request cannot be started because too many concurrent requests are already executed	2023-01-06 22:10:42 -08:00
Aliaksandr Valialkin	463b957e54	lib/promscrape/discovery/{consul,nomad}: wait until the deleted serviceWatchers are stopped inside updateServices() call Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3468 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3367	2023-01-05 21:52:33 -08:00
Aliaksandr Valialkin	f392913d00	lib/promscrape: follow-up after `bced9fb978` - Document the bugfix at docs/CHANGELOG.md - Wait until all the worker goroutines are done in consulWatcher.mustStop() - Do not log `context canceled` errors when discovering consul serviceNames - Removed explicit handling of gzipped responses at lib/promscrape/discoveryutils.Client, since this handling is automatically performed by net/http.Transport. See DisableCompression option at https://pkg.go.dev/net/http#Transport . - Remove explicit handling of the proxyURL, since it is automatically handled by net/http.Transport. See Proxy option at https://pkg.go.dev/net/http#Transport . - Explicitly set MaxIdleConnsPerHost, since its default value equals 2. Such a small value may result in excess tcp connection churn when more than 2 concurrent requests are processed by lib/promscrape/discoveryutils.Client. - Do not set explicitly the `Host` request header, since it is automatically set by net/http.Client. - Backport the bugfix to the recently added nomad_sd_configs - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3367 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3468	2023-01-05 21:13:06 -08:00
Zakhar Bessarab	bced9fb978	lib/promscrape/discoveryutils: switch to native http client from fasthttp (#3568 )	2023-01-05 19:34:47 -08:00
Roman Khavronenko	5bdd880142	vmstorage: add more context to the flock acquiring msg (#3584 ) See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3578 Signed-off-by: hagen1778 <roman@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-01-05 18:30:42 -08:00
Aliaksandr Valialkin	9f348cf8a1	lib/promscrape/discovery/nomad: follow-up after `48f371a46c` - Remove undocumented `username` and `password` config options from `nomad_sd_config`. TODO: probably, remove these options from `consul_sd_config` too? These options exist there for backwards compatibility purposes. - Add __meta_nomad_service_alloc_id and __meta_nomad_service_job_id meta-labels These labels contain AllocID and JobID fields for the discovered Nomad services. - Various typo fixes. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3367	2023-01-05 18:07:20 -08:00
Aliaksandr Valialkin	cad8553c01	Makefile: remove trailing space after `golangci-lint run` command It is left after `ec2c82e800`	2023-01-05 16:59:07 -08:00
Aliaksandr Valialkin	1a28f0e5b3	lib/promrelabel: pass query args via query string at /metric-relabel-debug and /target-relabel-debug pages if their length doesn't exceed 1000 This allows copy-n-pasting the url to another browser window and seeing the same result. The limit of 1000 chars is selected in order to prevent potential issues with systems which limit the url length such as Internet Explorer - see https://stackoverflow.com/questions/812925/what-is-the-maximum-possible-length-of-a-query-string If the limit is exceeded, then query args are sent via POST method and aren't visible in the url. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3580	2023-01-05 16:48:04 -08:00
Karan Sharma	48f371a46c	lib/promscrape: add Prometheus-compatible service discovery for Nomad (#3549 ) Add nomad_sd_config support for service discovery	2023-01-05 23:03:58 +01:00
Denys Holius	043b28c725	.github/workflows/nightly-build.yml: added dockerhub login (#3594 )	2023-01-05 16:54:14 +01:00
Luke Palmer	ec2c82e800	Lint and errcheck using golangci-lint (#3558 )	2023-01-05 16:12:46 +01:00
Zakhar Bessarab	01bc0c94ab	doc: add vmbackupmanager monitoring section (#3605 ) * doc: add vmbackupmanager monitoring section Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-01-05 16:03:06 +01:00
Thomas Danielsson	9d1104d812	dashboards: fix operator datasource variable (#3604 ) Got "Failed to upgrade legacy queries Datasource $ds was not found" in Grafana on operator dashboard. Its datasource variable was incorrectly named `datasource`. Also made the rest of the dashboards have homogeneous datasource-variable names and selections, matching vmagent dashboard.	2023-01-05 14:59:56 +01:00
Artem Navoiev	8b763175ff	Add Understand Your Setup Size Guide (#3572 ) docs: add Understand Your Setup Size Guide Signed-off-by: Artem Navoiev <tenmozes@gmail.com>	2023-01-05 14:56:50 +01:00
Aliaksandr Valialkin	2ee81a5dbb	docs/CHANGELOG.md: add missing dot	2023-01-05 03:35:02 -08:00
Zakhar Bessarab	185cdcd813	lib/promscrape/discovery/dockerswarm: fix query encoding of filters (#3586 ) Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-01-05 03:34:25 -08:00


5594 commits