github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2025-01-30 15:22:07 +00:00

Author	SHA1	Message	Date
Roman Khavronenko	cb23685681	app/vmselect: make vmselect resilient to absence of cache folder (#5987 ) vmselect uses a cache folder in file system for two purposes: 1. Storing rollup cache results on shutdown; 2. Storing temporary search results from vmstorage during query executions. It could happen that cache folder is deleted accidentally by user, or by OS during cleanup routines. This would cause vmselect to: 1. panic on /metrics call, because `MustGetFreeSpace` will fail; 2. return query error user, as it won't be able to store temporary search results. The changes in this commit are the following: 1. Make `MustGetFreeSpace` to try re-creating the cache folder if it is missing; 2. Make vmselect to try re-creating the cache folder if it can't persist tmp search results. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5985 Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2024-03-26 12:59:50 +01:00
Aliaksandr Valialkin	146fccc22d	app/vmselect/netstorage: usae unsafe.SliceData instead of deprecated reflect.SliceHeader	2024-02-29 17:36:28 +02:00
Aliaksandr Valialkin	6697da73e5	app: consistently use atomic.* types instead of atomic.* functions See `ea9e2b19a5`	2024-02-24 02:44:24 +02:00
Aliaksandr Valialkin	f46eaf92eb	app/vmselect: add -search.maxLabelsAPIDuration and -search.maxLabelsAPISeries options for fine-tuning CPU and RAM usage for /api/v1/series , /api/v1/labels and /api/v1/label/.../values This commit returns back limits for these endpoints, which have been removed at `5d66ee88bd` , since it has been appeared that missing limits result in high CPU usage, while the introduced concurrency limiter results in failed lightweight requests to these endpoints because of timeout when heavyweight requests are executed. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5055	2024-02-23 02:57:16 +02:00
Dan Dascalescu	17cf031fa1	app/vmselect: simplify wording for `too many samples` error (#5827 )	2024-02-20 16:26:38 +01:00
Aliaksandr Valialkin	0e3c532bf7	app/vmselect/netstorage: prevent from disk write IO when closing temporary files Remove temporary file before closing it in order to signal the OS that it shouldn't store the file contents from page cache to disk when the file is closed. Gracefully handle the case when the file cannot be removed before being closed - in this case remove the file after closing it. This allows working on Windows. Also remove superflouos opening of temporary file for reading - re-use already opened file handle for writing. This is a follow-up for `9b1e002287` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4020 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2024-02-01 19:12:44 +02:00
Aliaksandr Valialkin	db4623efc2	app/vmselect/netstorage: properly handle the case when an empty brsPool points to the end of brs.brs This case is possible after a new brsPool is allocated. The fix is to verify whether len(brsPool) >= len(brs.brs) before trying to append a new item to brsPool and sharing its contents with brs.brs. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5733	2024-01-31 10:27:50 +02:00
Aliaksandr Valialkin	1c58c00618	app/vmselect/netstorage: limit the initial size for brsPoolCap with 32Kb This should reduce the number of expensive memory allocations with sizes bigger than 32Kb	2024-01-23 22:29:39 +02:00
Aliaksandr Valialkin	43ecd5d258	app/vmselect/netstorage: pre-allocate memory for metricNamesBuf This should reduce the number of metricNamesBuf re-allocations in append()	2024-01-23 21:34:16 +02:00
Aliaksandr Valialkin	41456d9569	app/vmselect/netstorage: limit the maximum brsPool size to 32Kb at ProcessSearchQuery() This avoids slow path in Go runtime for allocating objects bigger than 32Kb - see `704401ffa0/src/runtime/malloc.go (L11)` This also reduces memory usage a bit for vmselect and single-node VictoriaMetrics after the commit `5dd37ad836` . Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527	2024-01-23 14:04:49 +02:00
Aliaksandr Valialkin	1f1768d7af	app/vmselect/netstorage: limit the size of metricNamesBuf to 32Kb in order to avoid slow path at Go runtime for allocating a byte slice of bigger size See `704401ffa0/src/runtime/malloc.go (L11)` This also reduces the average memory usage a bit for vmselect and single-node VictoriaMetrics after the commit `508c608062` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527	2024-01-23 13:46:37 +02:00
Aliaksandr Valialkin	e0399ec29a	app/vmselect/netstorage: remove tswPool, since it isnt efficient	2024-01-23 02:28:30 +02:00
Aliaksandr Valialkin	72a838a2a1	app/vmselect/netstorage: avoid metricName->blockRef lookup when processing multiple blocks for the same time series This saves a few CPU cycles for common case	2024-01-23 02:28:29 +02:00
Aliaksandr Valialkin	5dd37ad836	app/vmselect/netstorage: use []blockRef from blockRefPool in order to reduce memory allocations	2024-01-23 02:28:29 +02:00
Aliaksandr Valialkin	7345567c29	app/vmselect/netstorage: substitute pointer to blockRefs by brssPool index at the metricName->blockRefs map This should reduce the pressure on Go GC, since it will see lower number of pointers. This change has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527	2024-01-23 02:28:29 +02:00
Aliaksandr Valialkin	678234e9f0	app/vmselect/netstorage: reduce the number of allocations for blockRefs objects in ProcessSearchQuery() This should reduce pressure on Go GC at vmselect The change has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527	2024-01-23 02:28:28 +02:00
Aliaksandr Valialkin	508c608062	app/vmselect/netstorage: reduce the number of memory allocations in ProcessSearchQuery() by storing all the metric names in a single byte slice This reduces the number of memory allocations at the cost of possible memory usage increase, since now different metric name strings may hold references to the previous byte slice. This is good tradeoff, since ProcessSearchQuery is called in vmselect, and vmselect isn't usually limited by memory. This change has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527	2024-01-23 02:28:28 +02:00
Nikolay	c9f39fd51f	app/vmselect/netstorage (#5649 ) * app/vmselect/netstorage correctly handle errGlobal set * wip Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5649 --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-21 02:47:29 +02:00
Roman Khavronenko	b8b6e120ff	app/vmselect: limit the number of parallel workers by 32 (#5195 ) * app/vmselect: limit the number of parallel workers by 32 The change should improve performance and memory usage during query processing on machines with big number of CPU cores. The number of parallel workers for query processing is controlled via `-search.maxWorkersPerQuery` command-line flag. By default, the number of workers is limited by the number of available CPU cores, but not more than 32. The limit can be increased via `-search.maxWorkersPerQuery`. Signed-off-by: hagen1778 <roman@victoriametrics.com> * wip - The `-search.maxWorkersPerQuery` command-line flag doesn't limit resource usage, so move it from the `resource usage limits` to `troubleshooting` chapter at docs/Single-server-VictoriaMetrics.md - Make more clear the description for the `-search.maxWorkersPerQuery` command-line flag - Add the description of `-search.maxWorkersPerQuery` to docs/Cluster-VictoriaMetrics.md - Limit the maximum value, which can be passed to `-search.maxWorkersPerQuery`, to GOMAXPROCS, because bigger values may worsen query performance and increase CPU usage - Improve the the description of the change at docs/CHANGELOG.md. Mark it as FEATURE instead of BUGFIX, since it is closer to a feature than to a bugfix. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5087 --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-10-18 19:51:37 +02:00
Aliaksandr Valialkin	214be01dfa	app/vmselect/netstorage: remove duplicate `see` word from the error message This is a follow-up for `ac6c40e896` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4827	2023-08-14 02:05:44 -07:00
Aliaksandr Valialkin	ac6c40e896	all: refer to https://docs.victoriametrics.com/#resource-usage-limits in the error message about -search.max* limit Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4827	2023-08-14 01:57:34 -07:00
Aliaksandr Valialkin	45e345806c	app/vmselect/netstorage: remove runtime.Gosched() call from unpackWorker() This should improve scalability of unpackWorker() on systems with many CPU cores. This is a follow-up for `a2ecf4fa4a` and `16f3b279a2` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966	2023-07-06 10:05:58 -07:00
Aliaksandr Valialkin	036a7b7365	lib/fs: replace MkdirAllIfNotExist->MustMkdirIfNotExist and MkdirAllFailIfExist->MustMkdirFailIfExist Callers of these functions log the returned error and then exit. The returned error already contains the path to directory, which was failed to be created. So let's just log the error together with the call stack inside these functions. This leaves the debuggability of the returned error at the same level while allows simplifying the code at callers' side. While at it, properly use MustMkdirFailIfExist instead of MustMkdirIfNotExist inside inmemoryPart.MustStoreToDisk(). It is expected that the inmemoryPart.MustStoreToDick() must fail if there is already a directory under the given path.	2023-04-13 22:11:59 -07:00
Nikolay	9b1e002287	app/vmselect: properly remove temp files at windows system (#4020 ) With non-posix compliant systems it's not possible to remove unclosed files. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-27 18:10:15 -07:00
Aliaksandr Valialkin	5832242b44	app/vmselect/netstorage: reduce the contention at fs.ReaderAt stats collection on systems with big number of CPU cores This optimization is based on the profile provided at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966#issuecomment-1483208419	2023-03-25 16:37:07 -07:00
Aliaksandr Valialkin	a1e496ced6	app/vmselect/netstorage: document why runtime.Gosched() is removed at `28f054bb00` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966	2023-03-25 16:36:51 -07:00
Zakhar Bessarab	28f054bb00	vmselect/netstorage: remove direct calls to `Gosched` to reduce amount of locks for global scope using `runtime.Gosched` requires acquiring global lock to check if there are any other goroutines to perform tasks. with the latest versions of runtime it can pause running goroutines automatically without requiring to call `Gosched` directly. Updates #3966 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-03-25 16:34:03 -07:00
Aliaksandr Valialkin	70959d5dab	app/vmselect/netstorage: reduce the number of calls to runtime.Gosched() at timeseriesWorker() and unpackWorker() Call runtime.Gosched() only when there is a work to steal from other workers. Simplify the timeseriesWorker() and unpackWroker() code a bit by inlining stealTimeseriesWork() and stealUnpackWork(). This should reduce CPU usage when processing queries on systems with big number of CPU cores. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966	2023-03-20 20:31:02 -07:00
Aliaksandr Valialkin	4856a4cf5a	app/vmselect: optimize incremental aggregates a bit Substitute sync.Map with an ordinary slice indexed by workerID. This should reduce the overhead when updating the incremental aggregate state	2023-03-20 15:37:06 -07:00
Aliaksandr Valialkin	b5db69fe05	app/vmselect/netstorage: do not intern string representation of MetricName for time series received from vmstorage It has been appeared that this interning may lead to increased memory usage and increased CPU usage when vmselect performs queries, which select big number of time series. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3692 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3863	2023-03-12 00:52:35 -08:00
Oleksandr Redko	9fff48c3e3	app,lib: fix typos in comments (#3804 )	2023-02-13 13:27:13 +01:00
Aliaksandr Valialkin	be8fba9b6a	app/vmselect/netstorage: tune the number of blocks per series which should be unpacked by a single goroutine instead of spinning up multiple goroutines This reduces overhead on time series data unpacking for typical cases, this reducing CPU usage at vmselect	2023-01-12 09:31:44 -08:00
Aliaksandr Valialkin	53d871d0b1	app/vmselect/netstorage: reduce tail latency during query processing Previously the selected time series were split evenly among available CPU cores for further processing - e.g unpacking the data and applying the given rollup function to the unpacked data. Some time series could be processed slower than others. This could result in uneven work distribution among available CPU cores, e.g. some CPU cores could complete their work sooner than others. This could slow down query execution. The new algorithm allows stealing time series to process from other CPU cores when all the local work is done. This should reduce the maximum time needed for query execution (aka tail latency). The new algorithm should also scale better on systems with many CPU cores, since every CPU processes locally assigned time series without inter-CPU communications. The inter-CPU communications are used only when all the local work is finished and the pending work from other CPUs needs to be stealed.	2023-01-10 13:43:14 -08:00
Aliaksandr Valialkin	e640ff72f1	app/vmselect/netstorage: reduce memory allocations when unpacking time series Unpack time series with less than 400K samples in the currently running goroutine. Previously a new goroutine was being started for unpacking the samples. This was requiring additional memory allocations.	2023-01-09 23:18:17 -08:00
Aliaksandr Valialkin	df2a494a7c	app/vmselect/netstorage: pre-allocate 4 block references per each time series during querying Usually the number of blocks returned per each time series during queries is around 4. So it is a good idea to pre-allocate 4 block references per time series in order to reduce the number of memory allocations.	2023-01-09 22:03:23 -08:00
Aliaksandr Valialkin	c5e0f527bc	app/vmselect/netstorage: cache canonical MetricName for time series returned from the storage This reduces memory allocations for repeated queries, which return (almost) the same set of time series.	2023-01-09 21:53:10 -08:00
Aliaksandr Valialkin	7afcca0c51	all: use metricsql.CompileRegexp instead of regexp.Compile for compiling regexps used in graphite queries This should speed up repeated queries, since metricsql.CompileRegexp returns regexps from the cache on subsequent calls for the same input regexp.	2023-01-09 21:43:08 -08:00
Aliaksandr Valialkin	c38a10e143	app/vmselect/netstorage: eliminate memory allocation for sortBlocksHeap arg when calling mergeSortBlocks()	2023-01-09 21:08:51 -08:00
Aliaksandr Valialkin	1f9d605988	app/vmselect/netstorage: consistently select the sample with the biggest value out of samples with identical timestamps Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3333 This fix is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3620 , but doesn't slow down the common case with merging replicated data blocks so significantly. Benchmark results: Before the change: BenchmarkMergeSortBlocks/replicationFactor-1-4 13968 85643 ns/op 956.53 MB/s 1700 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-2-4 10806 109171 ns/op 1500.77 MB/s 2191 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-3-4 8887 130623 ns/op 1881.45 MB/s 2660 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-4-4 7440 157348 ns/op 2082.52 MB/s 3174 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-5-4 6534 184473 ns/op 2220.38 MB/s 3612 B/op 1 allocs/op BenchmarkMergeSortBlocks/overlapped-blocks-bestcase-4 13419 85205 ns/op 961.44 MB/s 2213 B/op 1 allocs/op BenchmarkMergeSortBlocks/overlapped-blocks-worstcase-4 579 1894900 ns/op 43.23 MB/s 46760 B/op 1 allocs/op After the change: BenchmarkMergeSortBlocks/replicationFactor-1-4 13832 85298 ns/op 960.40 MB/s 1716 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-2-4 8833 134222 ns/op 1220.66 MB/s 2675 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-3-4 6487 184830 ns/op 1329.65 MB/s 3636 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-4-4 4977 236318 ns/op 1386.61 MB/s 4733 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-5-4 4088 296734 ns/op 1380.36 MB/s 5761 B/op 1 allocs/op BenchmarkMergeSortBlocks/overlapped-blocks-bestcase-4 14083 84067 ns/op 974.47 MB/s 2110 B/op 1 allocs/op BenchmarkMergeSortBlocks/overlapped-blocks-worstcase-4 536 2043534 ns/op 40.09 MB/s 50511 B/op 1 allocs/op	2023-01-09 13:01:48 -08:00
Roman Khavronenko	7c0ae3a86a	lib/storage: keep sample with the biggest value on timestamp conflict (#3421 ) The change leaves raw sample with the biggest value for identical timestamps per each `-dedup.minScrapeInterval` discrete interval when the deduplication is enabled. ``` benchstat old.txt new.txt name old time/op new time/op delta DeduplicateSamples/minScrapeInterval=1s-10 817ns ± 2% 832ns ± 3% ~ (p=0.052 n=10+10) DeduplicateSamples/minScrapeInterval=2s-10 1.56µs ± 1% 2.12µs ± 0% +35.19% (p=0.000 n=9+7) DeduplicateSamples/minScrapeInterval=5s-10 1.32µs ± 3% 1.65µs ± 2% +25.57% (p=0.000 n=10+10) DeduplicateSamples/minScrapeInterval=10s-10 1.13µs ± 2% 1.50µs ± 1% +32.85% (p=0.000 n=10+10) name old speed new speed delta DeduplicateSamples/minScrapeInterval=1s-10 10.0GB/s ± 2% 9.9GB/s ± 3% ~ (p=0.052 n=10+10) DeduplicateSamples/minScrapeInterval=2s-10 5.24GB/s ± 1% 3.87GB/s ± 0% -26.03% (p=0.000 n=9+7) DeduplicateSamples/minScrapeInterval=5s-10 6.22GB/s ± 3% 4.96GB/s ± 2% -20.37% (p=0.000 n=10+10) DeduplicateSamples/minScrapeInterval=10s-10 7.28GB/s ± 2% 5.48GB/s ± 1% -24.74% (p=0.000 n=10+10) ``` https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3333 Signed-off-by: hagen1778 <roman@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-12-08 18:06:11 -08:00
Aliaksandr Valialkin	cae0f37edd	app/vmselect/netstorage: remove superflouos map lookup at ProcessSearchQuery This should reduce CPU usage a bit during querying	2022-11-18 13:40:04 +02:00
Aliaksandr Valialkin	c53b7e66ef	app/vmselect: improve performance scalability on multi-CPU systems for `/api/v1/export/...` endpoints	2022-10-01 22:05:43 +03:00
Aliaksandr Valialkin	343241680b	all: remove the remaining bits of io/ioutil The io/ioutil package is deprecated since Go1.16 - see https://tip.golang.org/doc/go1.16#ioutil VictoriaMetrics requires at least Go1.18, so it is time to remove the io/ioutil from source code This is a follow-up for `02ca2342ab`	2022-08-22 00:20:58 +03:00
Aliaksandr Valialkin	7478d423c5	app/vmselect/netstorage: cleanup after `92630c1ab4` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2896	2022-08-04 18:28:11 +03:00
Aliaksandr Valialkin	c2bd75926b	app/vmselect/netstorage: initialize tsw.rowsProcessed before calling tsw.f, since tsw.f can modify r.Timestamps and r.Values lengths	2022-07-30 00:39:36 +03:00
Aliaksandr Valialkin	19a0b4679a	app/vmselect/netstorage: re-use random generator used for series shuffle in Result.RunParallel This should reduce CPU usage needed for rand.Rand initialization	2022-07-30 00:30:37 +03:00
Aliaksandr Valialkin	92630c1ab4	app/vmselect/netstorage: improve the speed of queries over big number of time series on multi-CPU system Reduce inter-CPU communications when processing the query over big number of time series. This should improve performance for queries over big number of time series on systems with many CPU cores. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2896 Based on `b596ac3745` Thanks to @zqyzyq for the idea.	2022-07-25 09:18:44 +03:00
Aliaksandr Valialkin	159c2e15e3	app/vmselect/netstorage: optimize mergeSortBlocks() for the worst case when blocks contain interleaved samples	2022-07-12 12:31:38 +03:00
Aliaksandr Valialkin	8429d4af5a	app/vmselect/netstorage: add mergeSortBlocks benchmark for the worstcase	2022-07-12 12:31:36 +03:00
Aliaksandr Valialkin	cd09f583fe	app/vmselect/netstorage: add benchmarks for mergeSortBlocks This is a follow-up for `743ff84863`	2022-07-11 12:54:48 +03:00

1 2 3 4

155 commits