github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-11-21 14:44:00 +00:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	6a0cf2cd29	app/vmselect/netstorage: add a comment explaining why all the samples in block are taken into account when checking the -search.maxSamplesPerQuery limit Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5851 This is a follow-up for `b07a02c516`	2024-06-25 03:01:43 +02:00
Aliaksandr Valialkin	b07a02c516	Revert "app/vmselect: fix the way of counting raw samples in single query (#6464 )" This reverts commit `6e395048d3`. Reason for revert: the previous logic was correct. The purpose of the `-search.maxSamplesPerQuery` command-line flag is to limit the amount of CPU resources that a single query can consume - see https://docs.victoriametrics.com/#resource-usage-limits . VictoriaMetrics processes samples in blocks during querying - it reads the block, then unpacks it, then filters out samples outside the selected time range. This means that it _spends CPU time_ on reading and unpacking _all the samples_ in every block on the requested time range, even if only a single sample per block matches the given time range. The previous logic was effectively limiting the CPU time a single query could take. The new logic fails to limit the CPU time a single query could take in some pathological cases when only a small fraction of samples per requested block fits the requested time range. This allows performing multiplication DoS attacks by querying very narrow time ranges over historical blocks, which tend to be full. For example, if `-search.maxSamplesPerQuery` equals a billion, and the query requests a single sample out of 8K samples per block, then the query may unpack a billion such blocks without exceeding the limit, i.e. it may unpack and process 8K*1e9=8e12 samples. This is not what the resource usage limits were originally created for - see https://docs.victoriametrics.com/#resource-usage-limits Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5851 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6464	2024-06-25 02:43:57 +02:00
Hui Wang	6e395048d3	app/vmselect: fix the way of counting raw samples in single query (#6464 ) The limit is specified with the command-line flag `-search.maxSamplesPerQuery`. Previously, samples could be over-counted, and the query couldn't be fixed by reducing the time range. Addresses https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5851	2024-06-14 15:40:30 +02:00
Zakhar Bessarab	af3922b1df	lib/storage: add ability to use downsampling for the given series filter (#733 ) * lib/storage: add ability to use downsampling for the given series filter Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs: add information about downsampling filters Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs: fix MetricsQL filter Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/downsampling: treat missing downsampling filter as a bug Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/part_header: verify correctness of downsampling filters when opening partition Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/downsampling: save only appliable rules in part metadata Filter and save only rules which are appliable to partition based on MinTimestamp of stored data. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/downsampling: update log messages for final dedup Properly specify a reason of re-running deduplication for partition. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage: consistently use MaxTimestamp to determine deduplication/downsampling rules Using MinTimestamp leads to applying downsampling to parts which are only partially covered by downsampling rule. For example, partition covers range [1000-2000]. At t=2100 and rule offset 500 data with t=2100-500 => 1600 must be downsampled. The range check against MinTimestamp evaluates to true even though partition contains range which must not be downsampled - [1600:2000]. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * Follow-up - Apply the first matching downsampling period if multiple filters match the given time series. This allows fine-tuning the downsampling config for the specific needs. - Take into account downsampling filters during search queries. 
- Reduce the difference between community and enterprise branches. This should simplify further maintenance of these branches. - Properly parse series filters with colons inside them. - Document the feature at docs/CHANGELOG.md. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4960 --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-03-30 04:12:23 +02:00
Aliaksandr Valialkin	146fccc22d	app/vmselect/netstorage: use unsafe.SliceData instead of deprecated reflect.SliceHeader	2024-02-29 17:36:28 +02:00
Aliaksandr Valialkin	6697da73e5	app: consistently use atomic.* types instead of atomic.* functions See `ea9e2b19a5`	2024-02-24 02:44:24 +02:00
Aliaksandr Valialkin	f46eaf92eb	app/vmselect: add -search.maxLabelsAPIDuration and -search.maxLabelsAPISeries options for fine-tuning CPU and RAM usage for /api/v1/series , /api/v1/labels and /api/v1/label/.../values This commit restores the limits for these endpoints, which were removed at `5d66ee88bd` , since it turned out that the missing limits result in high CPU usage, while the introduced concurrency limiter causes lightweight requests to these endpoints to fail with timeouts while heavyweight requests are executed. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5055	2024-02-23 02:57:16 +02:00
Dan Dascalescu	17cf031fa1	app/vmselect: simplify wording for `too many samples` error (#5827 )	2024-02-20 16:26:38 +01:00
Aliaksandr Valialkin	db4623efc2	app/vmselect/netstorage: properly handle the case when an empty brsPool points to the end of brs.brs This case is possible after a new brsPool is allocated. The fix is to verify whether len(brsPool) >= len(brs.brs) before trying to append a new item to brsPool and sharing its contents with brs.brs. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5733	2024-01-31 10:27:50 +02:00
Aliaksandr Valialkin	1c58c00618	app/vmselect/netstorage: limit the initial size for brsPoolCap with 32Kb This should reduce the number of expensive memory allocations with sizes bigger than 32Kb	2024-01-23 22:29:39 +02:00
Aliaksandr Valialkin	43ecd5d258	app/vmselect/netstorage: pre-allocate memory for metricNamesBuf This should reduce the number of metricNamesBuf re-allocations in append()	2024-01-23 21:34:16 +02:00
Aliaksandr Valialkin	41456d9569	app/vmselect/netstorage: limit the maximum brsPool size to 32Kb at ProcessSearchQuery() This avoids slow path in Go runtime for allocating objects bigger than 32Kb - see `704401ffa0/src/runtime/malloc.go (L11)` This also reduces memory usage a bit for vmselect and single-node VictoriaMetrics after the commit `5dd37ad836` . Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527	2024-01-23 14:04:49 +02:00
Aliaksandr Valialkin	1f1768d7af	app/vmselect/netstorage: limit the size of metricNamesBuf to 32Kb in order to avoid slow path at Go runtime for allocating a byte slice of bigger size See `704401ffa0/src/runtime/malloc.go (L11)` This also reduces the average memory usage a bit for vmselect and single-node VictoriaMetrics after the commit `508c608062` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527	2024-01-23 13:46:37 +02:00
Aliaksandr Valialkin	e0399ec29a	app/vmselect/netstorage: remove tswPool, since it isn't efficient	2024-01-23 02:28:30 +02:00
Aliaksandr Valialkin	72a838a2a1	app/vmselect/netstorage: avoid metricName->blockRef lookup when processing multiple blocks for the same time series This saves a few CPU cycles for common case	2024-01-23 02:28:29 +02:00
Aliaksandr Valialkin	5dd37ad836	app/vmselect/netstorage: use []blockRef from blockRefPool in order to reduce memory allocations	2024-01-23 02:28:29 +02:00
Aliaksandr Valialkin	7345567c29	app/vmselect/netstorage: substitute pointer to blockRefs by brssPool index at the metricName->blockRefs map This should reduce the pressure on Go GC, since it will see lower number of pointers. This change has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527	2024-01-23 02:28:29 +02:00
Aliaksandr Valialkin	678234e9f0	app/vmselect/netstorage: reduce the number of allocations for blockRefs objects in ProcessSearchQuery() This should reduce pressure on Go GC at vmselect The change has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527	2024-01-23 02:28:28 +02:00
Aliaksandr Valialkin	508c608062	app/vmselect/netstorage: reduce the number of memory allocations in ProcessSearchQuery() by storing all the metric names in a single byte slice This reduces the number of memory allocations at the cost of possible memory usage increase, since now different metric name strings may hold references to the previous byte slice. This is good tradeoff, since ProcessSearchQuery is called in vmselect, and vmselect isn't usually limited by memory. This change has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527	2024-01-23 02:28:28 +02:00
Nikolay	c9f39fd51f	app/vmselect/netstorage (#5649 ) * app/vmselect/netstorage correctly handle errGlobal set * wip Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5649 --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-21 02:47:29 +02:00
Roman Khavronenko	b8b6e120ff	app/vmselect: limit the number of parallel workers by 32 (#5195 ) * app/vmselect: limit the number of parallel workers by 32 The change should improve performance and memory usage during query processing on machines with a big number of CPU cores. The number of parallel workers for query processing is controlled via the `-search.maxWorkersPerQuery` command-line flag. By default, the number of workers is limited by the number of available CPU cores, but not more than 32. The limit can be increased via `-search.maxWorkersPerQuery`. Signed-off-by: hagen1778 <roman@victoriametrics.com> * wip - The `-search.maxWorkersPerQuery` command-line flag doesn't limit resource usage, so move it from the `resource usage limits` to `troubleshooting` chapter at docs/Single-server-VictoriaMetrics.md - Clarify the description for the `-search.maxWorkersPerQuery` command-line flag - Add the description of `-search.maxWorkersPerQuery` to docs/Cluster-VictoriaMetrics.md - Limit the maximum value, which can be passed to `-search.maxWorkersPerQuery`, to GOMAXPROCS, because bigger values may worsen query performance and increase CPU usage - Improve the description of the change at docs/CHANGELOG.md. Mark it as FEATURE instead of BUGFIX, since it is closer to a feature than to a bugfix. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5087 --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-10-18 19:51:37 +02:00
Aliaksandr Valialkin	214be01dfa	app/vmselect/netstorage: remove duplicate `see` word from the error message This is a follow-up for `ac6c40e896` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4827	2023-08-14 02:05:44 -07:00
Aliaksandr Valialkin	ac6c40e896	all: refer to https://docs.victoriametrics.com/#resource-usage-limits in the error message about -search.max* limit Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4827	2023-08-14 01:57:34 -07:00
Aliaksandr Valialkin	45e345806c	app/vmselect/netstorage: remove runtime.Gosched() call from unpackWorker() This should improve scalability of unpackWorker() on systems with many CPU cores. This is a follow-up for `a2ecf4fa4a` and `16f3b279a2` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966	2023-07-06 10:05:58 -07:00
Aliaksandr Valialkin	a1e496ced6	app/vmselect/netstorage: document why runtime.Gosched() is removed at `28f054bb00` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966	2023-03-25 16:36:51 -07:00
Zakhar Bessarab	28f054bb00	vmselect/netstorage: remove direct calls to `Gosched` to reduce the number of global locks. Using `runtime.Gosched` requires acquiring a global lock to check whether there are any other goroutines waiting to run. Recent versions of the Go runtime can pause running goroutines automatically without calling `Gosched` directly. Updates #3966 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-03-25 16:34:03 -07:00
Aliaksandr Valialkin	70959d5dab	app/vmselect/netstorage: reduce the number of calls to runtime.Gosched() at timeseriesWorker() and unpackWorker() Call runtime.Gosched() only when there is work to steal from other workers. Simplify the timeseriesWorker() and unpackWorker() code a bit by inlining stealTimeseriesWork() and stealUnpackWork(). This should reduce CPU usage when processing queries on systems with a big number of CPU cores. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966	2023-03-20 20:31:02 -07:00
Aliaksandr Valialkin	4856a4cf5a	app/vmselect: optimize incremental aggregates a bit Substitute sync.Map with an ordinary slice indexed by workerID. This should reduce the overhead when updating the incremental aggregate state	2023-03-20 15:37:06 -07:00
Aliaksandr Valialkin	b5db69fe05	app/vmselect/netstorage: do not intern string representation of MetricName for time series received from vmstorage It turned out that this interning may lead to increased memory usage and increased CPU usage when vmselect performs queries that select a big number of time series. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3692 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3863	2023-03-12 00:52:35 -08:00
Oleksandr Redko	9fff48c3e3	app,lib: fix typos in comments (#3804 )	2023-02-13 13:27:13 +01:00
Aliaksandr Valialkin	be8fba9b6a	app/vmselect/netstorage: tune the number of blocks per series which should be unpacked by a single goroutine instead of spinning up multiple goroutines This reduces overhead on time series data unpacking for typical cases, thus reducing CPU usage at vmselect	2023-01-12 09:31:44 -08:00
Aliaksandr Valialkin	53d871d0b1	app/vmselect/netstorage: reduce tail latency during query processing Previously the selected time series were split evenly among available CPU cores for further processing - e.g. unpacking the data and applying the given rollup function to the unpacked data. Some time series could be processed slower than others. This could result in uneven work distribution among available CPU cores, e.g. some CPU cores could complete their work sooner than others. This could slow down query execution. The new algorithm allows stealing time series to process from other CPU cores when all the local work is done. This should reduce the maximum time needed for query execution (aka tail latency). The new algorithm should also scale better on systems with many CPU cores, since every CPU processes locally assigned time series without inter-CPU communications. The inter-CPU communications are used only when all the local work is finished and the pending work from other CPUs needs to be stolen.	2023-01-10 13:43:14 -08:00
Aliaksandr Valialkin	e640ff72f1	app/vmselect/netstorage: reduce memory allocations when unpacking time series Unpack time series with less than 400K samples in the currently running goroutine. Previously a new goroutine was being started for unpacking the samples. This was requiring additional memory allocations.	2023-01-09 23:18:17 -08:00
Aliaksandr Valialkin	df2a494a7c	app/vmselect/netstorage: pre-allocate 4 block references per each time series during querying Usually the number of blocks returned per each time series during queries is around 4. So it is a good idea to pre-allocate 4 block references per time series in order to reduce the number of memory allocations.	2023-01-09 22:03:23 -08:00
Aliaksandr Valialkin	c5e0f527bc	app/vmselect/netstorage: cache canonical MetricName for time series returned from the storage This reduces memory allocations for repeated queries, which return (almost) the same set of time series.	2023-01-09 21:53:10 -08:00
Aliaksandr Valialkin	7afcca0c51	all: use metricsql.CompileRegexp instead of regexp.Compile for compiling regexps used in graphite queries This should speed up repeated queries, since metricsql.CompileRegexp returns regexps from the cache on subsequent calls for the same input regexp.	2023-01-09 21:43:08 -08:00
Aliaksandr Valialkin	c38a10e143	app/vmselect/netstorage: eliminate memory allocation for sortBlocksHeap arg when calling mergeSortBlocks()	2023-01-09 21:08:51 -08:00
Aliaksandr Valialkin	1f9d605988	app/vmselect/netstorage: consistently select the sample with the biggest value out of samples with identical timestamps Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3333 This fix is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3620 , but doesn't slow down the common case with merging replicated data blocks so significantly. Benchmark results: Before the change: BenchmarkMergeSortBlocks/replicationFactor-1-4 13968 85643 ns/op 956.53 MB/s 1700 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-2-4 10806 109171 ns/op 1500.77 MB/s 2191 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-3-4 8887 130623 ns/op 1881.45 MB/s 2660 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-4-4 7440 157348 ns/op 2082.52 MB/s 3174 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-5-4 6534 184473 ns/op 2220.38 MB/s 3612 B/op 1 allocs/op BenchmarkMergeSortBlocks/overlapped-blocks-bestcase-4 13419 85205 ns/op 961.44 MB/s 2213 B/op 1 allocs/op BenchmarkMergeSortBlocks/overlapped-blocks-worstcase-4 579 1894900 ns/op 43.23 MB/s 46760 B/op 1 allocs/op After the change: BenchmarkMergeSortBlocks/replicationFactor-1-4 13832 85298 ns/op 960.40 MB/s 1716 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-2-4 8833 134222 ns/op 1220.66 MB/s 2675 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-3-4 6487 184830 ns/op 1329.65 MB/s 3636 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-4-4 4977 236318 ns/op 1386.61 MB/s 4733 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-5-4 4088 296734 ns/op 1380.36 MB/s 5761 B/op 1 allocs/op BenchmarkMergeSortBlocks/overlapped-blocks-bestcase-4 14083 84067 ns/op 974.47 MB/s 2110 B/op 1 allocs/op BenchmarkMergeSortBlocks/overlapped-blocks-worstcase-4 536 2043534 ns/op 40.09 MB/s 50511 B/op 1 allocs/op	2023-01-09 13:01:48 -08:00
Aliaksandr Valialkin	cae0f37edd	app/vmselect/netstorage: remove superfluous map lookup at ProcessSearchQuery This should reduce CPU usage a bit during querying	2022-11-18 13:40:04 +02:00
Aliaksandr Valialkin	c53b7e66ef	app/vmselect: improve performance scalability on multi-CPU systems for `/api/v1/export/...` endpoints	2022-10-01 22:05:43 +03:00
Aliaksandr Valialkin	7478d423c5	app/vmselect/netstorage: cleanup after `92630c1ab4` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2896	2022-08-04 18:28:11 +03:00
Aliaksandr Valialkin	c2bd75926b	app/vmselect/netstorage: initialize tsw.rowsProcessed before calling tsw.f, since tsw.f can modify r.Timestamps and r.Values lengths	2022-07-30 00:39:36 +03:00
Aliaksandr Valialkin	19a0b4679a	app/vmselect/netstorage: re-use random generator used for series shuffle in Result.RunParallel This should reduce CPU usage needed for rand.Rand initialization	2022-07-30 00:30:37 +03:00
Aliaksandr Valialkin	92630c1ab4	app/vmselect/netstorage: improve the speed of queries over big number of time series on multi-CPU system Reduce inter-CPU communications when processing the query over big number of time series. This should improve performance for queries over big number of time series on systems with many CPU cores. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2896 Based on `b596ac3745` Thanks to @zqyzyq for the idea.	2022-07-25 09:18:44 +03:00
Aliaksandr Valialkin	159c2e15e3	app/vmselect/netstorage: optimize mergeSortBlocks() for the worst case when blocks contain interleaved samples	2022-07-12 12:31:38 +03:00
Aliaksandr Valialkin	cd09f583fe	app/vmselect/netstorage: add benchmarks for mergeSortBlocks This is a follow-up for `743ff84863`	2022-07-11 12:54:48 +03:00
Aliaksandr Valialkin	743ff84863	app/vmselect/netstorage: optimize mergeSortBlocks function - Use binary search instead of linear scan when locating the run of smallest timestamps in blocks with intersected time ranges. This should improve performance when merging blocks with big number of samples - Skip samples with duplicate timestamps. This should increase query performance in cluster version of VictoriaMetrics with the enabled replication.	2022-07-09 00:34:42 +03:00
Aliaksandr Valialkin	77cbbacfdb	lib/vmselectapi: pass storage.SearchQuery to API calls instead of []*storage.TagFilters + storage.TimeRange + maxMetrics This reduces the number of args to vmselectapi calls	2022-07-06 12:37:54 +03:00
Aliaksandr Valialkin	f435924ab3	lib/vmselectapi: pass maxSuffixes arg to tagValueSuffixes RPC call	2022-07-06 12:37:54 +03:00
Aliaksandr Valialkin	e1b8059086	lib/vmselectapi: rename deleteMetrics to more correct deleteSeries	2022-07-06 12:37:54 +03:00
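
The reasoning in `b07a02c516` above (every sample in a block costs CPU time, even when only one sample matches the time range) can be illustrated with a small Go sketch. This is not the vmselect code: the `block` type and the `countSamples` function are hypothetical names introduced only for this illustration, and the numbers are scaled down from the 8K*1e9 example in the commit message to 1000 blocks of 8192 samples each.

```go
package main

import "fmt"

// block is a hypothetical stand-in for a storage block:
// a run of sample timestamps that must be unpacked as a whole.
type block struct {
	timestamps []int64
}

// countSamples returns the number of samples that had to be unpacked
// and the number of samples that fall into the time range [start, end].
func countSamples(blocks []block, start, end int64) (unpacked, matched int) {
	for _, b := range blocks {
		// The whole block is read and unpacked before filtering,
		// so every sample in it contributes to CPU usage.
		unpacked += len(b.timestamps)
		for _, ts := range b.timestamps {
			if ts >= start && ts <= end {
				matched++
			}
		}
	}
	return unpacked, matched
}

func main() {
	const samplesPerBlock = 8192
	// Assume 1000 series, each contributing one full block
	// that covers the timestamps 0..8191.
	blocks := make([]block, 1000)
	for i := range blocks {
		ts := make([]int64, samplesPerBlock)
		for j := range ts {
			ts[j] = int64(j)
		}
		blocks[i].timestamps = ts
	}
	// A very narrow time range selects a single sample per block.
	unpacked, matched := countSamples(blocks, 100, 100)
	fmt.Printf("unpacked=%d matched=%d\n", unpacked, matched)
	// Prints: unpacked=8192000 matched=1000
}
```

A limit checked against `matched` would admit 8192 times more unpacking work in this sketch than a limit checked against `unpacked`, which is the gap the revert closes.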

139 commits
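
Commit `1f9d605988` in the list above describes consistently selecting the sample with the biggest value among samples with identical timestamps. A minimal Go sketch of that rule follows. It assumes the timestamps are already sorted in ascending order and ignores NaN handling. The `dedupKeepMax` function is a hypothetical name for this illustration and is not the `mergeSortBlocks()` implementation.

```go
package main

import "fmt"

// dedupKeepMax collapses samples with identical timestamps into one sample
// that carries the biggest value. It assumes that timestamps are sorted in
// ascending order and that len(timestamps) == len(values).
func dedupKeepMax(timestamps []int64, values []float64) ([]int64, []float64) {
	if len(timestamps) == 0 {
		return nil, nil
	}
	dstT := []int64{timestamps[0]}
	dstV := []float64{values[0]}
	for i := 1; i < len(timestamps); i++ {
		last := len(dstT) - 1
		if timestamps[i] == dstT[last] {
			// Duplicate timestamp: keep the biggest value seen so far.
			if values[i] > dstV[last] {
				dstV[last] = values[i]
			}
			continue
		}
		dstT = append(dstT, timestamps[i])
		dstV = append(dstV, values[i])
	}
	return dstT, dstV
}

func main() {
	ts := []int64{10, 10, 20, 30, 30}
	vs := []float64{1, 3, 5, 7, 2}
	outT, outV := dedupKeepMax(ts, vs)
	fmt.Println(outT, outV)
	// Prints: [10 20 30] [3 5 7]
}
```

Picking the maximum makes the result independent of the order in which replicated blocks arrive, so repeated queries return the same sample for a duplicated timestamp.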