Commit graph

208 commits

Author SHA1 Message Date
Aliaksandr Valialkin
7a9f0b32a2
app/vmselect/netstorage: prevent from disk write IO when closing temporary files
Remove temporary file before closing it in order to signal the OS that it shouldn't
store the file contents from page cache to disk when the file is closed.

Gracefully handle the case when the file cannot be removed before being closed -
in this case remove the file after closing it. This allows working on Windows.

Also remove superflouos opening of temporary file for reading - re-use already opened file handle for writing.

This is a follow-up for 9b1e002287
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4020
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70
2024-02-01 19:54:48 +02:00
Aliaksandr Valialkin
4f5cb17042
app/vmselect/netstorage: properly handle the case when an empty brsPool points to the end of brs.brs
This case is possible after a new brsPool is allocated. The fix is to verify whether len(brsPool) >= len(brs.brs)
before trying to append a new item to brsPool and sharing its contents with brs.brs.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5733
2024-01-31 10:31:51 +02:00
Aliaksandr Valialkin
d8c82b6421
app/vmselect/netstorage: initialize tmpBlocksFileWrapper at goroutine, which continues using it
This may improve CPU cache locality
2024-01-26 21:29:30 +01:00
Aliaksandr Valialkin
cfc1193d15
app/vmselect/netstorage: limit the maximum brsPool size to 32Kb at ProcessSearchQuery()
This avoids slow path in Go runtime for allocating objects bigger than 32Kb -
see 704401ffa0/src/runtime/malloc.go (L11)

This also reduces memory usage a bit for vmselect and single-node VictoriaMetrics
after the commit 5dd37ad836 .

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527
2024-01-23 14:12:27 +02:00
Aliaksandr Valialkin
fe4ea30a79
app/vmselect/netstorage: limit the size of metricNamesBuf to 32Kb in order to avoid slow path at Go runtime for allocating a byte slice of bigger size
See 704401ffa0/src/runtime/malloc.go (L11)

This also reduces the average memory usage a bit for vmselect and single-node VictoriaMetrics
after the commit 508c608062

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527
2024-01-23 13:50:59 +02:00
Aliaksandr Valialkin
953b96ced2
app/vmselect/netstorage: remove tswPool, since it isnt efficient 2024-01-23 02:29:13 +02:00
Aliaksandr Valialkin
68a59bfabd
app/vmselect/netstorage: avoid metricName->blockRef lookup when processing multiple blocks for the same time series
This saves a few CPU cycles for common case
2024-01-23 02:29:12 +02:00
Aliaksandr Valialkin
f8a9ef8cbd
app/vmselect/netstorage: group per-vmstorage fields at tmpBlocksFileWrapperShard
This improves code readability a bit
2024-01-23 02:29:12 +02:00
Aliaksandr Valialkin
d52b121222
app/vmselect/netstorage: use []blockRef from blockRefPool in order to reduce memory allocations 2024-01-23 02:29:11 +02:00
Aliaksandr Valialkin
5b05224eb9
app/vmselect/netstorage: substitute pointer to blockRefs by brssPool index at the metricName->blockRefs map
This should reduce the pressure on Go GC, since it will see lower number of pointers.

This change has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527
2024-01-23 02:29:11 +02:00
Aliaksandr Valialkin
b289f15f02
app/vmselect/netstorage: reduce the number of allocations for blockRefs objects in ProcessSearchQuery()
This should reduce pressure on Go GC at vmselect

The change has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527
2024-01-23 02:29:11 +02:00
Aliaksandr Valialkin
2ab9a75cca
app/vmselect/netstorage: reduce the number of memory allocations in ProcessSearchQuery() by storing all the metric names in a single byte slice
This reduces the number of memory allocations at the cost of possible memory usage increase,
since now different metric name strings may hold references to the previous byte slice.
This is good tradeoff, since ProcessSearchQuery is called in vmselect, and vmselect isn't usually limited by memory.

This change has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527
2024-01-23 02:29:10 +02:00
Aliaksandr Valialkin
c888d76c4b
app/vmselect/netstorage: make sure that at least a single result is collected from every storage group before deciding whether it is OK to skip results from the remaining storage nodes 2023-12-20 19:53:49 +02:00
Aliaksandr Valialkin
e4bb2808f1
app/vmselect: add support for vmstorage groups with independent -replicationFactor per group
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5197

See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#vmstorage-groups-at-vmselect

Thanks to @zekker6 for the initial pull request at https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/pull/718
2023-12-13 00:14:34 +02:00
Aliaksandr Valialkin
5b7f40907e
app/vmselect/netstorage: do not retry request when deadline is exceeded 2023-11-14 19:57:29 +01:00
Roman Khavronenko
cd2247b24a
app/vmselect: limit the number of parallel workers by 32 (#5195)
* app/vmselect: limit the number of parallel workers by 32

The change should improve performance and memory usage during query processing
on machines with big number of CPU cores. The number of parallel workers for
query processing is controlled via `-search.maxWorkersPerQuery` command-line flag.
By default, the number of workers is limited by the number of available CPU cores,
but not more than 32. The limit can be increased via `-search.maxWorkersPerQuery`.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* wip

- The `-search.maxWorkersPerQuery` command-line flag doesn't limit resource usage,
  so move it from the `resource usage limits` to `troubleshooting` chapter at docs/Single-server-VictoriaMetrics.md

- Make more clear the description for the `-search.maxWorkersPerQuery` command-line flag

- Add the description of `-search.maxWorkersPerQuery` to docs/Cluster-VictoriaMetrics.md

- Limit the maximum value, which can be passed to `-search.maxWorkersPerQuery`, to GOMAXPROCS,
  because bigger values may worsen query performance and increase CPU usage

- Improve the the description of the change at docs/CHANGELOG.md. Mark it as FEATURE instead of BUGFIX,
  since it is closer to a feature than to a bugfix.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5087

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-10-26 09:15:27 +02:00
Aliaksandr Valialkin
9c3a37597c
app/vmselect/netstorage: run make fmt after 58326dbf25 2023-09-10 15:18:15 +02:00
Aliaksandr Valialkin
58326dbf25
app/vmselect: return 503 status code when partial responses are denied and some of vmstorage nodes are temporarily unavailable
This should help detecting this case and automatic retrying the query at healthy cluster replica
in another availability zone.

This commit is needed as a preparation for automatic query retry at another backend at vmauth on 5xx errors
as described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4792#issuecomment-1674338561
2023-09-07 16:07:06 +02:00
Aliaksandr Valialkin
d8afd7fe98
Makefile: update golangci-lint from v1.51.2 to v1.54.2
See https://github.com/golangci/golangci-lint/releases/tag/v1.54.2
2023-09-01 10:25:49 +02:00
Aliaksandr Valialkin
19d61737c1
app/{vminsert,vmselect}: follow-up after 2b7b3293c1
- Document the change at docs/CHANGELOG.md
- Set the default value for -vmstorageUserTimeout to 3 seconds. This is much better
  than the 0 value, which means that TCP connection to unreachable vmstorage could block
  for up to 16 minutes.
- Document -vmstorageUserTimeout at docs/Cluster-VictoriaMetrics.md
2023-08-29 12:17:39 +02:00
Will Jordan
2b7b3293c1
Add vmstorageUserTimeout flags to configure TCP user timeout (Linux) (#4423)
`TCP_USER_TIMEOUT` (since Linux 2.6.37) specifies the maximum amount of
time that transmitted data may remain unacknowledged before TCP will
forcibly close the connection and return `ETIMEDOUT` to the application.

Setting a low TCP user timeout allows RPC connections quickly reroute
around unavailable storage nodes during network interruptions.
2023-08-29 11:46:39 +02:00
Aliaksandr Valialkin
992c300ce9
all: replace atomic.Value with atomic.Pointer[T]
This eliminates the need in .(*T) casting for results obtained from Load()

Leave atomic.Value for map, since atomic.Pointer[map[...]...] makes double pointer to map,
because map is already a pointer type.
2023-07-19 17:48:26 -07:00
Aliaksandr Valialkin
e1a2404db5
app/vmselect/netstorage: follow-up after 173ccf4333
- Clarify docs about -replicationFactor command-line flag at vmselect
- Clarify description for -replicationFactor and -search.skipSlowReplicas command-line flags
- Fix the logic for returning responses if -search.skipSlowReplicas command-line flag
  is enabled. The logic was broken in the 173ccf4333,
  so it could return responses only if some of vmstorage nodes return error,
  while it should return when query results are successfully collected from more than
  (len(storageNodes) - replicationFactor) vmstorage nodes.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1207
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711
2023-07-09 11:58:22 -07:00
Haleygo
14e242d0b9
vmselect: fix result collect count (#4599) 2023-07-08 08:21:27 +02:00
Roman Khavronenko
173ccf4333
vmselect: introduce search.skipSlowReplicas cmd-line flag (#4538)
* vmselect: introduce `search.skipSlowReplicas` cmd-line flag

vmselect has two logical conditions during request processing when
`-replicationFactor` cmd-line flag is set:
1. If at least `len(storageNodes) - replicationFactor` responded, it could skip
waiting for the rest of nodes to respond. This could lead to problems described
here https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1207.
2. Mark response as partial if less than `len(storageNodes) - replicationFactor` responded
without an error.

The P1 showed itself error-prone and became the main reason why
`-replicationFactor` wasn't recommended to use at vmselect level.
However, this optimization could be still very useful in situations
when there are slow and fast replicas in cluster.

But P2 remains viable and important conditionless.
Hiding P1 behind the feature-flag `search.skipSlowReplicas`
should make `-replicationFactor` flag usable again. And let users
choose whether they want P1 to be respected.

Related issues
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1207
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* docs: update changelog

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-07-07 11:50:26 +02:00
Aliaksandr Valialkin
eb47ad4b69
app/vmselect/netstorage: remove runtime.Gosched() call from unpackWorker()
This should improve scalability of unpackWorker() on systems with many CPU cores.
This is a follow-up for a2ecf4fa4a and 16f3b279a2

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966
2023-07-06 10:07:42 -07:00
Aliaksandr Valialkin
ec75d9097d
app/vmselect/netstorage: follow-up after 11ac551d52
- Clarify the scope of the fix at docs/CHANGELOG.md
- Handle the case when -search.maxSamplesPerSeries limit is exceeded
  in the same way as the -search.maxSamplesPerQuery limit.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4472
2023-07-05 21:13:34 -07:00
Aliaksandr Valialkin
643e99a157
app/vmselect/netstorage: improve code readability a bit after 6c84b61893
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4364
2023-07-05 20:48:38 -07:00
Roman Khavronenko
11ac551d52
app/vmselect/netstorage: properly process -search.maxSamplesPerQuery limit (#4472)
Properly return the error to user when `-search.maxSamplesPerQuery` limit is exceeded.
Before, user could have received a partial response instead.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-06-23 13:17:34 +02:00
Haleygo
6c84b61893
vmselect:fix init sn take too much time (#4366)
* vmselect: descrease start time for vmselect

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4364
2023-05-30 13:04:31 +02:00
Aliaksandr Valialkin
aac3dccfd1
lib/fs: replace MkdirAllIfNotExist->MustMkdirIfNotExist and MkdirAllFailIfExist->MustMkdirFailIfExist
Callers of these functions log the returned error and then exit. The returned error already contains the path
to directory, which was failed to be created. So let's just log the error together with the call stack
inside these functions. This leaves the debuggability of the returned error at the same level
while allows simplifying the code at callers' side.

While at it, properly use MustMkdirFailIfExist instead of MustMkdirIfNotExist inside inmemoryPart.MustStoreToDisk().
It is expected that the inmemoryPart.MustStoreToDick() must fail if there is already a directory under the given path.
2023-04-13 22:22:08 -07:00
Nikolay
b38a145cfd
app/vmselect: properly remove temp files at windows system (#4020)
With non-posix compliant systems it's not possible to remove unclosed files.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70
2023-03-27 18:10:44 -07:00
Aliaksandr Valialkin
db3bcbe56a
app/vmselect/netstorage: reduce the contention at fs.ReaderAt stats collection on systems with big number of CPU cores
This optimization is based on the profile provided at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966#issuecomment-1483208419
2023-03-25 16:38:39 -07:00
Aliaksandr Valialkin
a2ecf4fa4a
app/vmselect/netstorage: document why runtime.Gosched() is removed at 28f054bb00
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966
2023-03-25 16:38:28 -07:00
Zakhar Bessarab
16f3b279a2
vmselect/netstorage: remove direct calls to Gosched to reduce amount of locks for global scope
using `runtime.Gosched` requires acquiring global lock to check if there are any other goroutines to perform tasks. with the latest versions of runtime it can pause running goroutines automatically without requiring to call `Gosched` directly.

Updates #3966

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-03-25 16:37:58 -07:00
Aliaksandr Valialkin
08da383eac
app/vmselect/netstorage: reduce the number of calls to runtime.Gosched() at timeseriesWorker() and unpackWorker()
Call runtime.Gosched() only when there is a work to steal from other workers.
Simplify the timeseriesWorker() and unpackWroker() code a bit by inlining stealTimeseriesWork() and stealUnpackWork().

This should reduce CPU usage when processing queries on systems with big number of CPU cores.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966
2023-03-20 20:32:56 -07:00
Aliaksandr Valialkin
18af01c387
app/vmselect: optimize incremental aggregates a bit
Substitute sync.Map with an ordinary slice indexed by workerID.
This should reduce the overhead when updating the incremental aggregate state
2023-03-20 15:42:13 -07:00
Aliaksandr Valialkin
e491fee1f4
app/vmselect/netstorage: do not intern string representation of MetricName for time series received from vmstorage
It has been appeared that this interning may lead to increased memory usage and increased CPU usage
when vmselect performs queries, which select big number of time series.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3692
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3863
2023-03-12 00:44:08 -08:00
Oleksandr Redko
0e1c395609
app,lib: fix typos in comments (#3804) 2023-02-13 09:32:35 -08:00
Zakhar Bessarab
626bd22157
fix: vmselect multi-level setup panic (#3738)
* app/vmselect/netstorage: fix panic for multi-level cluster setup when `replicationFactor` was set and request contained `trace` parameter (#3734)

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* app/vmselect/netstorage: use correct context for retry

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-02-01 08:56:36 -08:00
Aliaksandr Valialkin
26f6cfd3b2
app/vmselect/netstorage: tune the number of blocks per series which should be unpacked by a single goroutine instead of spinning up multiple goroutines
This reduces overhead on time series data unpacking for typical cases,
this reducing CPU usage at vmselect

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3641
2023-01-12 09:35:15 -08:00
Aliaksandr Valialkin
41b0b951f3
app/vmselect/netstorage: unpack series blocks in the current goroutine if their count doesnt exceed 100
This should improve performance a bit for common case
2023-01-12 01:31:38 -08:00
Aliaksandr Valialkin
98931449c1
app/vmselect/netstorage: reduce tail latency during query processing
Previously the selected time series were split evenly among available CPU cores
for further processing - e.g unpacking the data and applying the given rollup
function to the unpacked data.
Some time series could be processed slower than others.
This could result in uneven work distribution among available CPU cores,
e.g. some CPU cores could complete their work sooner than others.
This could slow down query execution.

The new algorithm allows stealing time series to process from other CPU cores
when all the local work is done. This should reduce the maximum time
needed for query execution (aka tail latency).

The new algorithm should also scale better on systems with many CPU cores,
since every CPU processes locally assigned time series without inter-CPU communications.

The inter-CPU communications are used only when all the local work is finished
and the pending work from other CPUs needs to be stealed.
2023-01-10 13:42:26 -08:00
Aliaksandr Valialkin
158a280822
app/vmselect/netstorage: reduce memory allocations when unpacking time series
Unpack time series with less than 4M samples in the currently running goroutine.
Previously a new goroutine was being started for unpacking the samples.
This was requiring additional memory allocations.
2023-01-09 23:17:34 -08:00
Aliaksandr Valialkin
abbac2c27c
app/vmselect/netstorage: pre-allocate 4 block references per each time series during querying
Usually the number of blocks returned per each time series during queries is around 4.
So it is a good idea to pre-allocate 4 block references per time series
in order to reduce the number of memory allocations.
2023-01-09 22:08:30 -08:00
Aliaksandr Valialkin
2483c67579
app/vmselect/netstorage: cache canonical MetricName for time series returned from the storage
This reduces memory allocations for repeated queries, which return (almost) the same set of time series.
2023-01-09 21:56:27 -08:00
Aliaksandr Valialkin
b7a4650ab0
all: use metricsql.CompileRegexp instead of regexp.Compile for compiling regexps used in graphite queries
This should speed up repeated queries, since metricsql.CompileRegexp returns regexps from the cache
on subsequent calls for the same input regexp.
2023-01-09 21:45:34 -08:00
Aliaksandr Valialkin
9f02f5a05a
app/vmselect/netstorage: eliminate memory allocation for sortBlocksHeap arg when calling mergeSortBlocks() 2023-01-09 21:29:01 -08:00
Aliaksandr Valialkin
96f04c9863
app/vmselect/netstorage: consistently select the sample with the biggest value out of samples with identical timestamps
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3333

This fix is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3620 ,
but doesn't slow down the common case with merging replicated data blocks so significantly.

Benchmark results:

Before the change:

BenchmarkMergeSortBlocks/replicationFactor-1-4         	   13968	     85643 ns/op	 956.53 MB/s	    1700 B/op	       1 allocs/op
BenchmarkMergeSortBlocks/replicationFactor-2-4         	   10806	    109171 ns/op	1500.77 MB/s	    2191 B/op	       1 allocs/op
BenchmarkMergeSortBlocks/replicationFactor-3-4         	    8887	    130623 ns/op	1881.45 MB/s	    2660 B/op	       1 allocs/op
BenchmarkMergeSortBlocks/replicationFactor-4-4         	    7440	    157348 ns/op	2082.52 MB/s	    3174 B/op	       1 allocs/op
BenchmarkMergeSortBlocks/replicationFactor-5-4         	    6534	    184473 ns/op	2220.38 MB/s	    3612 B/op	       1 allocs/op
BenchmarkMergeSortBlocks/overlapped-blocks-bestcase-4  	   13419	     85205 ns/op	 961.44 MB/s	    2213 B/op	       1 allocs/op
BenchmarkMergeSortBlocks/overlapped-blocks-worstcase-4 	     579	   1894900 ns/op	  43.23 MB/s	   46760 B/op	       1 allocs/op

After the change:

BenchmarkMergeSortBlocks/replicationFactor-1-4         	   13832	     85298 ns/op	 960.40 MB/s	    1716 B/op	       1 allocs/op
BenchmarkMergeSortBlocks/replicationFactor-2-4         	    8833	    134222 ns/op	1220.66 MB/s	    2675 B/op	       1 allocs/op
BenchmarkMergeSortBlocks/replicationFactor-3-4         	    6487	    184830 ns/op	1329.65 MB/s	    3636 B/op	       1 allocs/op
BenchmarkMergeSortBlocks/replicationFactor-4-4         	    4977	    236318 ns/op	1386.61 MB/s	    4733 B/op	       1 allocs/op
BenchmarkMergeSortBlocks/replicationFactor-5-4         	    4088	    296734 ns/op	1380.36 MB/s	    5761 B/op	       1 allocs/op
BenchmarkMergeSortBlocks/overlapped-blocks-bestcase-4  	   14083	     84067 ns/op	 974.47 MB/s	    2110 B/op	       1 allocs/op
BenchmarkMergeSortBlocks/overlapped-blocks-worstcase-4 	     536	   2043534 ns/op	  40.09 MB/s	   50511 B/op	       1 allocs/op
2023-01-09 12:58:18 -08:00
Roman Khavronenko
909cd04c55
lib/storage: keep sample with the biggest value on timestamp conflict (#3421)
The change leaves raw sample with the biggest value for identical
timestamps per each `-dedup.minScrapeInterval` discrete interval
when the deduplication is enabled.

```
benchstat old.txt new.txt
name                                         old time/op    new time/op    delta
DeduplicateSamples/minScrapeInterval=1s-10      817ns ± 2%     832ns ± 3%      ~     (p=0.052 n=10+10)
DeduplicateSamples/minScrapeInterval=2s-10     1.56µs ± 1%    2.12µs ± 0%   +35.19%  (p=0.000 n=9+7)
DeduplicateSamples/minScrapeInterval=5s-10     1.32µs ± 3%    1.65µs ± 2%   +25.57%  (p=0.000 n=10+10)
DeduplicateSamples/minScrapeInterval=10s-10    1.13µs ± 2%    1.50µs ± 1%   +32.85%  (p=0.000 n=10+10)

name                                         old speed      new speed      delta
DeduplicateSamples/minScrapeInterval=1s-10   10.0GB/s ± 2%   9.9GB/s ± 3%      ~     (p=0.052 n=10+10)
DeduplicateSamples/minScrapeInterval=2s-10   5.24GB/s ± 1%  3.87GB/s ± 0%   -26.03%  (p=0.000 n=9+7)
DeduplicateSamples/minScrapeInterval=5s-10   6.22GB/s ± 3%  4.96GB/s ± 2%   -20.37%  (p=0.000 n=10+10)
DeduplicateSamples/minScrapeInterval=10s-10  7.28GB/s ± 2%  5.48GB/s ± 1%   -24.74%  (p=0.000 n=10+10)
```

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3333
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-08 18:18:36 -08:00