Commit graph

237 commits

Author SHA1 Message Date
Zakhar Bessarab
cd513b9758
Merge tag 'v1.106.0' into pmm-6401-read-prometheus-data-files 2024-11-04 13:56:36 -03:00
Zakhar Bessarab
4e50d6eed3
lib/storage/partition: prevent panic in case resulting in-memory part is empty after merge (#7329)
It is possible for an in-memory part to be empty if the ingested samples are
removed by retention filters. In this case, the data is not discarded due to
retention before the in-memory part is created. After the in-memory parts are
merged, the samples are removed, resulting in a completely empty part at the
destination.

This commit checks the resulting part and skips it if it is empty.

---------
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2024-10-27 20:40:13 +01:00
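
A minimal Go sketch of the guard described in the entry above; the type and field names are hypothetical stand-ins, not the actual lib/storage API:

```go
package storage

// inmemoryPart is a hypothetical stand-in for the real in-memory part type.
type inmemoryPart struct {
	rowsCount uint64
}

// mergeInmemoryParts merges src into dst and reports whether dst must be
// kept. Retention filters may drop every sample, leaving dst empty.
func mergeInmemoryParts(dst *inmemoryPart, src []*inmemoryPart) bool {
	for _, p := range src {
		// The real merge also applies retention filters here, which may
		// remove all the samples from the resulting part.
		dst.rowsCount += p.rowsCount
	}
	// Skip the resulting part if it ended up empty after the merge.
	return dst.rowsCount > 0
}
```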
f41gh7
2557e66ee0
Merge tag 'tags/v1.102.1' into pmm-6401-read-prometheus-data-files-cpc 2024-08-02 11:20:14 +02:00
Aliaksandr Valialkin
8551fbe9f3
Revert "refactor(vmstorage): Refactor the code to reduce the time complexity of MustAddRows and improve readability (#6629)"
This reverts commit e280d90e9a.

Reason for revert: the updated code doesn't improve the performance of table.MustAddRows for the typical case
when rows contain timestamps belonging to ptws[0].

The performance may be improved in theory for the case when all the rows belong to a partition other than ptws[0],
but this partition is automatically moved to ptws[0] by the code at lines
6aad1d43e9/lib/storage/table.go (L287-L298),
so the typical case applies on the next call.

Also, the updated code is harder to follow, since it introduces an additional level of indirection
with non-trivial semantics inside table.MustAddRows: the partition.TimeRangeInPartition() function.
This function needs to be inspected and understood when reading table.MustAddRows().
It depends on the minTsInRows and maxTsInRows vars, which are defined and initialized
many lines above the partition.TimeRangeInPartition() call. This complicates reading and understanding
the code even more.

The previous code used a clearer loop over rows with an explicit call to partition.HasTimestamp()
for every timestamp in the row. The partition.HasTimestamp() call is used in the table.MustAddRows()
function multiple times, so relying on it alone is more consistent,
easier to understand and easier to maintain than a mix of partition.HasTimestamp() and partition.TimeRangeInPartition()
calls.

Also, there is no need to document this kind of internal refactoring in docs/CHANGELOG.md,
since docs/CHANGELOG.md is intended for VictoriaMetrics users, who may not be familiar with software engineering internals.
docs/CHANGELOG.md must document user-visible changes, and the docs must be concise and clear for VictoriaMetrics users.
See https://docs.victoriametrics.com/contributing/#pull-request-checklist for more details.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6629
2024-07-25 14:32:09 +02:00
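
A hypothetical Go sketch of the row-routing pattern the revert restores, an explicit per-row partition.HasTimestamp() check; the types are simplified stand-ins for the real lib/storage ones:

```go
package storage

// partition is a hypothetical stand-in; the real logic lives in
// lib/storage/table.MustAddRows.
type partition struct {
	minTimestamp, maxTimestamp int64
}

// HasTimestamp reports whether ts falls into the partition's time range.
func (pt *partition) HasTimestamp(ts int64) bool {
	return ts >= pt.minTimestamp && ts <= pt.maxTimestamp
}

type rawRow struct {
	Timestamp int64
}

// selectRows keeps the explicit per-row HasTimestamp() check instead of
// precomputing minTsInRows/maxTsInRows far away from their use.
func selectRows(pt *partition, rows []rawRow) []rawRow {
	var matched []rawRow
	for _, r := range rows {
		if pt.HasTimestamp(r.Timestamp) {
			matched = append(matched, r)
		}
	}
	return matched
}
```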
Ruixiang Tan
e280d90e9a
refactor(vmstorage): Refactor the code to reduce the time complexity of MustAddRows and improve readability (#6629)
### Describe Your Changes
The original logic is not only highly complex but also poorly readable,
so it was modified to improve readability and reduce time complexity.


---------

Co-authored-by: Zhu Jiekun <jiekun@victoriametrics.com>
2024-07-25 08:55:12 +02:00
Aliaksandr Valialkin
c995ccad93
lib/{storage,mergeset}: do not allow setting dataFlushInterval to values smaller than pending{Items,Rows}FlushInterval
Pending rows and items unconditionally remain in memory for up to pending{Items,Rows}FlushInterval,
so there is no sense in setting dataFlushInterval (the interval for the guaranteed flush of in-memory data to disk)
to values smaller than pending{Items,Rows}FlushInterval, since this doesn't affect the interval
for flushing pending rows and items from memory to disk.

This is a follow-up for 4c80b17027

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6221
2024-07-15 10:08:15 +02:00
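
A small sketch of the validation described in the entry above, assuming a hypothetical helper name:

```go
package storage

import "time"

// pendingRowsFlushInterval mirrors the constant mentioned above.
const pendingRowsFlushInterval = 2 * time.Second

// normalizeDataFlushInterval raises d to the minimum useful value: flushing
// in-memory data to disk more often than pending rows leave the memory
// buffer has no effect.
func normalizeDataFlushInterval(d time.Duration) time.Duration {
	if d < pendingRowsFlushInterval {
		return pendingRowsFlushInterval
	}
	return d
}
```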
Hui Wang
4c80b17027
storage: correctly apply -inmemoryDataFlushInterval when it's set to minimum supported value 1s (#6221)
pendingRowsFlushInterval was bumped to 2s in
73f0a805e2
2024-05-13 16:44:30 +02:00
Aliaksandr Valialkin
f20d452196
lib/storage: remove outdated misleading comments 2024-05-12 10:24:04 +02:00
hagen1778
381d4494e9 v1.101.0

Merge tag 'v1.101.0' into pmm-6401-read-prometheus-data-files

v1.101.0

Signed-off-by: hagen1778 <roman@victoriametrics.com>

# gpg: Signature made Thu 25 Apr 16:52:11 2024 CEST
# gpg:                using RSA key 9212FA37DBE64938E0D154953BF75F3741CA9640
# gpg: Good signature from "hagen1778 (VM GPG key) <roman@victoriametrics.com>" [ultimate]

# Conflicts:
#	go.mod
2024-04-26 13:30:14 +02:00
Aliaksandr Valialkin
85d09e5a2d
lib/{mergeset,storage}: log deleting directories inside partitions if they are missing in parts.json
This should improve debuggability of unexpected deletion of directories inside partitions.

While at it, log the proper path to parts.json when the directory for a big part is missing in the partition.
parts.json is located inside the directory with small parts; there is no parts.json file inside the directory with big parts.
2024-04-16 19:11:32 +02:00
Aliaksandr Valialkin
b7b731d340
Merge branch 'public-single-node' into pmm-6401-read-prometheus-data-files 2024-04-04 03:49:49 +03:00
Aliaksandr Valialkin
c3a72b6cdb
lib/storage: consistently use stopCh instead of stop 2024-04-02 21:24:57 +03:00
Aliaksandr Valialkin
4001ca36b8
lib/storage/partition.go: reduce code difference a bit with enterprise branch 2024-03-30 01:39:27 +02:00
Nikolay
a05303eaa0
lib/storage: adds metrics for downsampling (#382)
* lib/storage: adds metrics for downsampling
vm_downsampling_partitions_scheduled - shows the number of parts that must be downsampled
vm_downsampling_partitions_scheduled_size_bytes - shows the total size in bytes of the parts that must be downsampled

These two metrics answer the questions: is downsampling running? How many parts are scheduled for downsampling, how many of them are currently downsampled, and how much storage space do they occupy?

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2612

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-03-30 01:11:49 +02:00
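
A sketch of how such gauges can be registered with the github.com/VictoriaMetrics/metrics package; the counter variables and the scheduler wiring that updates them are hypothetical:

```go
package storage

import (
	"sync/atomic"

	"github.com/VictoriaMetrics/metrics"
)

// Hypothetical counters updated by the downsampling scheduler.
var (
	downsamplingPartsScheduled     atomic.Int64
	downsamplingPartsScheduledSize atomic.Int64
)

// Register the two gauges from the entry above; metrics.NewGauge calls the
// callback on every scrape, so the values stay current.
var (
	_ = metrics.NewGauge("vm_downsampling_partitions_scheduled", func() float64 {
		return float64(downsamplingPartsScheduled.Load())
	})
	_ = metrics.NewGauge("vm_downsampling_partitions_scheduled_size_bytes", func() float64 {
		return float64(downsamplingPartsScheduledSize.Load())
	})
)
```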
Zakhar Bessarab
51f5ac1929
lib/storage/table: wait for merges to be completed when closing a table (#5965)
* lib/storage/table: properly wait for force merges to be completed during shutdown

Properly keep track of running background merges and wait for their completion when closing the table.
Previously, force merge was not in sync with the overall storage shutdown, which could lead to holding a ptw reference.

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* docs: add changelog entry

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2024-03-26 13:49:09 +01:00
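
A minimal sketch of the shutdown synchronization described above, with hypothetical field and method names:

```go
package storage

import "sync"

type table struct {
	mergeWG sync.WaitGroup
}

// startForceMerge registers a background force merge with the table so
// that MustClose can wait for it.
func (tb *table) startForceMerge(merge func()) {
	tb.mergeWG.Add(1)
	go func() {
		defer tb.mergeWG.Done()
		merge()
	}()
}

// MustClose blocks until all registered merges complete, so no merge holds
// a partition wrapper (ptw) reference after the table is closed.
func (tb *table) MustClose() {
	tb.mergeWG.Wait()
}
```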
Aliaksandr Valialkin
5c2f85f38d
vendor: run make vendor-update 2024-03-01 02:38:41 +02:00
Aliaksandr Valialkin
b3d9d36fb3
lib/storage: consistently use atomic.* types instead of atomic.* function calls on ordinary types
See ea9e2b19a5
2024-02-24 00:15:26 +02:00
Aliaksandr Valialkin
f81b480905
lib/mergeset: consistently use atomic.* types instead of atomic.* function calls on ordinary types
See ea9e2b19a5
2024-02-23 23:29:35 +02:00
Aliaksandr Valialkin
a204fd69f1
lib/storage: consistently use atomic.* type for refCount and mustDrop fields in indexDB, table and partition structs
See ea9e2b19a5
2024-02-23 22:54:59 +02:00
Aliaksandr Valialkin
ea9e2b19a5
lib/{storage,mergeset}: properly fix 'unaligned 64-bit atomic operation' panic on 32-bit architectures
The issue was introduced in bace9a2501.
The improper fix was in d4c0615dcd, since it fixed the issue only by accident:
the Go compiler happened to align the rawRowsShards field
on a 4-byte boundary inside the partition struct.

The proper fix is to use an atomic.Int64 field - this guarantees that access to this field
won't result in an unaligned 64-bit atomic operation. See https://github.com/golang/go/issues/50860
and https://github.com/golang/go/issues/19057
2024-02-23 22:27:06 +02:00
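
A hypothetical illustration of the fix: a plain int64 field updated via atomic.* functions is replaced with an atomic.Int64 field, whose alignment is guaranteed by the compiler:

```go
package storage

import "sync/atomic"

// On 32-bit architectures, atomic operations on a plain int64 field panic
// unless the field is 8-byte aligned, and that alignment silently depends
// on the surrounding fields. atomic.Int64 (Go 1.19+) is always aligned
// correctly by the compiler.
type counterHolder struct {
	flags uint32 // may shift the alignment of the next field on 32-bit CPUs

	// Before: n int64, updated via atomic.AddInt64(&h.n, 1) - could panic.
	n atomic.Int64 // after: guaranteed alignment, no panic possible
}

func (h *counterHolder) inc() {
	h.n.Add(1)
}
```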
hagen1778
c8d1d2ab72
lib/storage: cleanup after d4c0615dcd
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-02-23 18:53:55 +01:00
Dmytro Kozlov
d4c0615dcd
lib/storage: fix aligning (#5860) 2024-02-23 16:37:21 +01:00
Aliaksandr Valialkin
bace9a2501
lib/{mergeset,storage}: convert buffered items to searchable parts more optimally
Do not convert shard items to a part when a shard becomes full. Instead, collect multiple
full shards and then convert them to a searchable part at once. This reduces
the number of searchable parts, which, in turn, should increase query performance,
since queries need to scan a smaller number of parts.
2024-02-23 00:16:34 +02:00
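
A simplified sketch of the batching idea from the entry above, with hypothetical types: full shards are accumulated and converted into a single searchable part at once:

```go
package storage

type shard struct{ rows []int64 } // a full shard of buffered rows
type part struct{ rows []int64 }  // a searchable in-memory part

type flusher struct {
	full []shard
}

// addFullShard collects full shards and converts several of them into one
// searchable part at once, instead of one part per shard. Fewer parts
// means fewer blocks for queries to scan.
func (f *flusher) addFullShard(s shard, maxBufferedRows int) *part {
	f.full = append(f.full, s)
	n := 0
	for _, fs := range f.full {
		n += len(fs.rows)
	}
	if n < maxBufferedRows {
		return nil // keep collecting full shards
	}
	p := &part{rows: make([]int64, 0, n)}
	for _, fs := range f.full {
		p.rows = append(p.rows, fs.rows...)
	}
	f.full = f.full[:0]
	return p
}
```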
Aliaksandr Valialkin
e8b3045062
lib/storage: handle the common case when the number of rows passed to flushRowsToInmemoryParts() doesn't exceed maxRawRowsPerShard 2024-02-22 20:44:11 +02:00
Aliaksandr Valialkin
73f0a805e2
lib/{storage,mergeset}: convert buffered items into searchable in-memory parts exactly once per the given flush interval
Previously the interval between item addition and its conversion to a searchable in-memory part
could vary significantly because of the too coarse per-second precision. Switch from fasttime.UnixTimestamp()
to time.Now().UnixMilli() for millisecond precision. It is OK to use time.Now() for tracking
the time when buffered items must be converted to searchable in-memory parts, since time.Now()
calls aren't located in hot paths.

Increase the flush interval for converting buffered samples to searchable in-memory parts
from one second to two seconds. This should reduce the number of blocks that need
to be processed during high-frequency alerting queries. This, in turn, should reduce CPU usage.

While at it, hardcode the maximum size of a rawRows shard to 8MB, since this size gives the optimal
data ingestion performance according to load tests. This reduces memory usage and CPU usage on systems
with big amounts of RAM under a high data ingestion rate.
2024-02-22 20:21:14 +02:00
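
A sketch of the millisecond-precision flush check described above, with hypothetical names:

```go
package storage

import "time"

const pendingRowsFlushInterval = 2 * time.Second // bumped from 1s to 2s

type rawRowsShard struct {
	lastFlushTimeMs int64
}

// needsFlush reports whether the buffered rows must be converted to a
// searchable in-memory part. time.Now() is acceptable here because flush
// checks are not on a hot path.
func (s *rawRowsShard) needsFlush() bool {
	return time.Now().UnixMilli()-s.lastFlushTimeMs >= pendingRowsFlushInterval.Milliseconds()
}
```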
Aliaksandr Valialkin
aec9cd4316
lib/storage: do not pool rawRowsBlock when flushing rawRows to in-memory blocks
The pooled rawRowsBlock objects occupy big amounts of memory between flushes,
and the flushes are relatively rare. So it is better not to use the pool
and to allocate rawRow blocks on demand. This should reduce the average
memory usage between flushes.
2024-02-22 17:37:48 +02:00
Aliaksandr Valialkin
b7dfe9894c
lib/storage: do not keep rawRows buffer across flush() calls
The buffer can be quite big under a high ingestion rate (e.g. more than 100MB).
This leads to increased memory usage between buffer flushes.
So it is better to re-create the buffer on every flush in order to reduce memory usage
between buffer flushes.
2024-02-22 17:22:26 +02:00
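
A short sketch covering the idea behind the two entries above: allocate the rows buffer on demand for every flush instead of pooling or retaining it; all names are hypothetical:

```go
package storage

type rawRow struct {
	Timestamp int64
	Value     float64
}

// newRawRowsBuffer allocates a fresh buffer per flush. Flushes are rare,
// so a pooled or retained buffer (possibly >100MB) would stay resident
// between flushes; letting the GC reclaim it lowers the average memory
// usage instead.
func newRawRowsBuffer(capacity int) []rawRow {
	return make([]rawRow, 0, capacity)
}
```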
Aliaksandr Valialkin
0b503fba0b
Merge branch 'public-single-node' into pmm-6401-read-prometheus-data-files 2024-01-30 22:55:54 +02:00
Aliaksandr Valialkin
bb7a419cc3
lib/{mergeset,storage}: make background merge more responsive and scalable
- Maintain a separate worker pool per each part type (in-memory, file, big and small).
  Previously a shared pool was used for merging all the part types.
  A single merge worker could merge parts with mixed types at once. For example,
  it could merge simultaneously an in-memory part plus a big file part.
  Such a merge could take hours for big file part. During the duration of this merge
  the in-memory part was pinned in memory and couldn't be persisted to disk
  under the configured -inmemoryDataFlushInterval .

  Another common issue, which could happen when parts with mixed types are merged,
  is uncontrolled growth of in-memory parts or small parts when all the merge workers
  were busy with merging big files. Such growth could lead to significant performance
  degradation for queries, since every query needs to check an ever-growing list of parts.
  This could also slow down the registration of new time series, since VictoriaMetrics
  searches for the internal series_id in the indexdb for every new time series.

  The third issue is graceful shutdown duration, which could be very long when a background
  merge is running on in-memory parts plus big file parts. This merge couldn't be interrupted,
  since it merges in-memory parts.

  A separate pool of merge workers per part type (see the sketch after this entry) elegantly resolves all of these issues:
  - In-memory parts are merged to file-based parts in a timely manner, since the maximum
    size of in-memory parts is limited.
  - Long-running merges for big parts do not block merges for in-memory parts and small parts.
  - Graceful shutdown duration is now limited by the time needed for flushing in-memory parts to files.
    Merging for file parts is instantly canceled on graceful shutdown now.

- Deprecate -smallMergeConcurrency command-line flag, since the new background merge algorithm
  should automatically self-tune according to the number of available CPU cores.

- Deprecate -finalMergeDelay command-line flag, since it wasn't working correctly.
  It is better to run forced merge when needed - https://docs.victoriametrics.com/#forced-merge

- Tune the number of shards for pending rows and items before the data goes to in-memory parts
  and becomes visible for search. This improves the maximum data ingestion rate and the maximum rate
  for registration of new time series. This should reduce the duration of data ingestion slowdown
  in VictoriaMetrics cluster on e.g. re-routing events, when some of vmstorage nodes become temporarily
  unavailable.

- Prevent from possible "sync: WaitGroup misuse" panic on graceful shutdown.

This is a follow-up for fa566c68a6 .
Thanks to @misutoth for the inspiration at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5212

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5190
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3790
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3551
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3647
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3641
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
2024-01-26 22:27:47 +01:00
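
A simplified sketch of per-part-type merge worker pools, as referenced in the entry above; all types and names are hypothetical:

```go
package storage

import "sync"

type partType int

const (
	partInmemory partType = iota
	partSmall
	partBig
)

type mergeScheduler struct {
	queues map[partType]chan func()
	wg     sync.WaitGroup
}

// newMergeScheduler starts a dedicated worker pool per part type, so a
// slow merge of big file parts can no longer starve in-memory part merges.
func newMergeScheduler(workersPerType int) *mergeScheduler {
	ms := &mergeScheduler{queues: make(map[partType]chan func())}
	for _, pt := range []partType{partInmemory, partSmall, partBig} {
		q := make(chan func())
		ms.queues[pt] = q
		for i := 0; i < workersPerType; i++ {
			ms.wg.Add(1)
			go func() {
				defer ms.wg.Done()
				for merge := range q {
					merge()
				}
			}()
		}
	}
	return ms
}

// schedule enqueues a merge for the given part type; only merges of the
// same type compete for these workers.
func (ms *mergeScheduler) schedule(pt partType, merge func()) {
	ms.queues[pt] <- merge
}

// mustStop closes the queues and waits for all workers to finish.
func (ms *mergeScheduler) mustStop() {
	for _, q := range ms.queues {
		close(q)
	}
	ms.wg.Wait()
}
```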
Aliaksandr Valialkin
c3a585cfe5
lib/storage: rename *AssistedMerges to *AssistedMergesCount in order to make these field names less misleading
These fields are counters, not gauges, so adding the Count suffix to them makes it easier to understand this while reading the code
2024-01-25 10:19:32 +02:00
Aliaksandr Valialkin
ae643ef1f1
lib/{storage,mergeset}: reduce the maximum compression level for the stored data
This reduces CPU usage a bit, while not increasing the resulting file sizes according to synthetic tests.
2024-01-23 17:46:50 +02:00
Aliaksandr Valialkin
3449d563bd
all: add up to 10% random jitter to the interval between periodic tasks performed by various components
This should smooth CPU and RAM usage spikes related to these periodic tasks,
by reducing the probability that multiple periodic tasks are performed at the same time.
2024-01-22 18:40:32 +02:00
Aliaksandr Valialkin
90768aa418
lib/storage/partition.go: remove misleading comment, which falsely states that inmemoryParts isn't visible to search
Thanks to @satjd for drawing attention to this comment at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5410
2024-01-21 04:49:35 +02:00
Aliaksandr Valialkin
f6c91b49a2
Merge branch 'public-single-node' into HEAD 2023-12-13 01:17:25 +02:00
hagen1778
e96b4410a1
lib/storage: fix typo
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-11-21 10:52:53 +01:00
Aliaksandr Valialkin
31a3672982
Merge branch 'public-single-node' into pmm-6401-read-prometheus-data-files 2023-10-02 22:36:23 +02:00
Aliaksandr Valialkin
d41841c0c9
lib/{mergeset,storage}: consistently reset isInMerge field in parts passed to mergeParts() before returning from the function
While at it, consistently check that the isInMerge field is set in all the parts passed to mergeParts()
2023-10-02 08:05:29 +02:00
Aliaksandr Valialkin
3ca6fea858
lib/{mergeset,storage}: perform at most one assisted merge per each call to addRows/addItems
This should reduce tail latency during data ingestion.

This shouldn't slow down data ingestion in the worst case, since assisted merges are spread among
distinct addRows/addItems calls after this change.
2023-10-01 22:19:46 +02:00
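
A minimal sketch of the "at most one assisted merge per call" policy described above, with hypothetical types:

```go
package storage

// shardedBuffer is a simplified stand-in. Previously an addRows call could
// run assisted merges in a loop until the backlog shrank, stalling a single
// unlucky caller; now each call performs at most one assisted merge.
type shardedBuffer struct {
	pending int // buffered rows not yet converted to searchable parts
	limit   int // backlog size that triggers an assisted merge
}

func (b *shardedBuffer) addRows(n int) {
	b.pending += n
	if b.pending > b.limit {
		b.assistedMerge() // at most one assisted merge per addRows call
	}
}

// assistedMerge converts a portion of the pending rows into a searchable
// part (elided here), shrinking the backlog.
func (b *shardedBuffer) assistedMerge() {
	b.pending /= 2
}
```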
Aliaksandr Valialkin
223ef96198
lib/storage: remove unused atomicSetBool function after 717c53af27 2023-09-25 17:37:24 +02:00
Aliaksandr Valialkin
15dfd94f3b
lib/storage: make it clear that the number of big merge workers always equals 4
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4915#issuecomment-1733922830
2023-09-25 17:15:45 +02:00
Aliaksandr Valialkin
717c53af27
lib/storage: stop exposing vm_merge_need_free_disk_space metric
This metric confuses users and provides no useful information.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/686#issuecomment-1733844128
2023-09-25 16:52:39 +02:00
Aliaksandr Valialkin
1590ddecba
Merge branch 'public-single-node' into pmm-6401-read-prometheus-data-files 2023-09-22 12:33:27 +02:00
Zakhar Bessarab
bea3431ed1
lib/storage/partition: add check to ensure parts exist on disk (#5017)
* lib/storage/partition: add check to ensure parts exist on disk

If a part exists in parts.json but is missing on disk, there will be a misleading error similar to "unexpected number of substrings in the part name".

This change forces verification of part existence and reports a correct error in case a part is missing on disk.

Such an issue can be a result of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5005 or of disk corruption.

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/storage/partition: use filepath.Join instead of string concatenation

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/storage/partition: add action points for error message

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* all: add a check for missing part in lib/mergeset and lib/logstorage

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-09-19 11:17:41 +02:00
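
A sketch of the existence check described above; the function name and error wording are hypothetical:

```go
package storage

import (
	"fmt"
	"os"
	"path/filepath"
)

// mustCheckPartsExist verifies that every part listed in parts.json is
// present on disk, turning a confusing downstream error into a clear
// "missing part" one with action points.
func mustCheckPartsExist(partitionPath string, partNames []string) error {
	for _, name := range partNames {
		partPath := filepath.Join(partitionPath, name)
		if _, err := os.Stat(partPath); os.IsNotExist(err) {
			return fmt.Errorf("part %q is listed in parts.json but is missing on disk; "+
				"this can be caused by incomplete shutdown or disk corruption; "+
				"see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5005", partPath)
		}
	}
	return nil
}
```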
Aliaksandr Valialkin
b80ebb8bfd
Merge branch 'public-single-node' into pmm-6401-read-prometheus-data-files 2023-09-19 01:15:00 +02:00
faceair
b6ad581b45
lib/storage: remove ForceMergeAllParts internal loop (#4999)
Signed-off-by: faceair <git@faceair.me>
2023-09-15 19:04:54 +02:00
Aliaksandr Valialkin
af85055f3a
Merge branch 'public-single-node' into pmm-6401-read-prometheus-data-files 2023-09-09 06:18:18 +02:00
Aliaksandr Valialkin
edee262ecc
Makefile: update golangci-lint from v1.51.2 to v1.54.2
See https://github.com/golangci/golangci-lint/releases/tag/v1.54.2
2023-09-01 10:16:42 +02:00
Aliaksandr Valialkin
a2e224593e
Merge branch 'public-single-node' into pmm-6401-read-prometheus-data-files 2023-07-06 23:51:50 -07:00
Aliaksandr Valialkin
152ca00fb8
docs/CHANGELOG.md: clarify description for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4336 bugfix
This is a follow-up for 5eb5df96e2
2023-07-06 17:09:03 -07:00
Nikolay
5eb5df96e2
lib/storage: create parts.json on start-up if it does not exist (#4450)
* lib/storage: create parts.json on start-up if it does not exist.
It fixes migrations from versions below v1.90.0.
Previously parts.json was created only after a successful merge.
But if a merge was interrupted for some reason (OOM or shutdown), parts.json wasn't created,
and the partitions left after the interrupted merge weren't properly deleted,
since VM cannot check whether they must be removed or not.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4336

* Apply suggestions from code review

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

* Update lib/storage/partition.go

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

---------

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2023-06-15 11:19:22 +02:00
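
A sketch of the start-up fix described above, with hypothetical names: when parts.json is missing (e.g. after an upgrade from a pre-v1.90.0 version or an interrupted merge), it is created from the parts discovered on disk:

```go
package storage

import (
	"encoding/json"
	"errors"
	"os"
	"path/filepath"
)

// mustCreatePartsJSONIfMissing writes parts.json from the parts found on
// disk when the file is absent, so directories left by an interrupted
// merge can later be identified and deleted safely.
func mustCreatePartsJSONIfMissing(partitionPath string, partNames []string) error {
	path := filepath.Join(partitionPath, "parts.json")
	if _, err := os.Stat(path); !errors.Is(err, os.ErrNotExist) {
		return err // already exists (err == nil) or a real stat error
	}
	data, err := json.Marshal(partNames)
	if err != nil {
		return err
	}
	return os.WriteFile(path, data, 0o644)
}
```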