Roman Khavronenko
5f33445f66
lib/storage: limit max mergeConcurrency value for systems with high number of CPUs ( #2673 )
...
Workers count for merges affects the max part size during merges. Such behaviour
protects storage from running out of disk space for scenario when all workers
are merging parts with the max size.
This works very well for most cases. But for systems where high number of CPUs
is allocated for vmstorage components this could significantly impact the max
part size and result in more unmerged parts than expected.
While checking multiple production highly loaded setups it was discovered that
`max_over_time(vm_active_merges{type="storage/big}[1h]}"` rarely exceeds 2,
and `max_over_time(vm_active_merges{type="storage/small}[1h]}"` rarely exceeds 4.
The change in this commit limits the max value for concurrency accordingly.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-06-07 15:02:55 +03:00
Aliaksandr Valialkin
fedfc9e686
lib/storage: stop background merge when storage enters read-only mode
...
This should prevent from `no space left on device` errors when VictoriaMetrics
under-estimates the additional disk space needed for background merge.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2603
2022-06-01 14:22:12 +03:00
Aliaksandr Valialkin
cb319b15bb
lib/storage: increase the number of rawRowsShard shards on systems with more than 4 CPU cores
...
This should improve data ingestion scalability on systems with many CPU cores
2022-04-06 19:50:41 +03:00
Aliaksandr Valialkin
123a88bb65
lib/storage: reuse sync.WaitGroup objects
...
This reduces GC load by up to 10% according to memory profiling
2022-04-06 14:00:50 +03:00
Aliaksandr Valialkin
b47f18f555
lib/{mergeset,storage}: tune compression levels for small blocks
...
This should reduce CPU usage spent on compression
2022-02-25 15:34:13 +02:00
Aliaksandr Valialkin
6ae584b9b3
lib/{mergeset,storage}: properly limit cache sizes for indexdb
...
Previously these caches could exceed limits set via `-memory.allowedPercent` and/or `-memory.allowedBytes`,
since limits were set independently per each data part. If the number of data parts was big, then limits could be exceeded,
which could result to out of memory errors.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-01-20 18:45:03 +02:00
Aliaksandr Valialkin
727797a6fd
all: use logger.WithThrottler() where appropriate
2021-12-21 17:10:54 +02:00
Aliaksandr Valialkin
f22aab411b
lib/storage: properly update per-part min_dedup_interval
file contents after merge
...
Previously 0s was always written even if -dedup.minScrapeInterval was set to non-zero value
This is a follow-up for 4ff647137a
2021-12-17 20:12:18 +02:00
Aliaksandr Valialkin
d36fdbe537
lib/storage: deduplicate samples more thoroughly
...
Previously some duplicate samples may be left on disk for time series with high churn rate.
This may result in higher disk space usage.
2021-12-15 16:00:30 +02:00
Aliaksandr Valialkin
ab4be24397
app/vmstorage: export vm_cache_size_max_bytes metrics for determining capacity of various caches
...
The vm_cache_size_max_bytes metric can be used for determining caches which reach their capacity via the following query:
vm_cache_size_bytes / vm_cache_size_max_bytes > 0.9
2021-12-02 10:30:01 +02:00
Aliaksandr Valialkin
2e43cd9d62
lib/storage: do not take into account -storage.minFreeDiskSpaceBytes during background merges
2021-12-01 12:30:03 +02:00
Aliaksandr Valialkin
71c0f7cce3
lib/storage: take into account -storage.minFreeDiskSpaceBytes
when performing big merges
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269
2021-11-30 12:56:53 +02:00
Aliaksandr Valialkin
b885bd9b7d
lib/{mergeset,storage}: improve the detection of the needed free space for background merge
...
This should prevent from possible out of disk space crashes during big merges.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1560
2021-08-25 10:01:09 +03:00
Aliaksandr Valialkin
1950f57316
lib/storage: yet another attempt to properly determine disk space shortage, which prevents from optimal merges
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1373
2021-07-27 12:03:31 +03:00
Aliaksandr Valialkin
8055439fe4
lib/storage: properly detect free disk space shortage during data merge
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1373
2021-07-02 17:42:23 +03:00
Aliaksandr Valialkin
a207be3ffb
lib/storage: fix infinite loop introduced in aa9b56a046
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
2021-06-17 14:27:30 +03:00
Aliaksandr Valialkin
0efd37cec1
lib/{mergeset,storage}: reduce the number of fsync calls on data ingestion path on systems with many cpu cores
...
VictoriaMetrics maintains a buffer per CPU core for the ingested data. These buffers are flushed to disk every second.
These buffers are flushed to disk in parallel starting from the commit 56b6b893ce
.
This resulted in increased write disk IO usage on systems with many cpu cores
as described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338#issuecomment-863046999 .
This commit merges the per-CPU buffers into bigger in-memory buffers before flushing them to disk.
This should reduce the rate of fsync syscalls and, consequently, the write disk IO on systems with many CPU cores.
This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
See also https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
2021-06-17 13:51:42 +03:00
Aliaksandr Valialkin
eb335d2c29
lib/storage: consistency renaming: getMaxRawRowsPerPartition -> getMaxRawRowsPerShard
2021-06-11 10:52:31 +03:00
Aliaksandr Valialkin
d06c0e7a94
lib/storage: reduce the amounts of memory which can be occupied by rawRow items during data ingestion on a system with many CPU cores
2021-06-11 10:49:02 +03:00
Aliaksandr Valialkin
95b735a883
lib/storage: allow filling all the rows up to their capacity in rawRowsShard.addRows
...
This should reduce memory usage a bit on data ingestion path
2021-05-24 15:32:24 +03:00
Aliaksandr Valialkin
0fc857d363
lib/{mergeset,storage}: reduce the number of IFNO log messages like merged ... items across ... blocks in ... seconds
...
Log these messages if the merge takes more than 30 seconds instead of 10 seconds.
2021-05-23 14:15:49 +03:00
Nikolay
62d58324dd
adds stalePartsRemover ( #1261 )
...
for new created partitions
2021-05-03 11:34:33 +03:00
Aliaksandr Valialkin
e37e1b1e34
lib/{storage,mergeset}: fix unaligned 64-bit atomic operation
panic for 32-bit architectures
...
The panic has been introduced in 56b6b893ce
2021-04-27 16:42:19 +03:00
Aliaksandr Valialkin
2d1d60118d
lib/mergeset: split rows ingestion among multiple shards
...
This improves rows ingestion on systems with many CPU cores by reducing lock contention.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
Thanks to @waldoweng for the original idea and draft implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1243
2021-04-27 15:45:11 +03:00
Aliaksandr Valialkin
cba2d13456
lib/storage: typo fix in info message when deleting the part outside the configured retention
...
Previously the message was displaying incorrect retention time
2021-04-27 13:33:36 +03:00
Aliaksandr Valialkin
ab8008d6d7
lib/{storage,mergeset}: remove empty directories on startup. Such directories can be left after unclean shutdown on NFS storage
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1142
2021-04-22 13:03:29 +03:00
Aliaksandr Valialkin
b8a5ee2e93
lib/{mergeset,storage}: allow merging smaller number of small parts
...
While this may increase CPU and disk IO usage needed for background merge,
this also recudes CPU usage during queries in production. This is because
such queries tend to read recently added data and it is better to have lower number
of parts for such data in order to reduce CPU usage.
This partially reverts ebf8da3730
2021-02-21 21:43:37 +02:00
Aliaksandr Valialkin
e8ee9fa7fe
app/vmstorage: export missing vm_cache_size_bytes
metrics for indexdb and data caches
2021-02-09 00:49:58 +02:00
Aliaksandr Valialkin
d5a2b120e9
app/vmstorage: disable final merge by default, since it may result in high disk IO and CPU usage without measurable benefits such as increased query performance and reduced disk space usage
2021-01-08 00:12:12 +02:00
Aliaksandr Valialkin
ca8919e8e1
lib/storage: wait for pending transactions before closing and dropping the partition
...
This deflakes `make test-full-386` test
2020-12-25 11:46:47 +02:00
Aliaksandr Valialkin
c0511144e3
lib/storage: physically remove stale parts
...
Previously they were removed from partition struct, but the corresponding directories weren't removed.
This is a follow-up for 46dba00756
2020-12-24 16:56:09 +02:00
Aliaksandr Valialkin
66f8fbbb32
lib/storage: do not remove parts outside the configured retention if they are currently merged
...
These parts are automatically removed after the merge is complete.
2020-12-24 09:02:12 +02:00
Aliaksandr Valialkin
fa3bcf220f
lib/storage: remove stale parts as soon as they go outside the configured retention
...
Previously such parts could remain undeleted for long durations until they are merged with other parts.
This should help for `-retentionPeriod` values smaller than one month.
2020-12-22 19:55:07 +02:00
Aliaksandr Valialkin
6859737329
lib/storage: properly determine max rows for output part when merging small parts
2020-12-18 23:26:28 +02:00
Aliaksandr Valialkin
edbe35509e
lib/{storage,mergeset}: tune background merge process in order to reduce CPU usage and disk IO usage
2020-12-18 20:01:20 +02:00
Aliaksandr Valialkin
1a237c6903
all: properly handle CPU limits set on the host system/container
...
This can reduce memory usage on systems with enabled CPU limits.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/946
2020-12-08 21:07:03 +02:00
Aliaksandr Valialkin
c0bd208c77
lib/storage: do not report about the need of free disk space if parts cannot be merged due to too big write amplification
2020-11-03 15:32:09 +02:00
Aliaksandr Valialkin
0db7c2b500
app/vmstorage: support for -retentionPeriod
smaller than one month
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/173
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/17
2020-10-20 14:42:46 +03:00
Aliaksandr Valialkin
b51fa16177
app/vmstorage: add -finalMergeDelay
command-line flag for configuring the delay before final merge for per-month partitions after no new data is ingested to it
2020-10-07 17:42:31 +03:00
Aliaksandr Valialkin
7c2e4e267a
lib/storage: allow set values higher than 1 for vm_merge_need_free_disk_space
if there are multiple partitions with deferred merges due to disk space shortage
2020-09-29 22:53:34 +03:00
Aliaksandr Valialkin
097a4c10dd
app/vmstorage: add metrics for determining whether background merges need additional disk space to complete
...
These metrics are:
* vm_small_merge_need_free_disk_space
* vm_big_merge_need_free_disk_space
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/686
2020-09-29 21:47:47 +03:00
Aliaksandr Valialkin
a9321f6a60
lib/storage: reduce CPU load for idle VictoriaMetrics by reducing the frequency for the need for background merges
2020-09-21 15:51:26 +03:00
Aliaksandr Valialkin
d96858b921
lib/storage: add /internal/force_merge
handler for running forced compactions on historical per-month partitions
...
This may be useful for freeing up storage space after time series deletion.
See https://victoriametrics.github.io/#force-merge for more details.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/686
2020-09-17 12:20:56 +03:00
Aliaksandr Valialkin
3abbb38254
lib/{mergeset,storage}: compare errors with errors.Is()
2020-09-17 03:03:10 +03:00
Aliaksandr Valialkin
ddb3519e17
lib/{mergeset,storage}: code prettifying
2020-09-17 02:06:37 +03:00
Aliaksandr Valialkin
bf826dd828
lib/storage: removed duplicate checks for empty parts during merge - another check is in the beginning of mergeParts functions
2020-09-17 01:49:08 +03:00
Aliaksandr Valialkin
3149af624d
lib/storage: reduce the maximum number of concurrent merge workers to GOMAXPROCS/2
...
Previously the limit has been raised to GOMAXPROCS, but it has been appeared that this
increases query latencies since more CPUs are busy with merges.
While at it, substitute `*MergeConcurrencyLimitCh` channels with simple integer limits.
2020-07-31 17:53:13 +03:00
Aliaksandr Valialkin
29bbab0ec9
lib/storage: remove prioritizing of merging small parts over merging big parts, since it doesn't work as expected
...
The prioritizing could lead to big merge starvation, which could end up in too big number of parts that must be merged into big parts.
Multiple big merges may be initiated after the migration from v1.39.0 or v1.39.1. It is OK - these merges should be finished soon,
which should return CPU and disk IO usage to normal levels.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/618
2020-07-30 20:02:22 +03:00
Aliaksandr Valialkin
b8303afcd8
lib/storage: improve prioritizing of data ingestion over querying
...
Prioritize also small merges over big merges.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
2020-07-23 01:40:38 +03:00
Aliaksandr Valialkin
7d0743422b
lib/storage: properly calculate global metrics in UpdateStats()
2020-07-23 00:35:31 +03:00