Commit graph

232 commits

Author SHA1 Message Date
Aliaksandr Valialkin
8055439fe4 lib/storage: properly detect free disk space shortage during data merge
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1373
2021-07-02 17:42:23 +03:00
Aliaksandr Valialkin
bced9ee666 lib/{mergeset,storage}: reduce the maximum lifetime for cached indexdb and data blocks from 2 minutes to a minute
This should reduce memory usage on a system with high number of active time series and a high churn rate.
One minute is enough for caching the blocks needed for repeated queries (e.g. alerting rules, recording rules and dashboard refreshes).
2021-06-29 19:57:53 +03:00
Aliaksandr Valialkin
b7c0b3dde3 lib/mergeset: switch from sync.Pool to a channel for a pool for inmemoryBlock structs
This should reduce memory usage for the pool on systems with big number of CPU cores.

The sync.Pool maintains per-CPU pools, so the total number of objects in the pool
is proportional to the number of available CPU cores. The channel limits the number
of pooled objects by its own capacity. This means smaller number of pooled objects on average.
2021-06-29 19:57:52 +03:00
Aliaksandr Valialkin
0efd37cec1 lib/{mergeset,storage}: reduce the number of fsync calls on data ingestion path on systems with many cpu cores
VictoriaMetrics maintains a buffer per CPU core for the ingested data. These buffers are flushed to disk every second.
These buffers are flushed to disk in parallel starting from the commit 56b6b893ce .
This resulted in increased write disk IO usage on systems with many cpu cores
as described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338#issuecomment-863046999 .

This commit merges the per-CPU buffers into bigger in-memory buffers before flushing them to disk.
This should reduce the rate of fsync syscalls and, consequently, the write disk IO on systems with many CPU cores.

This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
See also https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
2021-06-17 13:51:42 +03:00
Aliaksandr Valialkin
6865f3b497 Revert "lib/mergeset: remove a pool for inmemoryBlock structs"
This reverts commit 793fe39921.

Reason to revert: production testing revealed possible slowdown when registering big number of new time series
2021-05-28 01:11:22 +03:00
Aliaksandr Valialkin
7b33bc67a1 lib/mergeset: remove a pool for inmemoryBlock structs
The pool for inmemoryBlock struct doesn't give any performance gains in production workloads,
while it may result in excess memory usage for inmemoryBlock structs inside the pool during
background merge of indexdb.
2021-05-27 22:00:50 +03:00
Aliaksandr Valialkin
0fc857d363 lib/{mergeset,storage}: reduce the number of IFNO log messages like merged ... items across ... blocks in ... seconds
Log these messages if the merge takes more than 30 seconds instead of 10 seconds.
2021-05-23 14:15:49 +03:00
Aliaksandr Valialkin
e37e1b1e34 lib/{storage,mergeset}: fix unaligned 64-bit atomic operation panic for 32-bit architectures
The panic has been introduced in 56b6b893ce
2021-04-27 16:42:19 +03:00
Aliaksandr Valialkin
2d1d60118d lib/mergeset: split rows ingestion among multiple shards
This improves rows ingestion on systems with many CPU cores by reducing lock contention.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244

Thanks to @waldoweng for the original idea and draft implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1243
2021-04-27 15:45:11 +03:00
Aliaksandr Valialkin
ab8008d6d7 lib/{storage,mergeset}: remove empty directories on startup. Such directories can be left after unclean shutdown on NFS storage
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1142
2021-04-22 13:03:29 +03:00
Aliaksandr Valialkin
587132555f lib/mergeset: reduce memory usage for inmemoryBlock by using more compact items representation
This also should reduce CPU time spent by GC, since inmemoryBlock.items don't have pointers now,
so GC doesn't need visiting them.
2021-02-21 22:09:10 +02:00
Aliaksandr Valialkin
b8a5ee2e93 lib/{mergeset,storage}: allow merging smaller number of small parts
While this may increase CPU and disk IO usage needed for background merge,
this also recudes CPU usage during queries in production. This is because
such queries tend to read recently added data and it is better to have lower number
of parts for such data in order to reduce CPU usage.

This partially reverts ebf8da3730
2021-02-21 21:43:37 +02:00
Aliaksandr Valialkin
34195218e1 lib/{mergeset,storage}: do not use pools for indexBlock and inmemoryBlock during their caching, since this results in higher memory usage in production without any performance gains 2021-02-21 21:43:37 +02:00
Aliaksandr Valialkin
9566015a36 Revert "lib/mergeset: tune lifetime for entries inside block caches"
This reverts commit 458c89324d.

Production testing revealed zero improvements for memory usage with reduced lifetime for entries in block caches.
2021-02-17 20:42:15 +02:00
Aliaksandr Valialkin
35a23234ca lib/mergeset: tune lifetime for entries inside block caches
This should reduce memory usage in general case without significant CPU usage increase
2021-02-16 18:12:32 +02:00
Aliaksandr Valialkin
500acb958c lib/mergeset: clarify comments in the code a bit 2021-02-16 18:03:11 +02:00
Aliaksandr Valialkin
d788876a7a lib/mergeset: remove unused code after a4140de9e6 2021-02-16 13:40:29 +02:00
Aliaksandr Valialkin
afa9cf9c57 lib/mergeset: remove dead code left after a4140de9e6 2021-02-09 16:51:09 +02:00
Aliaksandr Valialkin
7b7963a77f lib/mergeset: unconditionally cache indexdb blocks
Production workloads show that indexdb blocks must be cached unconditionally for reducing CPU usage.
This shouldn't increase memory usage too much, since unused blocks are removed from the cache every two minutes.
2021-02-09 00:49:59 +02:00
Aliaksandr Valialkin
e8ee9fa7fe app/vmstorage: export missing vm_cache_size_bytes metrics for indexdb and data caches 2021-02-09 00:49:58 +02:00
faceair
cf76d8dd79 lib/mergeset: add missing shouldCacheBlock (#1019) 2021-01-15 11:47:25 +02:00
Aliaksandr Valialkin
edbe35509e lib/{storage,mergeset}: tune background merge process in order to reduce CPU usage and disk IO usage 2020-12-18 20:01:20 +02:00
Aliaksandr Valialkin
1a237c6903 all: properly handle CPU limits set on the host system/container
This can reduce memory usage on systems with enabled CPU limits.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/946
2020-12-08 21:07:03 +02:00
Aliaksandr Valialkin
8f8727cb65 lib/mergeset: tune the number of rawItemsBlocks to merge at once
512 blocks give higher ingestion performance and slightly lower memory usage
2020-11-25 21:53:15 +02:00
Aliaksandr Valialkin
8fcd87a6a5 lib/mergeset: help GC by removing refereces to slices in inmemoryBlock.Reset 2020-11-25 21:20:02 +02:00
Aliaksandr Valialkin
7f3e884a31 all: spelling fix: superflouos->superfluous. This is a follow-up for 0acdab3ab9 2020-11-24 12:42:04 +02:00
Aliaksandr Valialkin
f4fd917e4f lib/fs: replace fs.OpenReaderAt with fs.MustOpenReaderAt
All the callers for fs.OpenReaderAt expect that the file will be opened.
So it is better to log fatal error inside fs.MustOpenReaderAt instead of leaving this to the caller.
2020-11-23 09:57:30 +02:00
Aliaksandr Valialkin
c736339843 lib/{storage,mergeset}: clean cached index blocks and inmemory blocks more aggressively
Previously such blocks were cleaned after they weren't accessed during 10 minutes.
Now they are cleaned after one minute of missing access. This should reduce memory usage in general case.
2020-11-04 16:44:15 +02:00
Aliaksandr Valialkin
3abbb38254 lib/{mergeset,storage}: compare errors with errors.Is() 2020-09-17 03:03:10 +03:00
Aliaksandr Valialkin
ddb3519e17 lib/{mergeset,storage}: code prettifying 2020-09-17 02:06:37 +03:00
Aliaksandr Valialkin
edf3ee2a9b lib: dump compressed block contents on error during decompression
This should improve detecting root cause for https://github.com/facebook/zstd/issues/2222
2020-08-15 14:51:14 +03:00
Aliaksandr Valialkin
29bbab0ec9 lib/storage: remove prioritizing of merging small parts over merging big parts, since it doesn't work as expected
The prioritizing could lead to big merge starvation, which could end up in too big number of parts that must be merged into big parts.

Multiple big merges may be initiated after the migration from v1.39.0 or v1.39.1. It is OK - these merges should be finished soon,
which should return CPU and disk IO usage to normal levels.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/618
2020-07-30 20:02:22 +03:00
Aliaksandr Valialkin
b8303afcd8 lib/storage: improve prioritizing of data ingestion over querying
Prioritize also small merges over big merges.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
2020-07-23 01:40:38 +03:00
Aliaksandr Valialkin
6afdcf8a20 lib/mergeset: properly calculate global metrics in UpdateStats()
Previously these metrics could be calculated multiple times for multiple mergeset.Table instances.
2020-07-23 00:35:29 +03:00
Aliaksandr Valialkin
d962568e93 all: use %w instead of %s for wrapping errors in fmt.Errorf
This will simplify examining the returned errors such as httpserver.ErrorWithStatusCode .
See https://blog.golang.org/go1.13-errors for details.
2020-06-30 23:33:46 +03:00
Aliaksandr Valialkin
a72f18e821 lib/{storage,mergeset}: further tuning of compression levels depending on block size
This should improve performance for querying newly added data, since it can be unpacked faster.
2020-05-15 13:12:28 +03:00
Aliaksandr Valialkin
6838fa876c lib/mergeset: tune compression levels in order to improve ingestion performance a bit 2020-05-15 12:12:15 +03:00
Aliaksandr Valialkin
3845420a8f lib: extract common code for returning fast unix timestamp into lib/fasttime 2020-05-14 23:06:50 +03:00
Aliaksandr Valialkin
7e831741f9 lib/{storage,mergeset}: return dst on error from unmarshalBlockHeaders, so it could be reused 2020-05-14 15:32:23 +03:00
Aliaksandr Valialkin
f442d81648 lib/{storage,mergeset}: cleanup: remove unused partSearch.indexBlockReuse 2020-05-14 14:03:15 +03:00
Aliaksandr Valialkin
0b2f678d8e lib/{storage,mergeset}: make sure that requests and misses cache counters never go down 2020-04-10 14:44:52 +03:00
Aliaksandr Valialkin
e3b18ca1ab lib/mergeset: skip createing temporary part objects when merging source inmemory parts
This should reduce CPU usage when adding new entries to inverted index.
This should alos prevent from creating stalled cleaner goroutines for the created temporary parts,
since they were never closed.

This should fix the following issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/316 .
2020-02-13 14:09:13 +02:00
Aliaksandr Valialkin
347aaba79d lib/{storage,mergeset}: use time.Ticker instead of time.Timer where appropriate
It has been appeared that time.Timer was used in places where time.Ticker must be used instead.
This could result in blocked goroutines as in the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/316 .
2020-02-13 13:21:48 +02:00
Aliaksandr Valialkin
da19fffa08 all: rename ReadAt* to MustReadAt* in order to dont clash with io.ReaderAt 2020-01-30 15:16:16 +02:00
Aliaksandr Valialkin
cb2a2f281f lib/mergeset: properly update lastAccesstime in indexBlockCache entries
This is a follow-up for 6665f10e7b
2020-01-29 21:21:01 +02:00
Aliaksandr Valialkin
ea53a21b02 all: consistently log durations in seconds with millisecond precision
This should improve logs readability
2020-01-22 18:35:24 +02:00
Aliaksandr Valialkin
62b041e90a lib/{mergeset,storage}: properly update lastAccessTime in index and data block cache entries 2020-01-20 15:00:10 +02:00
Aliaksandr Valialkin
caffb0cd01 lib/{mergeset,storage}: fix uint64 counters alignment for 32-bit architectures (GOARCH=386, GOARCH=arm) 2020-01-14 22:47:42 +02:00
Aliaksandr Valialkin
b03ccbf6f7 lib/{storage,mergeset}: gradually remove stale entries from block cache and index caches
This should reduce memory usage in the long run when old blocks and indexes
aren't accessed anymore.
2020-01-14 21:38:29 +02:00
Aliaksandr Valialkin
5d2ff573aa app/vmselect/promql: allow negative offsets
Updates https://github.com/prometheus/prometheus/issues/6282
2019-12-11 00:57:51 +02:00
Aliaksandr Valialkin
3694efd005 lib/{mergeset,storage}: log info message when both source and destination part paths from txn are missing during startup
This is expected condition after unclean shutdown (OOM, hard reset, `kill -9`) on NFS disk.
2019-12-09 15:45:23 +02:00
Aliaksandr Valialkin
639967db59 lib/{mergeset,storage}: make sure pending transaction deletions are finished before and after runTransactions call.
`runTransactions` call issues async deletions for transaction files. The previously issued transaction deletions
can race with the next call to `runTransactions`. Prevent this by waiting until all the pending transaction
deletions are funished in the beginning of `runTransactions`. Also make sure that all the pending transaction
deletions are finished before returning from `runTransactions`.
2019-12-04 21:40:52 +02:00
Aliaksandr Valialkin
b9616c017f lib/{mergeset,storage}: remove transaction files only after the mentioned dirs are really removed
This should fix the issue on NFS when incompletely removed dirs may be left
after unclean shutdown (OOM, kill -9, hard reset, etc.), while the corresponding transaction
files are already removed.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162
2019-12-02 21:34:37 +02:00
Aliaksandr Valialkin
26ffc77622 lib/{storage,mergeset}: create missing partition directories after restoring from backups
Backup tools could skip empty directories. So re-create such directories on the first run.
2019-11-02 02:27:19 +02:00
Aliaksandr Valialkin
5d2276dbf7 lib/{mergeset,storage}: limit the maximum number of concurrent merges; leave smaller number of parts during final merge 2019-10-29 12:45:37 +02:00
Aliaksandr Valialkin
5b01b7fb01 all: add support for GOARCH=386 and fix all the issues related to 32-bit architectures such as GOARCH=arm
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212
2019-10-17 18:27:49 +03:00
Aliaksandr Valialkin
9fce611fbb lib/mergeset: reduce the maximum number of cached blocks, since there are reports on OOMs due to too big caches
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/189
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/195
2019-09-30 12:27:30 +03:00
Aliaksandr Valialkin
adc18c3ee6 lib/{mergeset,storage}: do not cache inverted index blocks containing tag->metricIDs items
This should reduce the amounts of used RAM during queries with filters over big number of time series.
2019-09-25 13:48:24 +03:00
Aliaksandr Valialkin
de0e4eee2c lib/storage: create and use lib/uint64set instead of map[uint64]struct{}
This should improve inverted index search performance for filters matching big number of time series,
since `lib/uint64set.Set` is faster than `map[uint64]struct{}` for both `Add` and `Has` calls.
See the corresponding benchmarks in `lib/uint64set`.
2019-09-24 21:18:04 +03:00
Aliaksandr Valialkin
67a2bcb98a lib/{storage,mergeset}: verify PrepareBlock callback results
Do not touch the first and the last item passed to PrepareBlock
in order to preserve sort order of mergeset blocks.
2019-09-23 20:46:33 +03:00
Aliaksandr Valialkin
3304dc1e85 lib/mergeset: detect whether we are in test by executable suffix 2019-09-22 23:12:35 +03:00
Aliaksandr Valialkin
7d13c31566 lib/{storage,mergeset}: merge tag->metricID rows into tag->metricIDs rows for common tag values
This should improve lookup performance if the same `label=value` pair exists
in big number of time series.
This should also reduce memory usage for mergeset data cache, since `tag->metricIDs` rows
occupy less space than the original `tag->metricID` rows.
2019-09-20 22:06:23 +03:00
Aliaksandr Valialkin
bf8505353a lib/mergeset: rename misleading mergeSmallParts to mergeExistingParts 2019-09-19 21:48:36 +03:00
Aliaksandr Valialkin
ebbef20535 lib/mergeset: use sort.IsSorted instead of sort.SliceIsSorted in inmemoryBlock.isSorted in order to reduce memory allocations 2019-09-19 20:13:54 +03:00
Aliaksandr Valialkin
410f993bf6 lib/mergeset: fill partHeader.firstItem on first block flush 2019-09-19 17:48:22 +03:00
Aliaksandr Valialkin
41ef6b060e lib/mergeset: properly check for sorted block headers
Fix a typo for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/181
2019-09-13 21:59:38 +03:00
Aliaksandr Valialkin
568ff61dcf lib/mergeset: dynamically calculate the maximum number of items per part, which can be cached in OS page cache 2019-09-09 11:42:45 +03:00
Aliaksandr Valialkin
0b0153ba3d lib/storage: invalidate tagFilters -> TSIDS cache when newly added index data becomes visible to search
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/163
2019-08-29 15:08:44 +03:00
Aliaksandr Valialkin
604a4312f9 all: port to FreeBSD on GOARCH=amd64 2019-08-28 01:46:09 +03:00
Aliaksandr Valialkin
39f3f3a517 lib: move common code for creating flock.lock file into fs.CreateFlockFile 2019-08-13 01:46:20 +03:00
Aliaksandr Valialkin
73f866d874 lib/fs: atomically create file with the given contents on WriteFileAtomically
This should prevent from `transaction` and `metadata.json` files corruption
on unclean shutdown such as OOM, `kill -9`, power loss, etc.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/148
2019-08-12 15:02:04 +03:00
Aliaksandr Valialkin
ba8195c58e all: consistency renaming: bytesSize -> sizeBytes 2019-07-10 00:47:42 +03:00
Aliaksandr Valialkin
41f512af1c all: add vm_data_size_bytes metrics for easy monitoring of on-disk data size and on-disk inverted index size 2019-07-04 19:43:04 +03:00
Aliaksandr Valialkin
b6ea1a7d5e lib/mergeset: make fmt 2019-06-29 14:25:46 +03:00
Aliaksandr Valialkin
2257dcd278 lib/mergeset: speed up binarySearchKey by skipping the first item during binary search 2019-06-29 13:49:32 +03:00
Aliaksandr Valialkin
fb9358635d lib/storage: mention source parts on merge error
This should improve determining broken source part.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/76
2019-06-24 14:09:46 +03:00
Aliaksandr Valialkin
18d6f293f7 lib/fs: consolidate *RemoveAll* funcs into a single MustRemoveAll func
The func syncs parent dir in order to persist directory removal
in the event of power loss
2019-06-12 01:55:18 +03:00
Aliaksandr Valialkin
51e2e255a6 lib/fs: consistency renaming SyncPath -> MustSyncPath, since it doesnt return error 2019-06-11 23:13:45 +03:00
Aliaksandr Valialkin
b491045a4b lib/{storage,mergeset}: sync filenames inside part when finalizing the part
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/63
2019-06-11 21:51:19 +03:00
Aliaksandr Valialkin
3437c30180 all: try hard removing directory with contents
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/61
2019-06-11 01:58:08 +03:00
Aliaksandr Valialkin
bdf696ef18 all: fix misspellings 2019-05-25 21:51:24 +03:00
Aliaksandr Valialkin
1836c415e6 all: open-sourcing single-node version 2019-05-23 00:18:06 +03:00