Commit graph

499 commits

Author SHA1 Message Date
Aliaksandr Valialkin
288620ca40
lib/storage: initial support for multi-level downsampling
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/36

Based on https://github.com/valyala/VictoriaMetrics/pull/203
2021-12-15 16:42:44 +02:00
Aliaksandr Valialkin
2847c84a7b
Merge branch 'public-single-node' into pmm-6401-read-prometheus-data-files 2021-12-15 16:41:56 +02:00
Aliaksandr Valialkin
4ff647137a
lib/storage: deduplicate samples more thoroughly
Previously some duplicate samples may be left on disk for time series with high churn rate.
This may result in higher disk space usage.
2021-12-15 15:59:58 +02:00
Aliaksandr Valialkin
92070cbb67
lib/storage: return dedup interval in milliseconds from GetDedupInterval()
This removes duplicate .Milliseconds() calls after GetDedupInterval() calls.
2021-12-15 13:26:38 +02:00
Aliaksandr Valialkin
1d20a19c7d
lib/storage: explicitly pass dedupInterval to DeduplicateSamples() and deduplicateSamplesDuringMerge()
This improves the code readability and debuggability, since the output of these functions
stops depending on global state.
2021-12-14 20:49:12 +02:00
Aliaksandr Valialkin
6a64823581
Merge branch 'public-single-node' into pmm-6401-read-prometheus-data-files 2021-12-14 19:57:21 +02:00
Aliaksandr Valialkin
e1a715b0f5
lib/storage: convert alternate regexps into Graphite wildcards inside __graphite__ pseudo-label
For example, `{__graphite__=~"foo.(bar|baz)"}` is automatically converted to `{__graphite__=~"foo.{bar,baz}"}` before execution.
This allows using multi-value Grafana template variables such as `{__graphite__=~"foo.($app)"}`.
2021-12-14 19:51:49 +02:00
Aliaksandr Valialkin
45d082bbe2
app/vminsert: add -maxLabelValueLen command-line flag
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1908
2021-12-06 11:40:34 +02:00
Aliaksandr Valialkin
b94e986710
Merge branch 'public-single-node' into pmm-6401-read-prometheus-data-files 2021-12-02 15:03:46 +02:00
Aliaksandr Valialkin
7275ebf91a
app/vmstorage: export vm_cache_size_max_bytes metrics for determining capacity of various caches
The vm_cache_size_max_bytes metric can be used for determining caches which reach their capacity via the following query:

   vm_cache_size_bytes / vm_cache_size_max_bytes > 0.9
2021-12-02 10:30:43 +02:00
Aliaksandr Valialkin
2fb5a6ca78
lib/storage: do not take into account -storage.minFreeDiskSpaceBytes during background merges 2021-12-01 11:02:36 +02:00
Aliaksandr Valialkin
d666755159
lib/storage: take into account -storage.minFreeDiskSpaceBytes when performing big merges
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269
2021-11-30 12:56:35 +02:00
Aliaksandr Valialkin
d07d2811d4
Merge branch 'public-single-node' into pmm-6401-read-prometheus-data-files 2021-10-22 19:41:51 +03:00
Aliaksandr Valialkin
53bb58ed2a
lib/storage: log a warning when the -storageDataPath has less than -storage.minFreeDiskSpaceBytes
This should improve the debuggability of the readonly feature.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1727
2021-10-19 23:59:13 +03:00
Aliaksandr Valialkin
001750c239
lib/storage: fix unaligned access on 32-bit architectures.
The bug has been introduced at a171916ef5
2021-10-08 19:43:03 +03:00
Aliaksandr Valialkin
307034fc2f
Merge branch 'public-single-node' into pmm-6401-read-prometheus-data-files 2021-10-08 16:10:22 +03:00
Aliaksandr Valialkin
cf5cbd1c70
app/{vminsert,vmstorage}: follow-up after a171916ef5
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269
2021-10-08 14:35:49 +03:00
Nikolay
4290b46e8c
Adds read-only mode for vmstorage node (#1680)
* adds read-only mode for vmstorage
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269

* changes order a bit

* moves isFreeDiskLimitReached var to storage struct
renames functions to be consistent
change protoparser api - with optional storage limit check for given openned storage

* renames freeSpaceLimit to ReadOnly
2021-10-08 14:35:48 +03:00
Aliaksandr Valialkin
c149132b14
Merge branch 'public-single-node' into pmm-6401-read-prometheus-data-files 2021-09-23 22:55:56 +03:00
Aliaksandr Valialkin
718eca33ab
lib/storage: properly handle {__name__=~"prefix(suffix1|suffix2)",other_label="..."} queries
They were broken in the commit 00cbb099b6

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1644
2021-09-23 21:48:51 +03:00
Aliaksandr Valialkin
163f2a46fd Merge branch 'public-single-node' into pmm-6401-read-prometheus-data-files 2021-09-15 18:18:59 +03:00
Aliaksandr Valialkin
8f685d81c6 lib/storage: follow up after 00cbb099b6 2021-09-14 14:16:25 +03:00
faceair
00cbb099b6
lib/storage: optimize convert multiple values regexp filter to composite tag filter (#1610)
* lib/storage: optimize convert multiple values regexp filter to composite tag filter

* Apply suggestions from code review

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-09-14 12:47:07 +03:00
Aliaksandr Valialkin
cebcb15ba4 lib/storage: verify that the tsidsFound contain the needed tsids in tests added at f4dead529f 2021-09-11 10:57:13 +03:00
Aliaksandr Valialkin
f4dead529f lib/storage: properly search series by multiple tag filters matching empty labels such as foo{bar=~"baz|",x=~"y|"}
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1601
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/395
2021-09-09 21:09:21 +03:00
Aliaksandr Valialkin
bb2d1128b8 Merge branch 'public-single-node' into pmm-6401-read-prometheus-data-files 2021-09-01 16:37:40 +03:00
Aliaksandr Valialkin
f77dde837a lib/promscrape: add the ability to limit the number of unique series per each scrape target
The number of series per target can be limited with the following options:

* Global limit with `-promscrape.maxSeriesPerTarget` command-line option.
* Per-target limit with `max_series: N` option in `scrape_config` section.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1561
2021-09-01 16:03:59 +03:00
Aliaksandr Valialkin
62857fc30e Merge branch 'public-single-node' into pmm-6401-read-prometheus-data-files 2021-09-01 12:49:13 +03:00
Aliaksandr Valialkin
ffc0ab1774 lib/{mergeset,storage}: improve the detection of the needed free space for background merge
This should prevent from possible out of disk space crashes during big merges.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1560
2021-08-25 09:35:44 +03:00
Aliaksandr Valialkin
b00732074c Merge branch 'public-single-node' into pmm-6401-read-prometheus-data-files 2021-08-15 23:51:06 +03:00
Aliaksandr Valialkin
4401464c22 all: add support for Prometheus staleness markers
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1526
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1509
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1530
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/845
2021-08-13 12:10:17 +03:00
Aliaksandr Valialkin
d05cac6c98 li/storage: re-use the per-day inverted index search code for searching in global index
This allows removing a big pile of outdated code for global index search.

This may help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1486
2021-07-30 10:31:37 +03:00
Aliaksandr Valialkin
8ee8660ac4 app/vmselect: follow-up for 626073bca8
* Rename -search.maxMetricsPointSearch to -search.maxSamplesPerQuery, so it is more consistent with the existing -search.maxSamplesPerSeries
* Move the -search.maxSamplesPerQuery from vmstorage to vmselect, so it could effectively limit the number of raw samples obtained from all the vmstorage nodes
* Document the -search.maxSamplesPerQuery in docs/CHANGELOG.md
2021-07-28 18:00:23 +03:00
Aliaksandr Valialkin
c2deee9911 lib/storage: yet another attempt to properly determine disk space shortage, which prevents from optimal merges
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1373
2021-07-27 12:04:50 +03:00
Aliaksandr Valialkin
8df8c414de Merge branch 'public-single-node' into pmm-6401-read-prometheus-data-files 2021-07-15 14:05:30 +03:00
Aliaksandr Valialkin
d472b03e34 lib/storage: make sure the second call to DeduplicateSamples and deduplicateSamplesDuringMerge doesnt change samples 2021-07-15 12:17:45 +03:00
Aliaksandr Valialkin
682662b2ae lib/storage: remove cache directory if it contains reset_cache_on_startup file
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1447
2021-07-13 17:58:51 +03:00
Aliaksandr Valialkin
4f80b2f230 lib/storage: properly limit the size of storage/date_metricID cache 2021-07-12 14:25:44 +03:00
Aliaksandr Valialkin
8ca2799478 lib/storage: properly determine when the deduplication is needed in needsDedup
Previously needsDedup() could return true if the de-duplication wasn't needed for the following case:

         d < interval
           /     \
   |        v | v        |
     interval   interval

Now it properly returns false for this case
2021-07-12 10:53:30 +03:00
Aliaksandr Valialkin
8b262d4ba7 lib/storage: periodically reset prefetchedMetricIDs cache in order to limit its size under high churn rate 2021-07-07 10:58:51 +03:00
Aliaksandr Valialkin
a7694092b8 Revert "lib/uint64set: allow reusing bucket16 structs inside uint64set.Set via uint64set.Release method"
This reverts commit 7c6d3981bf.

Reason for revert: high contention at bucket16Pool on systems with big number of CPU cores.
This slows down query processing significantly.
2021-07-06 18:21:35 +03:00
Aliaksandr Valialkin
8aa9bba9bd lib/{mergeset,storage}: switch from sync.Pool to chan-based pool for inmemoryPart objects
This should reduce memory usage on systems with big number of CPU cores,
since every inmemoryPart object occupies at least 64KB of memory and sync.Pool maintains
a separate pool inmemoryPart objects per each CPU core.

Though the new scheme for the pool worsens per-cpu cache locality, this should be amortized
by big sizes of inmemoryPart objects.
2021-07-06 16:28:41 +03:00
Aliaksandr Valialkin
7c6d3981bf lib/uint64set: allow reusing bucket16 structs inside uint64set.Set via uint64set.Release method
This reduces the load on memory allocator in Go runtime in production workload.
2021-07-06 15:35:03 +03:00
Aliaksandr Valialkin
f3acf065c9 lib/storage: consistency renaming: tagCache -> tagFiltersCache
This improves code readability
2021-07-06 11:03:51 +03:00
Aliaksandr Valialkin
0020b9f904 lib/workingsetcache: properly update stats for requests and cache misses
Previously the stats for cache misses could be improperly counted, because it had inflated cache misses
if the entry was missing in the curr cache, but was existing in the prev cache.

The same applies to cache requests - they were inflated if the entry was missing in the curr cache.
2021-07-06 10:53:32 +03:00
Aliaksandr Valialkin
d0c830039d lib/storage: tune cache sizes according to production workload 2021-07-05 15:16:11 +03:00
Aliaksandr Valialkin
43103be011 lib/{storage,mergeset}: increase cache timeout for data and index blocks from a minute to two minutes
One minute cache timeout result in slower queries in some production workloads where the interval
between query execution is in the range 1 minute - 2 minutes.
2021-07-05 15:16:11 +03:00
Aliaksandr Valialkin
9a83e9018d lib/storage: properly detect free disk space shortage during data merge
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1373
2021-07-02 17:40:54 +03:00
Aliaksandr Valialkin
c93cee8de8 lib/{mergeset,storage}: reduce the maximum lifetime for cached indexdb and data blocks from 2 minutes to a minute
This should reduce memory usage on a system with high number of active time series and a high churn rate.
One minute is enough for caching the blocks needed for repeated queries (e.g. alerting rules, recording rules and dashboard refreshes).
2021-06-29 19:57:07 +03:00
Aliaksandr Valialkin
5506cff76e lib/storage: put indexDBName into the key for dateTagFilter cache and for uselessTagFilters cache
This should prevent from stats overwriting when the previous indexdb is queried.
2021-06-29 12:40:05 +03:00