Aliaksandr Valialkin
80fc3fda07
lib/storage: follow-up for 38bf5fc136
2022-01-05 16:02:17 +02:00
weng zhao
1e0fe615ad
vmstorage: fix query like {foo=~"bar|"}
return extra timeseries cause by negative filter transformation malfunction ( #2032 )
...
1. L2749 make kb.B remain the value of comonPrefix instead of tf.prefix
2. L2762 avoid change tf.value from "bar|" to ".+r|"
2022-01-05 15:57:54 +02:00
Nikolay
6cdc934c3d
adds restore.lock ( #1988 )
...
* adds restore.lock
it must prevent from running storage after incomplete restore process
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1958
* return back flock file deletion
* Apply suggestions from code review
* wip
* docs/CHANGELOG.md: document https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1958
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-12-22 13:10:56 +02:00
Aliaksandr Valialkin
727797a6fd
all: use logger.WithThrottler() where appropriate
2021-12-21 17:10:54 +02:00
Aliaksandr Valialkin
053e85ff3d
all: typo fix: unexected -> unexpected
2021-12-20 17:40:13 +02:00
Aliaksandr Valialkin
f22aab411b
lib/storage: properly update per-part min_dedup_interval
file contents after merge
...
Previously 0s was always written even if -dedup.minScrapeInterval was set to non-zero value
This is a follow-up for 4ff647137a
2021-12-17 20:12:18 +02:00
Aliaksandr Valialkin
d36fdbe537
lib/storage: deduplicate samples more thoroughly
...
Previously some duplicate samples may be left on disk for time series with high churn rate.
This may result in higher disk space usage.
2021-12-15 16:00:30 +02:00
Aliaksandr Valialkin
bc3923111b
lib/storage: return dedup interval in milliseconds from GetDedupInterval()
...
This removes duplicate .Milliseconds() calls after GetDedupInterval() calls.
2021-12-15 13:27:27 +02:00
Aliaksandr Valialkin
cdfe854c9b
lib/storage: explicitly pass dedupInterval to DeduplicateSamples() and deduplicateSamplesDuringMerge()
...
This improves the code readability and debuggability, since the output of these functions
stops depending on global state.
2021-12-14 20:52:29 +02:00
Aliaksandr Valialkin
c922c7af9a
lib/storage: convert alternate regexps into Graphite wildcards inside __graphite__
pseudo-label
...
For example, `{__graphite__=~"foo.(bar|baz)"}` is automatically converted to `{__graphite__=~"foo.{bar,baz}"}` before execution.
This allows using multi-value Grafana template variables such as `{__graphite__=~"foo.($app)"}`.
2021-12-14 19:55:59 +02:00
Aliaksandr Valialkin
9aa9b081a4
app/vminsert: add -maxLabelValueLen
command-line flag
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1908
2021-12-06 11:42:24 +02:00
Aliaksandr Valialkin
ab4be24397
app/vmstorage: export vm_cache_size_max_bytes metrics for determining capacity of various caches
...
The vm_cache_size_max_bytes metric can be used for determining caches which reach their capacity via the following query:
vm_cache_size_bytes / vm_cache_size_max_bytes > 0.9
2021-12-02 10:30:01 +02:00
Aliaksandr Valialkin
2e43cd9d62
lib/storage: do not take into account -storage.minFreeDiskSpaceBytes during background merges
2021-12-01 12:30:03 +02:00
Aliaksandr Valialkin
71c0f7cce3
lib/storage: take into account -storage.minFreeDiskSpaceBytes
when performing big merges
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269
2021-11-30 12:56:53 +02:00
Aliaksandr Valialkin
93511b4be7
lib/storage: log a warning when the -storageDataPath has less than -storage.minFreeDiskSpaceBytes
...
This should improve the debuggability of the readonly feature.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1727
2021-10-19 23:58:09 +03:00
Aliaksandr Valialkin
a7a1305395
lib/storage: fix unaligned access on 32-bit architectures.
...
The bug has been introduced at a171916ef5
2021-10-08 19:38:20 +03:00
Aliaksandr Valialkin
4fddcf4c83
app/{vminsert,vmstorage}: follow-up after a171916ef5
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269
2021-10-08 14:09:51 +03:00
Nikolay
a171916ef5
Adds read-only mode for vmstorage node ( #1680 )
...
* adds read-only mode for vmstorage
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269
* changes order a bit
* moves isFreeDiskLimitReached var to storage struct
renames functions to be consistent
change protoparser api - with optional storage limit check for given openned storage
* renames freeSpaceLimit to ReadOnly
2021-10-08 12:52:56 +03:00
Aliaksandr Valialkin
d15d036a5a
lib/storage: properly handle {__name__=~"prefix(suffix1|suffix2)",other_label="..."}
queries
...
They were broken in the commit 00cbb099b6
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1644
2021-09-23 21:52:31 +03:00
Aliaksandr Valialkin
1493461244
lib/storage: follow up after 00cbb099b6
2021-09-14 14:23:02 +03:00
faceair
61a51f7c15
lib/storage: optimize convert multiple values regexp filter to composite tag filter ( #1610 )
...
* lib/storage: optimize convert multiple values regexp filter to composite tag filter
* Apply suggestions from code review
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-09-14 14:23:01 +03:00
Aliaksandr Valialkin
6d6cf1b6e0
lib/storage: verify that the tsidsFound contain the needed tsids in tests added at f4dead529f
2021-09-11 11:02:56 +03:00
Aliaksandr Valialkin
c2f37f049b
lib/storage: properly search series by multiple tag filters matching empty labels such as foo{bar=~"baz|",x=~"y|"}
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1601
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/395
2021-09-09 21:12:53 +03:00
Aliaksandr Valialkin
c4df601f43
lib/promscrape: add the ability to limit the number of unique series per each scrape target
...
The number of series per target can be limited with the following options:
* Global limit with `-promscrape.maxSeriesPerTarget` command-line option.
* Per-target limit with `max_series: N` option in `scrape_config` section.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1561
2021-09-01 16:08:12 +03:00
Aliaksandr Valialkin
b885bd9b7d
lib/{mergeset,storage}: improve the detection of the needed free space for background merge
...
This should prevent from possible out of disk space crashes during big merges.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1560
2021-08-25 10:01:09 +03:00
Aliaksandr Valialkin
c1f81f08d4
all: add support for Prometheus staleness markers
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1526
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1509
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1530
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/845
2021-08-13 12:13:15 +03:00
Aliaksandr Valialkin
c473d8ffe1
li/storage: re-use the per-day inverted index search code for searching in global index
...
This allows removing a big pile of outdated code for global index search.
This may help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1486
2021-07-30 10:28:20 +03:00
Aliaksandr Valialkin
1950f57316
lib/storage: yet another attempt to properly determine disk space shortage, which prevents from optimal merges
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1373
2021-07-27 12:03:31 +03:00
Aliaksandr Valialkin
9d3f9da5ad
lib/storage: make sure the second call to DeduplicateSamples and deduplicateSamplesDuringMerge doesnt change samples
2021-07-15 12:18:38 +03:00
Aliaksandr Valialkin
e992754e79
lib/storage: remove cache directory if it contains reset_cache_on_startup file
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1447
2021-07-13 17:59:51 +03:00
Aliaksandr Valialkin
3f705fe8d7
lib/storage: properly limit the size of storage/date_metricID
cache
2021-07-12 14:25:28 +03:00
Aliaksandr Valialkin
ef3c58d7a3
lib/storage: properly determine when the deduplication is needed in needsDedup
...
Previously needsDedup() could return true if the de-duplication wasn't needed for the following case:
d < interval
/ \
| v | v |
interval interval
Now it properly returns false for this case
2021-07-12 10:54:51 +03:00
Aliaksandr Valialkin
74ace9340d
lib/storage: periodically reset prefetchedMetricIDs cache in order to limit its size under high churn rate
2021-07-07 10:59:39 +03:00
Aliaksandr Valialkin
a846febc89
Revert "lib/uint64set: allow reusing bucket16 structs inside uint64set.Set via uint64set.Release method"
...
This reverts commit 7c6d3981bf
.
Reason for revert: high contention at bucket16Pool on systems with big number of CPU cores.
This slows down query processing significantly.
2021-07-06 18:26:56 +03:00
Aliaksandr Valialkin
b805a675f3
lib/{mergeset,storage}: switch from sync.Pool to chan-based pool for inmemoryPart objects
...
This should reduce memory usage on systems with big number of CPU cores,
since every inmemoryPart object occupies at least 64KB of memory and sync.Pool maintains
a separate pool inmemoryPart objects per each CPU core.
Though the new scheme for the pool worsens per-cpu cache locality, this should be amortized
by big sizes of inmemoryPart objects.
2021-07-06 16:33:25 +03:00
Aliaksandr Valialkin
d8e7c1ef27
lib/uint64set: allow reusing bucket16 structs inside uint64set.Set via uint64set.Release method
...
This reduces the load on memory allocator in Go runtime in production workload.
2021-07-06 16:33:24 +03:00
Aliaksandr Valialkin
22c6e64bbc
lib/storage: consistency renaming: tagCache -> tagFiltersCache
...
This improves code readability
2021-07-06 11:03:30 +03:00
Aliaksandr Valialkin
21abf487c3
lib/workingsetcache: properly update stats for requests and cache misses
...
Previously the stats for cache misses could be improperly counted, because it had inflated cache misses
if the entry was missing in the curr cache, but was existing in the prev cache.
The same applies to cache requests - they were inflated if the entry was missing in the curr cache.
2021-07-06 10:54:38 +03:00
Aliaksandr Valialkin
51516b96e6
lib/storage: tune cache sizes according to production workload
2021-07-05 15:14:45 +03:00
Aliaksandr Valialkin
f12f97daa1
lib/{storage,mergeset}: increase cache timeout for data and index blocks from a minute to two minutes
...
One minute cache timeout result in slower queries in some production workloads where the interval
between query execution is in the range 1 minute - 2 minutes.
2021-07-05 14:25:59 +03:00
Aliaksandr Valialkin
8055439fe4
lib/storage: properly detect free disk space shortage during data merge
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1373
2021-07-02 17:42:23 +03:00
Aliaksandr Valialkin
bced9ee666
lib/{mergeset,storage}: reduce the maximum lifetime for cached indexdb and data blocks from 2 minutes to a minute
...
This should reduce memory usage on a system with high number of active time series and a high churn rate.
One minute is enough for caching the blocks needed for repeated queries (e.g. alerting rules, recording rules and dashboard refreshes).
2021-06-29 19:57:53 +03:00
Aliaksandr Valialkin
609ad6d9bf
lib/storage: put indexDBName into the key for dateTagFilter cache and for uselessTagFilters cache
...
This should prevent from stats overwriting when the previous indexdb is queried.
2021-06-29 13:11:32 +03:00
Aliaksandr Valialkin
b84aea1e6e
lib/protoparser/clusternative: do not pool unmarshalWork structs, since they can occupy big amounts of memory (more than 100MB per each struct)
...
This should reduce memory usage for vmstorage under high ingestion rate when the vmstorage runs on a system with big number of CPU cores
2021-06-23 15:45:08 +03:00
Aliaksandr Valialkin
a22f37599b
lib/storage: tune tag filters search logic
...
Tune the logic according to the logs provided at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338#issuecomment-864293624
The previous logic had a race when multiple concurrent queries execute the same tag filter without prior stats.
This could result in incorrectly stored stats for such tag filter, which then could result in non-optimal sorting of tag filters
for further queries.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
2021-06-23 13:30:36 +03:00
Aliaksandr Valialkin
a207be3ffb
lib/storage: fix infinite loop introduced in aa9b56a046
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
2021-06-17 14:27:30 +03:00
Aliaksandr Valialkin
0efd37cec1
lib/{mergeset,storage}: reduce the number of fsync calls on data ingestion path on systems with many cpu cores
...
VictoriaMetrics maintains a buffer per CPU core for the ingested data. These buffers are flushed to disk every second.
These buffers are flushed to disk in parallel starting from the commit 56b6b893ce
.
This resulted in increased write disk IO usage on systems with many cpu cores
as described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338#issuecomment-863046999 .
This commit merges the per-CPU buffers into bigger in-memory buffers before flushing them to disk.
This should reduce the rate of fsync syscalls and, consequently, the write disk IO on systems with many CPU cores.
This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
See also https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
2021-06-17 13:51:42 +03:00
Aliaksandr Valialkin
b133de1e37
lib/storage: move deletedMetricIDs set from indexDB to Storage
...
This makes consitent the list of deleted metricIDs when it is used from both the current indexDB and the previous indexDB (aka extDB).
This should fix the issue, which could lead to storing new samples under deleted metricIDs after indexDB rotation.
See more details at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1347#issuecomment-861232136 .
Thanks to @tangqipengleoo for the initial analysis and the pull request - https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1383 .
This commit resolves the issue in more generic way compared to https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1383 .
The downside of the commit is the deletedMetricIDs set isn't cleaned from the metricIDs outside the retention. It needs app restart.
This should be OK in most cases.
2021-06-15 15:07:54 +03:00
Aliaksandr Valialkin
ce10bdc82a
lib/storage: reset cache on disk during series deletion and during indexdb rotation
...
This should prevent from inconsistent behavior (aka partially missing data for some time series) after unclean shutdown.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1347
2021-06-11 12:54:36 +03:00
Aliaksandr Valialkin
eb335d2c29
lib/storage: consistency renaming: getMaxRawRowsPerPartition -> getMaxRawRowsPerShard
2021-06-11 10:52:31 +03:00
Aliaksandr Valialkin
d06c0e7a94
lib/storage: reduce the amounts of memory which can be occupied by rawRow items during data ingestion on a system with many CPU cores
2021-06-11 10:49:02 +03:00
Aliaksandr Valialkin
1e4a64844d
lib/storage: properly account the number of loops spent when matching for or suffixes
...
This may help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
2021-06-08 13:07:14 +03:00
Aliaksandr Valialkin
fc2565b4ee
lib/storage: reduce memory allocations when syncing dateMetricIDCache
2021-06-03 16:20:02 +03:00
Aliaksandr Valialkin
10b2855949
lib/storage: fix spelling typo: borken->broken
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1336
2021-05-27 12:09:17 +03:00
Aliaksandr Valialkin
1c16cbacf5
lib/storage: do not stop data ingestion on the first error in Storage.AddRows
...
Continue data ingestion for the rest of blocks.
2021-05-24 15:32:24 +03:00
Aliaksandr Valialkin
2601844de3
lib/storage: limit the number of rows per each block in Storage.AddRows()
...
This should reduce memory usage when ingesting big blocks or rows.
2021-05-24 15:32:24 +03:00
Aliaksandr Valialkin
95b735a883
lib/storage: allow filling all the rows up to their capacity in rawRowsShard.addRows
...
This should reduce memory usage a bit on data ingestion path
2021-05-24 15:32:24 +03:00
Aliaksandr Valialkin
402a8ca710
lib/storage: do not populate MetricID->MetricName cache during data ingestion
...
This cache isn't needed during data ingestion, so there is no need in spending RAM on it.
This reduces RAM usage on data ingestion path by 30%
2021-05-24 03:06:40 +03:00
Aliaksandr Valialkin
0fc857d363
lib/{mergeset,storage}: reduce the number of IFNO log messages like merged ... items across ... blocks in ... seconds
...
Log these messages if the merge takes more than 30 seconds instead of 10 seconds.
2021-05-23 14:15:49 +03:00
Aliaksandr Valialkin
165a9f9200
app/vmstorage: add ability to limit series cardinality via -storage.maxHourlySeries
and -storage.maxDailySeries
command-line flags
2021-05-20 15:31:57 +03:00
Aliaksandr Valialkin
e228f479a5
lib/storage: remove possible data race when logging dropped labels
2021-05-20 11:54:06 +03:00
Aliaksandr Valialkin
2839055513
lib/storage: substitute GetTSDBStatusForDate with GetTSDBStatusWithFiltersForDate with nil tfss
2021-05-13 09:01:05 +03:00
Aliaksandr Valialkin
008ae25b3a
lib/storage: merge getTSDBStatusForDate with getTSDBStatusWithFiltersForDate
...
These functions are non-trivial, while their code has minimal differences.
It is better from maintainability PoV to merge these functions into a single function.
2021-05-12 18:01:08 +03:00
Nikolay
be87be34a4
Adds tsdb match filters ( #1282 )
...
* init work on filters
* init propose for status filters
* fixes tsdb status
adds test
* fix bug
* removes checks from test
2021-05-12 17:16:58 +03:00
Aliaksandr Valialkin
4e59cf4380
lib/storage: properly apply time range when matching an empty filter
...
It must match all the time series on the given time range.
Previously it was matched to all the time series without the restriction on the given time range.
2021-05-11 01:09:35 +03:00
Aliaksandr Valialkin
326cf83eb4
lib/storage: remove dead code after the commit 3ccf7ea20c
2021-05-08 20:15:59 +03:00
Aliaksandr Valialkin
4a5f45c77e
app/vminsert: add support for data ingestion via other vminsert nodes
2021-05-08 19:53:45 +03:00
Aliaksandr Valialkin
43c52ff77a
lib/storage: use WARNING instead of INFO level for logging dropped labels
2021-05-03 13:57:28 +03:00
Nikolay
62d58324dd
adds stalePartsRemover ( #1261 )
...
for new created partitions
2021-05-03 11:34:33 +03:00
Aliaksandr Valialkin
b43ba6d85f
lib/storage: log dropped labels if the number of labels in a metric exceeds -maxLabelsPerTimeseries
command-line flag value
...
This should improve debuggability for this case.
2021-05-01 09:29:56 +03:00
Aliaksandr Valialkin
e37e1b1e34
lib/{storage,mergeset}: fix unaligned 64-bit atomic operation
panic for 32-bit architectures
...
The panic has been introduced in 56b6b893ce
2021-04-27 16:42:19 +03:00
Aliaksandr Valialkin
2d1d60118d
lib/mergeset: split rows ingestion among multiple shards
...
This improves rows ingestion on systems with many CPU cores by reducing lock contention.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
Thanks to @waldoweng for the original idea and draft implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1243
2021-04-27 15:45:11 +03:00
Aliaksandr Valialkin
cba2d13456
lib/storage: typo fix in info message when deleting the part outside the configured retention
...
Previously the message was displaying incorrect retention time
2021-04-27 13:33:36 +03:00
Aliaksandr Valialkin
ab8008d6d7
lib/{storage,mergeset}: remove empty directories on startup. Such directories can be left after unclean shutdown on NFS storage
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1142
2021-04-22 13:03:29 +03:00
Aliaksandr Valialkin
72c41323fa
lib/storage: code clarification: remove caching the found metricName in searchMetricName
2021-04-13 10:20:35 +03:00
Aliaksandr Valialkin
59ccc43e3a
lib/storage: properly handle big time ranges passed to /api/v1/labels
and /api/v1/label/<labelName>/values
...
It should be faster querying all the labels and/or all the values instead of querying per-day labels/values on time ranges exceeding maxDaysForPerDaySearch
2021-04-07 13:33:10 +03:00
Aliaksandr Valialkin
512addc608
app/{vminsert,vmagent}: add -sortLabels
command-line option for sorting time series labels before ingesting them in the storage
...
This option can be useful when samples for the same time series are ingested with distinct order of labels.
For example, metric{k1="v1",k2="v2"} and metric{k2="v2",k1="v1"}.
2021-03-31 23:27:21 +03:00
Aliaksandr Valialkin
ae1c653d55
lib/storage: reduce memory usage when ingesting samples for the same time series with distinct order of labels
2021-03-31 21:22:40 +03:00
Aliaksandr Valialkin
940a547116
lib/storage: do not update b.nextIdx if no samples are removed because of retention
2021-03-29 12:13:38 +03:00
Aliaksandr Valialkin
9c2be144cf
app/vmselect: log the metric which trigger rollup result cache reset
...
This should help finding the source of stale metrics
2021-03-25 21:32:28 +02:00
Aliaksandr Valialkin
f971fe86cd
lib/storage: tune loopsCountPerMetricNameMatch according to production workload
2021-03-25 13:48:17 +02:00
Aliaksandr Valialkin
9947c65df3
lib/storage: do not reload metricName for the same metricID in Search.NextMetricBlock
...
This should speed up Search.NextMetricBlock a bit
2021-03-23 17:59:34 +02:00
Aliaksandr Valialkin
12ca0efc19
lib/storage: respect the deadline passed to Storage.SearchMetricNames
2021-03-22 23:03:00 +02:00
Aliaksandr Valialkin
40e47935e7
lib/storage: improve Search.NextMetricBlock performance by using MetricID->MetricName cache
2021-03-22 23:02:59 +02:00
Aliaksandr Valialkin
1618b0ca6d
lib/storage: tune loopsCountPerMetricNameMatch
2021-03-22 12:57:54 +02:00
Aliaksandr Valialkin
7503111feb
lib/storage: small code simplification after 6cee5338b2
2021-03-18 15:22:39 +02:00
Aliaksandr Valialkin
4443254fb9
lib/storage: prevent from infinite loop if {__graphite__="..."}
filter matches a metric name with *
, [
or {
chars
...
The idea has been borrowed from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1137
2021-03-18 14:57:39 +02:00
Aliaksandr Valialkin
b859fe7879
lib/storage: faster move heavy filters to the end of list
2021-03-17 15:11:56 +02:00
Aliaksandr Valialkin
41fe707bec
lib/storage: limit loops count in order to reduce max CPU usage during filter search
2021-03-17 00:48:44 +02:00
Aliaksandr Valialkin
e2a0c8bd72
lib/storage: do not modify filterLoopsCount stats with loopsCount stats
...
Such a modification can result in incorrect filter sorting later
2021-03-17 00:48:44 +02:00
Aliaksandr Valialkin
727ded9d4e
lib/storage: time series search optimization according to production workload profiling
...
Do not pass filter metric ids to getMetricIDsForTagFilter, since it has been appeared that this slows down
the function by multiple times when it finds big number of metricIDs (tens of millions).
2021-03-16 20:08:43 +02:00
Aliaksandr Valialkin
f4a44d6c0d
lib/storage: further tuning for time series search
2021-03-16 18:47:29 +02:00
Aliaksandr Valialkin
d074326970
app/vmstorage: add -logNewSeries
command-line flag for determining the source of series churn rate
2021-03-15 22:40:28 +02:00
Aliaksandr Valialkin
fc902734d9
lib/storage: further tuning for time series selector code
2021-03-15 20:32:37 +02:00
Aliaksandr Valialkin
1c26020080
lib/storage: tune per-day index search
2021-03-15 13:36:36 +02:00
Aliaksandr Valialkin
b2732575f7
lib/storage: further tune filters sorting logic
2021-03-12 00:51:35 +02:00
John Belmonte
edf39aa225
spelling fix: adjacent ( #1115 )
2021-03-09 09:19:16 +02:00
Aliaksandr Valialkin
c4a0bd5eac
lib/storage: go fmt
2021-03-08 11:59:31 +02:00
Aliaksandr Valialkin
c76a904bb0
lib/storage: tune loopsCount estimations in getMetricIDsForTagFilterSlow
...
The adjusted estmations give up to 2x lower median response times on 200qps /api/v1/query_range workload
2021-03-07 21:17:48 +02:00
Aliaksandr Valialkin
c8dde1fd6b
lib/storage: typo fix: umarshal -> unmarshal
2021-03-02 20:48:44 +02:00
Aliaksandr Valialkin
06676c8feb
lib/storage: consistency renaming: durationsPerDateTagFilterCache -> loopsPerDateTagFilterCache
2021-02-23 15:50:08 +02:00
faceair
b1409a7413
lib/storage: correct tagfilter match cost ( #1079 )
2021-02-22 21:54:37 +02:00
Aliaksandr Valialkin
587132555f
lib/mergeset: reduce memory usage for inmemoryBlock by using more compact items representation
...
This also should reduce CPU time spent by GC, since inmemoryBlock.items don't have pointers now,
so GC doesn't need visiting them.
2021-02-21 22:09:10 +02:00
Aliaksandr Valialkin
bd3bcdc43c
lib/storage: do not re-calculate stats for heavy tag filters
...
This should reduce the number of slow queries when stats for heavy tag filters was recalculated.
2021-02-21 21:43:37 +02:00
Aliaksandr Valialkin
b8a5ee2e93
lib/{mergeset,storage}: allow merging smaller number of small parts
...
While this may increase CPU and disk IO usage needed for background merge,
this also recudes CPU usage during queries in production. This is because
such queries tend to read recently added data and it is better to have lower number
of parts for such data in order to reduce CPU usage.
This partially reverts ebf8da3730
2021-02-21 21:43:37 +02:00
Aliaksandr Valialkin
34195218e1
lib/{mergeset,storage}: do not use pools for indexBlock and inmemoryBlock during their caching, since this results in higher memory usage in production without any performance gains
2021-02-21 21:43:37 +02:00
Aliaksandr Valialkin
10ccb92e4d
lib/storage: use composite index for a query with a name filter and negative filters
2021-02-18 18:57:45 +02:00
Aliaksandr Valialkin
418de71509
lib/storage: properly handle queries containing a filter on metric name plus any number of negative filters and zero non-negative filters
...
Example: `node_cpu_seconds_total{mode!="idle"}`
2021-02-18 18:33:05 +02:00
Aliaksandr Valialkin
628b8eb55e
lib/storage: prevent from running identical heavy tag filters in concurrent queries when measuring the number of loops for such tag filter.
...
This should reduce CPU usage spikes when measuring the number of loops needed for heavy tag filters
2021-02-18 14:01:18 +02:00
Aliaksandr Valialkin
fd41f070db
lib/storage: sort tag filters by the number of loops they need for the execution
...
This metric should work better than the filter execution duration, since it cannot be distorted
by concurrently running queries.
2021-02-18 12:52:29 +02:00
Aliaksandr Valialkin
9566015a36
Revert "lib/mergeset: tune lifetime for entries inside block caches"
...
This reverts commit 458c89324d
.
Production testing revealed zero improvements for memory usage with reduced lifetime for entries in block caches.
2021-02-17 20:42:15 +02:00
Aliaksandr Valialkin
b4c7d5992b
lib/storage: move composite filters to the top during sorting
2021-02-17 20:26:32 +02:00
Aliaksandr Valialkin
2005d8212a
lib/storage: return back filter arg to getMetricIDsForTagFilter function
...
The filter arg has been removed in the commit c7ee2fabb8
because it was preventing from caching the number of matching time series per each tf.
Now the cache contains duration for tf execution, so the filter shouldn't break such caching.
2021-02-17 19:33:15 +02:00
Aliaksandr Valialkin
83da939947
app/vmstorage: export vm_composite_filter_success_conversions_total and vm_composite_filter_missing_conversions_total metrics
2021-02-17 19:13:49 +02:00
Aliaksandr Valialkin
0c5bb2a397
lib/storage: revert ecf132933e
, since negative filters require the same amount of work as non-negative filters
2021-02-17 18:56:13 +02:00
Aliaksandr Valialkin
bbc287ea6a
lib/storage: tag filters sorting...
2021-02-17 18:56:12 +02:00
Aliaksandr Valialkin
f5e841c1e9
lib/storage: further tune tag filters sorting
2021-02-17 17:28:36 +02:00
Aliaksandr Valialkin
9d1f14d94c
lib/storage: tune the logic for sorting tag filters according the their exeuction times
2021-02-17 15:02:19 +02:00
Aliaksandr Valialkin
1a19702d92
lib/storage: make sure that nobody uses partitions when closing the table
2021-02-17 15:02:18 +02:00
Aliaksandr Valialkin
1ab7a8dfd5
lib/storage: more tuning for tag filters sorting according the time they take
2021-02-16 21:27:27 +02:00
Aliaksandr Valialkin
35a23234ca
lib/mergeset: tune lifetime for entries inside block caches
...
This should reduce memory usage in general case without significant CPU usage increase
2021-02-16 18:12:32 +02:00
Aliaksandr Valialkin
55952f8f2e
lib/storage: tune sorting for tag filters
2021-02-16 13:07:42 +02:00
Aliaksandr Valialkin
3eae03a337
lib/storage: increase match cost for negative tag filters, since they need to scan all the label pairs
2021-02-15 16:39:52 +02:00
Aliaksandr Valialkin
46e98ed490
vendor: update github.com/VictoriaMetrics/metrics from v1.13.1 to v1.14.0
...
The new version switches from log-linear histograms to log-based histograms,
which provide up to 3.6 times better accuracy.
2021-02-15 15:11:15 +02:00
Aliaksandr Valialkin
93ff866e91
lib/storage: reduce the minimum supported retention for inverted index from one month to one day
2021-02-15 15:11:15 +02:00
Aliaksandr Valialkin
6f3bbf21b8
lib/storage: sort tag filters by actual execution time instead of by the number of matching time series
...
This should improve query speed for queries with regexp filters matching small number of time series
on a label with big number of unique values.
2021-02-15 00:19:46 +02:00
Aliaksandr Valialkin
9e3993c585
lib/storage: properly hanle regexp tag filters with dots, which can be converted to full string match filters.
...
For example `{label=~"foo\.bar"}` should be converted to `{label="foo.bar"}`. Previously it has was mistakenly conveted to `{label="foo\.bar"}` .
This could result in missing time series for such tag filters.
2021-02-14 23:39:19 +02:00
Aliaksandr Valialkin
4e645a5fd3
lib/storage: return back in-order applying of tag filters, since concurrently executing tag filters can result in CPU and RAM waste in common case
2021-02-10 22:43:07 +02:00
Aliaksandr Valialkin
eeb92eb7fc
lib/storage: load metadata before loading indexdb, since indexdb depends on the metadata
2021-02-10 17:55:51 +02:00
Aliaksandr Valialkin
08f21d8761
app/vmstorage: export vm_composite_index_min_timestamp metric
2021-02-10 17:14:00 +02:00
Aliaksandr Valialkin
b27288f1b0
lib/storage: parallelize tag filters execution a bit
...
This should reduce execution time when a query contains multiple tag filters and each such filter matches big number of time series.
2021-02-10 16:32:27 +02:00
Aliaksandr Valialkin
4262c2f7c2
lib/storage: remove filter arg from getMetricIDsForDateTagFilter function
...
The `filter` arg breaks the logic for sorting tag filters by the matching metrics,
which may result in non-optimal performance during time series search.
2021-02-10 16:32:26 +02:00
Aliaksandr Valialkin
681dfb7485
lib/storage: fix inconsistencies in error logs
2021-02-10 16:32:21 +02:00
Aliaksandr Valialkin
148422bcba
lib/storage: disable composite index usage when querying old data
2021-02-10 14:57:58 +02:00
Aliaksandr Valialkin
17d5a03f6e
lib/storage: fix metric name match for composite filter
2021-02-10 01:27:34 +02:00
Aliaksandr Valialkin
fa0ef143b1
lib/storage: optimize search by label filters matching big number of time series
2021-02-10 00:46:17 +02:00
Aliaksandr Valialkin
5c9715a89a
lib/storage: reduce lock contention in dateMetricIDCache when registering new time series for the current day
...
This should help systems with multiple CPU cores
2021-02-10 00:04:19 +02:00
Aliaksandr Valialkin
9ed7789fef
optimize Storage.updatePerDateData()
2021-02-09 02:59:53 +02:00
Aliaksandr Valialkin
ea328b7391
lib/storage: skip deduplication when creating inmemory data blocks
...
The deduplication will be performed later during merging such blocks.
2021-02-09 02:26:16 +02:00
Aliaksandr Valialkin
7b7963a77f
lib/mergeset: unconditionally cache indexdb blocks
...
Production workloads show that indexdb blocks must be cached unconditionally for reducing CPU usage.
This shouldn't increase memory usage too much, since unused blocks are removed from the cache every two minutes.
2021-02-09 00:49:59 +02:00
Aliaksandr Valialkin
e8ee9fa7fe
app/vmstorage: export missing vm_cache_size_bytes
metrics for indexdb and data caches
2021-02-09 00:49:58 +02:00
Aliaksandr Valialkin
2dbb12563b
lib/storage: optimize data ingestion in the beginning of every hour
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1046
2021-02-08 12:04:51 +02:00
Aliaksandr Valialkin
c6a7288109
lib/storage: check for prevHourMetricIDs cache before falling back to checking for (date, metricID) entries during data ingestion
...
This should reduce possible CPU usage spikes at the beginning of every hour.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1046
2021-02-04 18:46:23 +02:00
Aliaksandr Valialkin
8249f13104
app/vmselect,lib/storage: properly parse Graphite selectors with inner wildcards
...
Example: foo{bar{x,yz},a[b-c],*de}
2021-02-03 20:16:28 +02:00
Aliaksandr Valialkin
2976ec89b8
lib/storage: fix a bug, which breaks searching by Graphite wildcard filters
2021-02-03 20:15:50 +02:00
Aliaksandr Valialkin
45a63a1da9
sort orSuffixes in tagFilter.InitFromGraphiteQuery for faster seeks
2021-02-03 20:15:37 +02:00
Aliaksandr Valialkin
4b930b9ffe
app/vmselect: add ability to set Graphite-compatible filter via {__graphite__="foo.*.bar"}
syntax
2021-02-03 01:17:19 +02:00
Aliaksandr Valialkin
3d79471fb3
lib/storage: inline marshalTags function and remove the code for handling duplicate tags from here
...
This is a follow-up commit after c8ea697db8
2021-01-12 15:20:22 +02:00
Aliaksandr Valialkin
719ad49adf
lib/storage: de-duplicate tags in MetricName.sortTags
...
Leave only the last tag among tags with duplicate keys. This is needed for reliable addition of extra_labels
during data ingestion. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1007 for details.
2021-01-12 15:03:22 +02:00
Aliaksandr Valialkin
d5a2b120e9
app/vmstorage: disable final merge by default, since it may result in high disk IO and CPU usage without measurable benefits such as increased query performance and reduced disk space usage
2021-01-08 00:12:12 +02:00
Aliaksandr Valialkin
ca8919e8e1
lib/storage: wait for pending transactions before closing and dropping the partition
...
This deflakes `make test-full-386` test
2020-12-25 11:46:47 +02:00
Aliaksandr Valialkin
c0511144e3
lib/storage: physically remove stale parts
...
Previously they were removed from partition struct, but the corresponding directories weren't removed.
This is a follow-up for 46dba00756
2020-12-24 16:56:09 +02:00
Aliaksandr Valialkin
66f8fbbb32
lib/storage: do not remove parts outside the configured retention if they are currently merged
...
These parts are automatically removed after the merge is complete.
2020-12-24 09:02:12 +02:00
Aliaksandr Valialkin
fa3bcf220f
lib/storage: remove stale parts as soon as they go outside the configured retention
...
Previously such parts could remain undeleted for long durations until they are merged with other parts.
This should help for `-retentionPeriod` values smaller than one month.
2020-12-22 19:55:07 +02:00
Aliaksandr Valialkin
6859737329
lib/storage: properly determine max rows for output part when merging small parts
2020-12-18 23:26:28 +02:00
Aliaksandr Valialkin
edbe35509e
lib/{storage,mergeset}: tune background merge process in order to reduce CPU usage and disk IO usage
2020-12-18 20:01:20 +02:00
Aliaksandr Valialkin
1a237c6903
all: properly handle CPU limits set on the host system/container
...
This can reduce memory usage on systems with enabled CPU limits.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/946
2020-12-08 21:07:03 +02:00
Aliaksandr Valialkin
9eca96596f
lib/storage: add missing (AccountID, ProjectID) in MetricName.String() test
2020-11-29 01:25:50 +02:00
Aliaksandr Valialkin
03002f1fe1
lib/storage: log metric name plus all its labels when the metric timestamp is outside the configured retention
...
This should simplify debugging when the source of the metric with unexpected timestamp must be found.
2020-11-25 14:44:29 +02:00
Aliaksandr Valialkin
4848a05924
lib/storage: typo fix in error message: allowd->allowed
2020-11-25 14:15:54 +02:00
Aliaksandr Valialkin
7f3e884a31
all: spelling fix: superflouos->superfluous. This is a follow-up for 0acdab3ab9
2020-11-24 12:42:04 +02:00
Aliaksandr Valialkin
f4fd917e4f
lib/fs: replace fs.OpenReaderAt with fs.MustOpenReaderAt
...
All the callers for fs.OpenReaderAt expect that the file will be opened.
So it is better to log fatal error inside fs.MustOpenReaderAt instead of leaving this to the caller.
2020-11-23 09:57:30 +02:00
Aliaksandr Valialkin
7d76fdedcc
app/vmselect: use storage.NewSearchQuery() instead of constructing storage.SearchQuery in-place
...
This should prevent from bugs when AccountID and ProjectID aren't set in storage.SearchQuery.
2020-11-16 18:04:33 +02:00
Aliaksandr Valialkin
a9287cf564
lib/storage: do not pass (accountID, projectID) to SearchTagNames(), since they are already passed via tfss
2020-11-16 18:04:30 +02:00
Aliaksandr Valialkin
ac7460abdd
lib/storage: add a test for Storage.SearchMetricNames
2020-11-16 13:18:48 +02:00
Aliaksandr Valialkin
eea1be0d5c
app/vmselect/graphite: add /tags/findSeries handler from Graphite Tags API
...
See https://graphite.readthedocs.io/en/stable/tags.html#exploring-tags
2020-11-16 12:52:23 +02:00
Aliaksandr Valialkin
4be5b5733a
app/vminsert: add /tags/tagSeries
and /tags/tagMultiSeries
handlers from Graphite Tags API
...
See https://graphite.readthedocs.io/en/stable/tags.html#adding-series-to-the-tagdb
2020-11-16 02:40:04 +02:00
Aliaksandr Valialkin
9ec964bff8
lib/storage: do not show artifically created label for reverse Graphite labels at /api/v1/labels page
2020-11-16 00:44:54 +02:00
immerrr again
1ec1a9f27f
app/vmstorage: add "/internal/force_flush" endpoint ( #893 )
2020-11-11 14:46:37 +02:00
Aliaksandr Valialkin
72011bcc45
app/vmselect: properly handle errors in GetLabelsOnTimeRange and GetLabelValuesOnTimeRange
2020-11-05 01:36:34 +02:00
Aliaksandr Valialkin
f2bff64933
lib/storage: remove data race when updating rowsDeleted
2020-11-05 01:19:30 +02:00
Aliaksandr Valialkin
c5e6c5f5a6
app/vmselect: optimize querying for /api/v1/labels
and /api/v1/label/<name>/values
when start
and end
args are set
2020-11-05 01:19:29 +02:00
Aliaksandr Valialkin
c736339843
lib/{storage,mergeset}: clean cached index blocks and inmemory blocks more aggressively
...
Previously such blocks were cleaned after they weren't accessed during 10 minutes.
Now they are cleaned after one minute of missing access. This should reduce memory usage in general case.
2020-11-04 16:44:15 +02:00
Aliaksandr Valialkin
c0bd208c77
lib/storage: do not report about the need of free disk space if parts cannot be merged due to too big write amplification
2020-11-03 15:32:09 +02:00
Aliaksandr Valialkin
1b9778a756
lib/storage: remove unneeded fmt.Sprintf
2020-11-03 14:21:04 +02:00
Aliaksandr Valialkin
f3a7e6f6e3
lib/storage: remove obsolete code
2020-11-02 19:17:30 +02:00
Aliaksandr Valialkin
901514be88
lib/storage: drop more samples outside the given retention during background merge
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/17
2020-10-31 20:44:47 +02:00
Aliaksandr Valialkin
7599e5c835
lib/storage: properly handle the case when key="__name__" is passed to MetricName.AddTag*
2020-10-20 20:09:52 +03:00
Aliaksandr Valialkin
9c5cd5a6c5
lib/storage: code cleanup after 5bfd4e6218
2020-10-20 16:10:53 +03:00
Aliaksandr Valialkin
0db7c2b500
app/vmstorage: support for -retentionPeriod
smaller than one month
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/173
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/17
2020-10-20 14:42:46 +03:00
Aliaksandr Valialkin
efb1989193
lib/storage: small code adjustements after d2960a20e0
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/781
2020-10-17 01:17:12 +03:00
faceair
8ddf089deb
evaluate the execution cost of all tag filters ( #824 )
...
* evaluate the execution cost of all tag filters
* fix suffixes typo
2020-10-17 01:13:20 +03:00
Aliaksandr Valialkin
d2e917d1cb
app/vmstorage: add vm_rows_added_to_storage_total
metric, which shows the total number of rows added to storage since app start
2020-10-09 13:36:17 +03:00
Aliaksandr Valialkin
b51fa16177
app/vmstorage: add -finalMergeDelay
command-line flag for configuring the delay before final merge for per-month partitions after no new data is ingested to it
2020-10-07 17:42:31 +03:00
Aliaksandr Valialkin
fd7dd5064a
lib/storage: code cleanup after 10f2eedee0
...
Remove the code that uses metricIDs caches for the current and the previous hour during metricIDs search,
since this code became unused after implementing per-day inverted index almost a year ago.
While at it, fix a bug, which could prevent from finding time series with names containing dots (aka Graphite-like names
such as `foo.bar.baz`).
2020-10-01 19:12:04 +03:00
Aliaksandr Valialkin
3ad7566a87
lib/storage: imrpove cache effectiveness for time series ids matching the given filters
...
Previously the maximum cache lifetime has been limited by 10 seconds. Now it is extended up to a day.
This should reduce CPU usage in the following cases:
* when querying recently added data with small churn rate for time series
* when querying historical data
2020-10-01 14:39:46 +03:00
Aliaksandr Valialkin
7c2e4e267a
lib/storage: allow set values higher than 1 for vm_merge_need_free_disk_space
if there are multiple partitions with deferred merges due to disk space shortage
2020-09-29 22:53:34 +03:00
Aliaksandr Valialkin
097a4c10dd
app/vmstorage: add metrics for determining whether background merges need additional disk space to complete
...
These metrics are:
* vm_small_merge_need_free_disk_space
* vm_big_merge_need_free_disk_space
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/686
2020-09-29 21:47:47 +03:00
Aliaksandr Valialkin
338a53ccf9
lib/storage: fix tests for 32-bit arches such as GOARCH=386 and GOARCH=arm
2020-09-29 13:10:37 +03:00
Aliaksandr Valialkin
ef416c72c2
lib/storage: fix 32-bit builds for GOARH=386 or GOARCH=arm
2020-09-29 12:42:25 +03:00
Aliaksandr Valialkin
6d8c23fdbd
app/{vminsert,vmselect}: skip accountID and projectID when marshaling/unmarshaling MetricName in /api/v1/export/native and /api/v1/import/native
...
This is needed in order to be able to migrate native data from/to single-node VictoriaMetrics
2020-09-28 00:58:58 +03:00
Aliaksandr Valialkin
aadbd014ff
all: add native format for data export and import
...
The data can be exported via [/api/v1/export/native](https://victoriametrics.github.io/#how-to-export-data-in-native-format ) handler
and imported via [/api/v1/import/native](https://victoriametrics.github.io/#how-to-import-data-in-native-format ) handler.
2020-09-27 17:36:38 +03:00
Aliaksandr Valialkin
533bf76a12
lib/storage: correctly use maxBlockSize in various checks
...
Previously `maxBlockSize` has been multiplied by 8 in certain checks. This is unnecessary.
2020-09-24 18:13:15 +03:00
Aliaksandr Valialkin
31e341371b
lib/storage: code prettifying after be5e1222f3
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/781
2020-09-22 00:42:20 +03:00
faceair
ad41e39350
add filter to getMetricIDs ( #783 )
...
* add getMetricIDs filter
* check nil filter before use
2020-09-22 00:42:19 +03:00
Aliaksandr Valialkin
a9321f6a60
lib/storage: reduce CPU load for idle VictoriaMetrics by reducing the frequency for the need for background merges
2020-09-21 15:51:26 +03:00
Aliaksandr Valialkin
778ea183ca
lib/decimal: properly store Inf values
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/752
2020-09-18 19:08:53 +03:00
Aliaksandr Valialkin
d96858b921
lib/storage: add /internal/force_merge
handler for running forced compactions on historical per-month partitions
...
This may be useful for freeing up storage space after time series deletion.
See https://victoriametrics.github.io/#force-merge for more details.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/686
2020-09-17 12:20:56 +03:00
Aliaksandr Valialkin
3abbb38254
lib/{mergeset,storage}: compare errors with errors.Is()
2020-09-17 03:03:10 +03:00
Aliaksandr Valialkin
ddb3519e17
lib/{mergeset,storage}: code prettifying
2020-09-17 02:06:37 +03:00
Aliaksandr Valialkin
bf826dd828
lib/storage: removed duplicate checks for empty parts during merge - another check is in the beginning of mergeParts functions
2020-09-17 01:49:08 +03:00
Aliaksandr Valialkin
81c05f669b
lib/storage: do not store inf values, since they may lead to significant precision loss for previously stored values
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/752
2020-09-11 14:45:20 +03:00
Aliaksandr Valialkin
f307e6f432
app/vmselect: initial implementation of Graphite Metrics API
...
See https://graphite-api.readthedocs.io/en/latest/api.html#the-metrics-api
2020-09-11 00:30:20 +03:00
Aliaksandr Valialkin
f5cb213ef9
lib/storage: reuse timestamp blocks for adjancent metric blocks with identical timestamps
...
This should reduce disk space usage when scraping targets containing metrics with identical names
such as `node_cpu_seconds_total`, histograms, quantiles, etc.
Expose `vm_timestamps_blocks_merged_total` and `vm_timestamps_bytes_saved_total` metrics for monitoring
the effectiveness of timestamp blocks merging.
2020-09-09 23:59:21 +03:00
Aliaksandr Valialkin
4ce1368e4b
lib/storage: mention time range used in the query that led to error message
...
This should improve detecting slow queries with too big time ranges
2020-08-10 13:46:29 +03:00
Aliaksandr Valialkin
f92255e803
lib/storage: mention tag filters used in the query that led to error message
...
This should improve detecting invalid or heavy queries that lead to errors.
2020-08-10 13:36:54 +03:00
Aliaksandr Valialkin
b3d4ff7ee2
app/vmstorage: improve error logging when the request times out
2020-08-10 13:17:24 +03:00
Aliaksandr Valialkin
307281e922
lib/storage: slow down concurrent searches when the number of concurrent inserts reaches the limit
...
This should improve data ingestion performance when heavy searches are executed
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/618
2020-08-07 08:49:13 +03:00
Aliaksandr Valialkin
dd1d59f57a
lib/storage: properly check timeouts and pace limits
...
Previously they were checked on every iteration for small number of iterations
2020-08-07 08:40:56 +03:00
Aliaksandr Valialkin
a2039b3bbc
app/vmselect: return the upper bound on the number of found time series from storage.Search.Init
...
This is used by a single-node version in order to reduce memory allocations during search.
See bc8381613d
for details.
2020-08-06 19:20:31 +03:00
Aliaksandr Valialkin
b690eeff53
lib/storage: reduce the frequency (and overhead) for timeout and pace limiter checks by 4x
2020-08-06 18:45:47 +03:00
Aliaksandr Valialkin
13f8644f8e
lib/storage: optimize prefetching metric names for the given metricIDs
2020-08-06 16:52:58 +03:00
Aliaksandr Valialkin
a3e91c593b
lib/storage: limit the number of concurrent calls to storage.searchTSIDs to GOMAXPROCS*2
...
This should limit the maximum memory usage and reduce CPU trashing on vmstorage
when multiple heavy queries are executed.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
2020-08-05 18:27:21 +03:00
Aliaksandr Valialkin
3149af624d
lib/storage: reduce the maximum number of concurrent merge workers to GOMAXPROCS/2
...
Previously the limit has been raised to GOMAXPROCS, but it has been appeared that this
increases query latencies since more CPUs are busy with merges.
While at it, substitute `*MergeConcurrencyLimitCh` channels with simple integer limits.
2020-07-31 17:53:13 +03:00
Aliaksandr Valialkin
29bbab0ec9
lib/storage: remove prioritizing of merging small parts over merging big parts, since it doesn't work as expected
...
The prioritizing could lead to big merge starvation, which could end up in too big number of parts that must be merged into big parts.
Multiple big merges may be initiated after the migration from v1.39.0 or v1.39.1. It is OK - these merges should be finished soon,
which should return CPU and disk IO usage to normal levels.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/618
2020-07-30 20:02:22 +03:00
Aliaksandr Valialkin
96039dcb40
lib/storage: properly update vm_slow_row_inserts_total
metric when importing multiple data points per time series at once
...
Previously the `vm_slow_row_inserts_total` metric may be incremented multiple times for different data points per a single time series,
while only a single increment is needed when inserting the first data point for this time series.
2020-07-30 16:17:19 +03:00
Sasasu
96bc476e53
lib/storage: metaindexRow use memroy more efficiently ( #655 )
...
due to memory align the metaindexRow structure use 64-byte pre object.
this commit changes the order of field, make metaindexRow use 56-byte pre
object.
Signed-off-by: Sasasu <su@sasasu.me>
2020-07-27 23:23:25 +03:00
Aliaksandr Valialkin
94cc677b0c
lib/storage: slightly reduce code difference between single-node and cluster versions
2020-07-24 01:18:05 +03:00
Aliaksandr Valialkin
fb3d1380ac
lib/storage: respect -search.maxQueryDuration
when searching for time series in inverted index
...
Previously the time spent on inverted index search could exceed the configured `-search.maxQueryDuration`.
This commit stops searching in inverted index on query timeout.
2020-07-23 21:22:05 +03:00
Aliaksandr Valialkin
dbf3038637
lib/storage: add more fine-grained pace limiting for search
2020-07-23 19:21:49 +03:00
Aliaksandr Valialkin
b8303afcd8
lib/storage: improve prioritizing of data ingestion over querying
...
Prioritize also small merges over big merges.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
2020-07-23 01:40:38 +03:00
Aliaksandr Valialkin
7d0743422b
lib/storage: properly calculate global metrics in UpdateStats()
2020-07-23 00:35:31 +03:00
Aliaksandr Valialkin
23fa44e56e
lib/storage: reorder mergeBlockStreams() args in order to make them more consistent
2020-07-22 21:58:25 +03:00
Aliaksandr Valialkin
754eac676d
lib/storage: prevent possible race condition when all the goroutines exit Storage.AddRows, before goroutines other goroutines are blocked on searchTSIDsCond inside Storage.searchTSIDs
...
This condition may occur after the following sequence of events:
1) A goroutine enters the loop body when len(addRowsConcurrencyCh) == cap(addRowsConcurrencyCh) inside Storage.searchTSIDs.
2) All the goroutines return from Storage.AddRows.
3) The goroutine from step 1 blocks on searchTSIDsCond.Wait() inside the loop body.
The goroutine remains blocked until the next call to Storage.AddRows, which calls searchTSIDsCond.Signal().
This may take indefinite time.
2020-07-22 21:52:42 +03:00
Aliaksandr Valialkin
67be79a0bc
lib/uint64set: optimize adding items to the set via Set.AddMulti
2020-07-21 20:57:05 +03:00
Aliaksandr Valialkin
be0ab4fbfe
lib/storage: reset MetricName->TSID
cache after marking metricIDs as deleted
...
This is a follow-up commit after 12b16077c4
,
which didn't reset the `tsidCache` in all the required places.
This could result in indefinite errors like:
missing metricName by metricID ...; this could be the case after unclean shutdown; deleting the metricID, so it could be re-created next time
Fix this by resetting the cache inside deleteMetricIDs function.
2020-07-14 14:05:19 +03:00
Aliaksandr Valialkin
7335743d57
lib/storage: limit the maximum concurrency for data ingestion to GOMAXPROCS
...
Previously the concurrency has been limited to GOMAXPROCS*2. This had little sense,
since every call to Storage.AddRows is bound to CPU, so the maximum ingestion bandwidth
is achieved when the number of concurrent calls to Storage.AddRows is limited to the number of CPUs,
i.e. to GOMAXPROCS.
2020-07-08 17:34:27 +03:00
Aliaksandr Valialkin
fad008df7e
lib/storage: clarify out of retention period
error message by mentioning -retentionPeriod
command-line flag
2020-07-08 13:54:13 +03:00
Aliaksandr Valialkin
fe58462bef
lib/storage: reset MetricName->TSID cache after deleting time series
...
This should prevent from adding new data points to deleted time series
without the need to check for the deleted time series.
This improves ingestion performance a bit when the `deleted time series ids` aka `dmis` set
contains big number of time series.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/596
Based on the idea from @n4mine at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/604
2020-07-06 22:01:24 +03:00
Aliaksandr Valialkin
0bff96fe4b
lib/storage: prioritize data ingestion over heavy queries
...
Heavy queries could result in the lack of CPU resources for processing the current data ingestion stream.
Prevent this by delaying queries' execution until free resources are available for data ingestion.
Expose `vm_search_delays_total` metric, which may be used in for alerting when there is no enough CPU resources
for data ingestion and/or for executing heavy queries.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
2020-07-05 19:44:04 +03:00
Aliaksandr Valialkin
8bb3622e9d
app/vminsert: prevent from adding and/or selecting labels with empty values
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/600
2020-07-02 23:17:12 +03:00
Aliaksandr Valialkin
4cb3e7595c
app/vmstorage: add -denyQueriesOutsideRetention
command-line flag for denying queries outside the configured retention
2020-07-01 00:58:42 +03:00
Aliaksandr Valialkin
d962568e93
all: use %w instead of %s for wrapping errors in fmt.Errorf
...
This will simplify examining the returned errors such as httpserver.ErrorWithStatusCode .
See https://blog.golang.org/go1.13-errors for details.
2020-06-30 23:33:46 +03:00
Aliaksandr Valialkin
70bf8218bb
app/vmselect/promql: properly override label values from group_left
and group_right
lists like Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/577
2020-06-21 16:32:27 +03:00
Tristan Su
c254b683fd
lib/storage: set big/small merge concurrency ( #568 )
...
fixed #567
Co-authored-by: Tristan Su <suqing.sq@alibaba-inc.com>
2020-06-19 02:21:55 +03:00
Aliaksandr Valialkin
4f673a5201
app/vminsert: export metrics for determining ingested rows with dropped or truncated labels
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/565
2020-06-19 01:12:44 +03:00
Aliaksandr Valialkin
5f3a895c23
lib/storage: add key!=".+"
filter additionally to negative filter matching empty value such as key!~"|foo"
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/546
2020-06-18 20:05:45 +03:00
Aliaksandr Valialkin
c40f29f783
lib/storage: properly match {tag!="|foo"}
filters
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/546
2020-06-10 19:34:37 +03:00
Aliaksandr Valialkin
3d0a0b3785
lib/fs: optimize MustGetFreeSpace performance by caching the results for up to 2 seconds
2020-06-04 13:14:04 +03:00
Aliaksandr Valialkin
eca1afdc20
lib/storage: fix Graphite wildcard matching, which has been broken in v1.36.0
2020-05-28 11:58:47 +03:00
Aliaksandr Valialkin
b0131c79b6
lib/storage: improve search speed for time series matching Graphite whildcards such as foo.*.bar.baz
...
Add index for reverse Graphite-like metric names with dots. Use this index during search for filters
like `__name__=~"foo\\.[^.]*\\.bar\\.baz"` which end with non-empty suffix with dots, i.e. `.bar.baz` in this case.
This change may "hide" historical time series during queries. The workaround is to add `[.]*` to the end of regexp label filter,
i.e. "foo\\.[^.]*\\.bar\\.baz" should be substituted with "foo\\.[^.]*\\.bar\\.baz[.]*".
2020-05-27 21:48:08 +03:00
Aliaksandr Valialkin
2a8f1e6931
lib/storage: do not increment vm_slow_metric_name_loads_total
counter for metric_ids which shouldnt be prefetched, since this may mislead users
2020-05-16 10:23:39 +03:00
Aliaksandr Valialkin
1e5c1d7eaa
app/vmstorage: add vm_slow_metric_name_loads_total
metric, which could be used as an indicator when more RAM is needed for improving query performance
2020-05-15 14:12:24 +03:00
Aliaksandr Valialkin
d6b9a49481
app/vmstorage: add vm_slow_row_inserts_total
and vm_slow_per_day_index_inserts_total
metrics for determining whether VictoriaMetrics required more RAM for the current number of active time series
2020-05-15 13:46:57 +03:00
Aliaksandr Valialkin
a72f18e821
lib/{storage,mergeset}: further tuning of compression levels depending on block size
...
This should improve performance for querying newly added data, since it can be unpacked faster.
2020-05-15 13:12:28 +03:00
Aliaksandr Valialkin
2cf2e9955b
lib/storage: wait for all the goroutines to finish in TestSearch in order to prevent racy behavior on test finish
2020-05-15 12:12:20 +03:00
Aliaksandr Valialkin
67e331ac62
lib/storage: optimize ingestion pefrormance for new time series
2020-05-15 12:12:19 +03:00
Aliaksandr Valialkin
1b5d272e07
lib/storage: reduce indentation in Storage.add
2020-05-14 23:23:56 +03:00
Aliaksandr Valialkin
71d29a8fa1
lib/storage: return the first error instead of the last error, since the first error usually points to the root cause
2020-05-14 23:18:59 +03:00
Aliaksandr Valialkin
3845420a8f
lib: extract common code for returning fast unix timestamp into lib/fasttime
2020-05-14 23:06:50 +03:00
Aliaksandr Valialkin
7e831741f9
lib/{storage,mergeset}: return dst on error from unmarshalBlockHeaders, so it could be reused
2020-05-14 15:32:23 +03:00
Aliaksandr Valialkin
2f42b85e0e
lib/storage: document that getnerateUniqueMetricID should return dense ids
2020-05-14 14:08:59 +03:00
Aliaksandr Valialkin
f442d81648
lib/{storage,mergeset}: cleanup: remove unused partSearch.indexBlockReuse
2020-05-14 14:03:15 +03:00
Aliaksandr Valialkin
8bb44a5d09
lib/storage: optimize label matching for regexp ending with literal suffix
...
For example, `{label=~"foo.*bar.+baz"}` contains literal suffix `baz`,
so it should work faster now.
2020-05-13 11:39:05 +03:00
Aliaksandr Valialkin
bd5f4e0344
lib/storage: properly initialize part struct before trying to close it on error
...
This should prevent from nil pointer dereference bug at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/468 .
2020-05-12 14:54:16 +03:00
Aliaksandr Valialkin
f7753b1469
lib/storage: gradually pre-populate per-day inverted index for the next day
...
This should prevent from CPU usage spikes at 00:00 UTC every day when
inverted index for new day must be quickly created for all the active time series.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/430
2020-05-12 12:13:32 +03:00
Aliaksandr Valialkin
8c77cb436a
lib/storage: typo fixes in error messages: or -> of
2020-05-12 12:12:33 +03:00
Aliaksandr Valialkin
bbf06a4248
lib/storage: speed up matching for common regexps in label filters
...
The following regexps have been optimized:
* 'foo.+bar'
* 'foo.+bar.+baz'
This should improve performance for matching Graphite-like metrics.
2020-05-11 22:49:01 +03:00
Aliaksandr Valialkin
37254a139a
lib/storage: add a benchmark for Graphite-like regexps for metric names
2020-05-11 22:49:00 +03:00
Aliaksandr Valialkin
d78ed50edd
lib/storage: recover when metricID->metricName entry is missing in the inverted index after unclean shutdown
...
Newly added index entries can be missing after unclean shutdown, since they didn't flush to persistent storage yet.
Log about this and delete the corresponding metricID, so it could be re-created next time.
2020-04-28 12:01:32 +03:00
Aliaksandr Valialkin
e933cbac16
lib/storage: postpone reading data from blocks during search
...
This eliminates the need for storing block data into temporary files on a single-node VictoriaMetrics
during heavy queries, which touch big number of time series over long time ranges.
This improves single-node VM performance on heavy queries by up to 2x.
2020-04-27 08:44:01 +03:00
Aliaksandr Valialkin
b16e19c053
lib/storage/dedup.go: go fmt
2020-04-26 14:37:36 +03:00
Aliaksandr Valialkin
a0000c3a6e
lib/storage: improve deduplication algorithm
...
Now it leaves only the first data point on each `-dedup.minScrapeInterval` interval.
Previously it may leave two data points on the interval. This could lead to unexpected results
for `histogram_quantile(phi, sum(rate(buckets)) by (le))` query.
2020-04-26 13:10:18 +03:00
Aliaksandr Valialkin
13b4069c59
lib/storage: postpone label filters matching too many time series instead of giving up with error
...
This should reduce the frequency of the following errors:
cannot find tag filter matching less than N time series; either increase -search.maxUniqueTimeseries or use more specific tag filters
more than N time series found on the time range [...]; either increase -search.maxUniqueTimeseries or shrink the time range
2020-04-24 21:18:52 +03:00
Aliaksandr Valialkin
f9526809e5
app/vmselect: add /api/v1/status/tsdb
page with useful stats for locating root cause for high cardinality issues
...
See https://prometheus.io/docs/prometheus/latest/querying/api/#tsdb-stats
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/425
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/268
2020-04-22 22:03:23 +03:00
Aliaksandr Valialkin
e9d9638627
lib/storage: skip metricID if the corresponding metricID->metricName is missing in inverted index during search
...
This case is possible when the corresponding metricID->metricName entry didn't propagate to inverted index yet.
This should fix the following error:
error when searching tsids for tfss [...]: cannot find metricName by metricID 1582417212213420669: EOF
2020-04-15 00:10:11 +03:00
Aliaksandr Valialkin
e0c6da8e2a
lib/storage: disable deduplication after dedup tests are complete
...
The rest of tests expect that the de-duplication is disabled.
2020-04-10 17:33:38 +03:00
Aliaksandr Valialkin
8ed0d5471a
lib/storage: correctly handle -dedup.minScrapeInterval
values smaller than 8ms
...
Such small values may be used for removing samples with duplicate timestamps.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/409 for details.
2020-04-10 16:40:41 +03:00
Aliaksandr Valialkin
0b2f678d8e
lib/{storage,mergeset}: make sure that requests
and misses
cache counters never go down
2020-04-10 14:44:52 +03:00
Aliaksandr Valialkin
0ad7aaf535
lib/storage: add missing reset for tagFilter.matchesEmptyValue on tagFilter.Init
2020-04-01 17:40:27 +03:00
Aliaksandr Valialkin
4c56acbafa
lib/storage: remove duplicate data points on 7/8*minScrapeInterval interval instead of 1/2*minScrapeInterval
...
This should reduce storage usage and should improve deduplication accuracy
2020-04-01 15:47:04 +03:00
Aliaksandr Valialkin
504ea876f2
lib/storage: handle errors returned from TagFilters.Add
when cloning TagFilters with negative filter
2020-03-31 16:18:34 +03:00
Aliaksandr Valialkin
ef714e01c1
lib/storage: add fast path for the previous indexdb search if it doesn't contain per-day inverted index yet
2020-03-31 12:35:15 +03:00
Aliaksandr Valialkin
7e755b4bac
lib/storage: optimize per-day inverted index search for tag filters matching big number of time series
...
- Sort tag filters in the ascending number of matching time series
in order to apply the most specific filters first.
- Fall back to metricName search for filters matching big number of time series
(usually this are negative filters or regexp filters).
2020-03-31 00:53:29 +03:00
Aliaksandr Valialkin
d450249955
lib/storage: properly handle {label=~"foo|"}
filters as Prometheus does
...
Such filters must match all the time series with `label="foo"` plus all the time series without `label`
Previously only time series with `label="foo"` were matched.
2020-03-30 20:21:47 +03:00
Aliaksandr Valialkin
7cdac6634c
lib/storage: serialize snapshot creation process with mutex
...
This guarantees that the snapshot contains all the recently added data
from inmemory buffers when multiple concurrent calls to Storage.CreateSnapshot are performed.
2020-03-24 22:27:28 +02:00
Aliaksandr Valialkin
31a533656e
lib/storage: remove obsolete code
2020-03-13 22:42:42 +02:00
Aliaksandr Valialkin
cf9aee4ec3
all: properly split vm_deduplicated_samples_total
among cluster components
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/345
2020-02-27 23:47:51 +02:00
Aliaksandr Valialkin
110cce24d9
lib/storage: add vm_ prefix to deduplicated_samples_total
metric
2020-02-21 19:33:36 +02:00
Aliaksandr Valialkin
a2b81b71b9
lib/storage: typo fix
2020-02-16 15:53:48 +02:00
Aliaksandr Valialkin
ad4cb9f3ca
lib/storage: prevent from clobbering nin-nil lastError in Storage.add
2020-02-16 15:51:35 +02:00
Aliaksandr Valialkin
347aaba79d
lib/{storage,mergeset}: use time.Ticker instead of time.Timer where appropriate
...
It has been appeared that time.Timer was used in places where time.Ticker must be used instead.
This could result in blocked goroutines as in the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/316 .
2020-02-13 13:21:48 +02:00
Aliaksandr Valialkin
ea66212c93
lib/storage: move -dedup.minScrapeInterval
flag outside lib/storage, so it doesnt show up in vminsert
in cluster version
2020-02-10 13:07:25 +02:00
Aliaksandr Valialkin
56d6b8ed0a
lib/storage: do not deduplicate blocks with less than 32 samples during merge
...
This should improve deduplication accuracy for blocks with higher number of samples.
2020-02-04 18:41:37 +02:00
Aliaksandr Valialkin
7cde594696
all: do not clash flag description with back-quoted flag types
...
See https://golang.org/pkg/flag/#PrintDefaults for more details.
2020-02-04 15:56:01 +02:00
Aliaksandr Valialkin
e3adc095bd
all: add -dedup.minScrapeInterval
command-line flag for data de-duplication
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/86
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/278
2020-01-31 01:18:54 +02:00
Aliaksandr Valialkin
a45f25699c
lib/storage: re-use indexSearch inside Storage.prefetchMetricNames
2020-01-31 01:18:53 +02:00
Aliaksandr Valialkin
da19fffa08
all: rename ReadAt* to MustReadAt* in order to dont clash with io.ReaderAt
2020-01-30 15:16:16 +02:00
Aliaksandr Valialkin
1332ddc15e
lib/storage: pass missing AccountID and ProjectID to searchMetricName
2020-01-30 15:16:16 +02:00
Aliaksandr Valialkin
4ed5e9a7ce
lib/storage: pre-fetch metricNames for the found metricIDs in Search.Init
...
This should speed up Search.NextMetricBlock loop for big number of found time series.
2020-01-30 15:16:16 +02:00
Aliaksandr Valialkin
ea53a21b02
all: consistently log durations in seconds with millisecond precision
...
This should improve logs readability
2020-01-22 18:35:24 +02:00
Aliaksandr Valialkin
62b041e90a
lib/{mergeset,storage}: properly update lastAccessTime
in index and data block cache entries
2020-01-20 15:00:10 +02:00
Aliaksandr Valialkin
a851c75703
lib/storage: skip recovering timestamps order for lossless compression (PrecisionBits=64)
2020-01-17 23:59:19 +02:00
Aliaksandr Valialkin
476c7fb109
lib/storage: reduce memory allocations when merging metricID sets
2020-01-17 22:10:56 +02:00
Aliaksandr Valialkin
7d429e2806
lib/uint64set: reduce memory usage in Union, Intersect and Subtract methods
...
Iterate items with newly added Set.ForEach method instead of allocating `[]uint64`
slice for all the items before the iteration.
2020-01-15 12:15:48 +02:00
Aliaksandr Valialkin
caffb0cd01
lib/{mergeset,storage}: fix uint64 counters alignment for 32-bit architectures (GOARCH=386, GOARCH=arm)
2020-01-14 22:47:42 +02:00
Aliaksandr Valialkin
b03ccbf6f7
lib/{storage,mergeset}: gradually remove stale entries from block cache and index caches
...
This should reduce memory usage in the long run when old blocks and indexes
aren't accessed anymore.
2020-01-14 21:38:29 +02:00
Aliaksandr Valialkin
53e176ed67
lib/storage: limit maxRaRowsPerPartition by 500K for any number of rawRowsShardsPerPartition
...
This should reduce write amplification for high ingestion rate on multi-CPU systems
2020-01-04 23:58:23 +02:00
Aliaksandr Valialkin
a37a006f11
lib/storage: scale ingestion performance by sharding rawRows on systems with more than 8 CPU cores
2019-12-19 18:17:05 +02:00
Aliaksandr Valialkin
8d79412b26
lib/storage: optimize bulk import performance when multiple data points are inserted for the same time series
...
This should speed up `/api/v1/import` and make it more scalable on multi-core systems.
2019-12-19 15:13:36 +02:00
Aliaksandr Valialkin
3694efd005
lib/{mergeset,storage}: log info message when both source and destination part paths from txn are missing during startup
...
This is expected condition after unclean shutdown (OOM, hard reset, `kill -9`) on NFS disk.
2019-12-09 15:45:23 +02:00
Aliaksandr Valialkin
639967db59
lib/{mergeset,storage}: make sure pending transaction deletions are finished before and after runTransactions
call.
...
`runTransactions` call issues async deletions for transaction files. The previously issued transaction deletions
can race with the next call to `runTransactions`. Prevent this by waiting until all the pending transaction
deletions are funished in the beginning of `runTransactions`. Also make sure that all the pending transaction
deletions are finished before returning from `runTransactions`.
2019-12-04 21:40:52 +02:00
Aliaksandr Valialkin
534da0a8c3
lib/storage: fall back to global inverted index if a filter match too many time series in per-day index
...
Previously this resulted to error message. The query may succeed via search in global index.
2019-12-03 14:48:08 +02:00
Aliaksandr Valialkin
6eb698d1cc
lib/storage: fix printing tag filters in TagFilters.String
2019-12-03 14:25:20 +02:00
Aliaksandr Valialkin
c04f60db35
lib/storage: print __name__
instead of empty string in user-visible tag filters
2019-12-03 14:18:18 +02:00
Aliaksandr Valialkin
625f6ca761
lib/storage: optimize regexp filter search
2019-12-03 00:33:53 +02:00
Aliaksandr Valialkin
b9616c017f
lib/{mergeset,storage}: remove transaction files only after the mentioned dirs are really removed
...
This should fix the issue on NFS when incompletely removed dirs may be left
after unclean shutdown (OOM, kill -9, hard reset, etc.), while the corresponding transaction
files are already removed.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162
2019-12-02 21:34:37 +02:00
Aliaksandr Valialkin
4e22b521c2
lib/storage: remove metricID with missing metricID->metricName entry
...
The metricID->metricName entry can be missing in the indexdb after unclean shutdown
when only a part of entries for new time series is written into indexdb.
Recover from such a situation by removing the broken metricID. New metricID
will be automatically created for time series with the given metricName
when new data point will arive to it.
2019-12-02 20:52:13 +02:00
Aliaksandr Valialkin
5a62415bec
lib/storage: protect from time drift during indexdb rotation
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/248
2019-12-02 14:43:11 +02:00
Aliaksandr Valialkin
f055dbefda
lib/storage: generate more human-friendly result in TagFilters.String
2019-12-02 13:56:40 +02:00
Aliaksandr Valialkin
0f184affa7
app/vmselect/promql: optimize binary search over big number of samples during rollup calculations
2019-11-25 14:01:54 +02:00
Aliaksandr Valialkin
b9e53490b9
lib/storage: move non-matching tag filters to the top at matchTagFilters
...
This should reduce the amount of useless work needed for matching the next metricNames.
2019-11-21 21:40:36 +02:00
Aliaksandr Valialkin
33d9d63393
lib/storage: speed up time series search for queries with multiple filters
...
Use optimized specialized binary search for uint64 metricIDs instead of generic sort.Search.
2019-11-21 18:43:40 +02:00
Aliaksandr Valialkin
a02a57fbe9
lib/storage: verify the number of returned metricIDs in BenchmarkHeadPostingForMatchers
2019-11-20 15:40:03 +02:00
Aliaksandr Valialkin
6ca4b94511
lib/storage: increase the number of created time series in BenchmarkHeadPostingForMatchers in order to be on par with Promethues
...
The previous commit was accidentally creating 10x smaller number of time series than Prometheus
and this led to invalid benchmark results.
The updated benchmark results:
benchmark old ns/op new ns/op delta
BenchmarkHeadPostingForMatchers/n="1" 272756688 6194893 -97.73%
BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 10781372 -92.19%
BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 10632834 -92.11%
BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 10679975 -94.55%
BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 100118510 -98.74%
BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 154955671 -97.96%
BenchmarkHeadPostingForMatchers/i=~"" 1142371741 258003769 -77.42%
BenchmarkHeadPostingForMatchers/i!="" 9964150263 159783895 -98.40%
BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 10937895 -94.96%
BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 10990027 -94.57%
BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 87004349 -82.11%
BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 53342793 -84.79%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 54256156 -85.76%
BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 21823279 -75.62%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 46671359 -87.70%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 53915842 -87.30%
VictoriaMetrics uses 1GB of RAM during the benchmark (vs 3.5GB of RAM for Prometheus)
2019-11-18 19:48:27 +02:00
Aliaksandr Valialkin
6f61fd367a
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus
...
See the corresponding benchmark in Prometheus - 23c0299d85/tsdb/head_bench_test.go (L52)
The benchmark allows performing apples-to-apples comparison of time series search
in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness -
contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix it.
Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots:
- Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers
- VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers
Benchmark results:
benchmark old ns/op new ns/op delta
BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87%
BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14%
BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15%
BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41%
BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89%
BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84%
BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59%
BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88%
BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46%
BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36%
BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62%
BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32%
BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41%
The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics.
Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 18:47:02 +02:00
Aliaksandr Valialkin
d297b65089
lib/storage: add vm_cache_size_bytes{type="storage/hour_metric_ids"}
metric
2019-11-13 20:26:05 +02:00
Aliaksandr Valialkin
494ad0fdb3
lib/storage: remove inmemory index for recent hour, since it uses too much memory
...
Production workload shows that the index requires ~4Kb of RAM per active time series.
This is too much for high number of active time series, so let's delete this index.
Now the queries should fall back to the index for the current day instead of the index
for the recent hour. The query performance for the current day index should be good enough
given the 100M rows/sec scan speed per CPU core.
2019-11-13 18:08:58 +02:00
Aliaksandr Valialkin
633dd81bb5
lib/storage: add -disableRecentHourIndex
flag for disabling inmemory index for recent hour
...
This may be useful for saving RAM on high number of time series aka high cardinality
2019-11-13 15:10:12 +02:00
Aliaksandr Valialkin
f1620ba7c0
lib/storage: fix inmemory inverted index issues found in v1.29
...
Issues fixed:
- Slow startup times. Now the index is loaded from cache during start.
- High memory usage related to superflouos index copies every 10 seconds.
2019-11-13 13:35:38 +02:00
Mike Poindexter
955a592106
Add test for invalid caching of tsids ( #232 )
...
* Add test for invalid caching of tsids
* Clean up error handling
2019-11-12 15:52:46 +02:00
Oleg Kovalov
74ba42d111
fix misspelled words ( #229 )
2019-11-12 00:18:24 +02:00
Aliaksandr Valialkin
c48e39eea9
lib/storage: add tests for dateMetricIDCache
2019-11-11 13:21:05 +02:00
Aliaksandr Valialkin
6bdde0d6d4
lib/storage: eliminate data race when updating lastSyncTime in dateMetricIDCache.Has
2019-11-10 22:04:23 +02:00
Aliaksandr Valialkin
9ea2bd822e
lib/storage: implement per-day inverted index
2019-11-10 00:20:32 +02:00
Aliaksandr Valialkin
dea2f3efed
lib/storage: use specialized cache for (date, metricID) entries
...
This improves ingestion performance.
2019-11-09 23:09:18 +02:00
Aliaksandr Valialkin
9a43902bd8
lib/storage: remove unused code from getMetricIDsForTimeRange: it is expected that time range is always non-zero
2019-11-09 19:03:51 +02:00
Aliaksandr Valialkin
c16e17dede
lib/storage: properly set time range when deleting time series
2019-11-09 18:50:02 +02:00
Aliaksandr Valialkin
8126007c15
lib/storage: obtain all the time series ids from (tag->metricIDs) rows instead of (metricID->TSID) rows, since this much faster
2019-11-09 18:04:26 +02:00
Aliaksandr Valialkin
50773348d3
lib/storage: small code prettifying
2019-11-09 14:01:24 +02:00
Aliaksandr Valialkin
0bc54c23ce
lib/storage: inmemoryInvertedIndex prettifying
2019-11-09 14:01:24 +02:00
Aliaksandr Valialkin
46e67bb78c
lib/storage: export vm_new_timeseries_created_total
metric for determining time series churn rate
2019-11-08 19:58:21 +02:00
Aliaksandr Valialkin
0063c857f5
lib/storage: add inmemory inverted index for the last hour
...
It should improve performance for `last N hours` dashboards with update intervals smaller than 1 hour.
2019-11-08 19:37:46 +02:00
Aliaksandr Valialkin
89c03a5464
lib/storage: populate partition names from both small
and big
directories
...
Certain partition directories may be missing after restoring from backups
if they had no data. Re-create such directories on start.
2019-11-06 19:50:21 +02:00
Aliaksandr Valialkin
1c777e0245
lib/storage: substitute error message about unsorted items in the index block after metricIDs merge with counter
...
The origin of the error has been detected and documented in the code,
so it is enough to export a counter for such errors at `vm_index_blocks_with_metric_ids_incorrect_order_total`,
so it could be monitored and alerted on high error rates.
Export also the counter for processed index blocks with metricIDs - `vm_index_blocks_with_metric_ids_processed_total`,
so its' rate could be compared to `rate(vm_index_blocks_with_metric_ids_incorrect_order_total)`.
2019-11-06 14:32:41 +02:00
Aliaksandr Valialkin
c567a4353a
lib/storage: take into account the requested time range when caching TSIDs for the given tag filters
2019-11-06 14:32:41 +02:00
Aliaksandr Valialkin
c6564c5d26
lib/storage: dump incorrectly sorted items on a single line; this should simplify error reporting
2019-11-05 18:41:50 +02:00
Aliaksandr Valialkin
a10c4cad85
lib/storage: return back finalPartsToMerge from 2 to 3 in order to prevent from excessive merges in old partitions
2019-11-05 17:28:57 +02:00
Aliaksandr Valialkin
e5b1fa0c38
lib/storage: separate the max inverted index scan loops per metric into fast and slow loops
...
Slow loops could require seeks and expensive regexp matching, while fast loops just scans
all the metricIDs for the given `tag=value` prefix. So these operations must have separate
max loops multiplier.
2019-11-05 17:28:57 +02:00
Aliaksandr Valialkin
f93c4f2493
lib/storage: skip repeated useless work when intersection of metricIDs with the given filter is too expensive
...
This should improve performance for query filters over big number of time series.
2019-11-05 14:35:55 +02:00
Aliaksandr Valialkin
f48e97263c
lib/storage: reduce the maximum inverted index scans before giving up to label filters matching by metric name
...
The new value reduces the amount of wasted work during index scans over big number of time series.
2019-11-05 14:35:53 +02:00
Aliaksandr Valialkin
d2f688c550
lib/storage: try potentially faster tag filters at first, then apply slower tag filters
...
The fastest tag filters are non-negative non-regexp, since they are the most specific.
The slowest tag filters are negative regexp, since they require scanning
all the entries for the given label.
2019-11-05 14:35:48 +02:00
Aliaksandr Valialkin
2a38d30f93
lib/storage: pass pointer to MetricName in Fatalf, so it is properly detected as an interface with String() method
...
This fixes lint errors
2019-11-04 01:06:45 +02:00
Artem Navoiev
e05500cbd4
add unittests for bytesutil and storage ( #221 )
2019-11-04 00:57:24 +02:00
Aliaksandr Valialkin
f5fbc3ffd7
lib/{storage,uint64set}: add Set.Union() function and use it
2019-11-04 00:48:32 +02:00
Aliaksandr Valialkin
23e078261e
lib/storage: tune the returned value from adjustMaxMetricsAdaptive
2019-11-04 00:45:28 +02:00
Aliaksandr Valialkin
386c349c8c
lib/storage: remove interface conversion in hot path during block merging
...
This should improve merge speed a bit for parts with big number of small blocks.
2019-11-03 12:33:48 +02:00
Aliaksandr Valialkin
26ffc77622
lib/{storage,mergeset}: create missing partition directories after restoring from backups
...
Backup tools could skip empty directories. So re-create such directories on the first run.
2019-11-02 02:27:19 +02:00
Aliaksandr Valialkin
6ab9c98a1e
app/vmstorage: add -bigMergeConcurrency
and -smallMergeConcurrency
flags for tuning the maximum number of CPU cores used during merges
2019-10-31 16:17:29 +02:00
Aliaksandr Valialkin
6a22727676
lib/storage: optimize getMetricIDsForRecentHours for per-tenant lookups
2019-10-31 15:51:09 +02:00