Aliaksandr Valialkin
4ff647137a
lib/storage: deduplicate samples more thoroughly
...
Previously some duplicate samples could be left on disk for time series with a high churn rate.
This could result in higher disk space usage.
2021-12-15 15:59:58 +02:00
Aliaksandr Valialkin
92070cbb67
lib/storage: return dedup interval in milliseconds from GetDedupInterval()
...
This removes duplicate .Milliseconds() calls after GetDedupInterval() calls.
2021-12-15 13:26:38 +02:00
Aliaksandr Valialkin
1d20a19c7d
lib/storage: explicitly pass dedupInterval to DeduplicateSamples() and deduplicateSamplesDuringMerge()
...
This improves the code readability and debuggability, since the output of these functions
stops depending on global state.
2021-12-14 20:49:12 +02:00
Aliaksandr Valialkin
8ca2799478
lib/storage: properly determine when the deduplication is needed in needsDedup
...
Previously needsDedup() could return true even though de-duplication wasn't needed in the following case:
    d < interval
      /       \
|    v    |    v    |
 interval  interval
Now it properly returns false for this case.
2021-07-12 10:53:30 +03:00
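The case described above can be illustrated with a minimal Go sketch. It is not the code from lib/storage: `needsDedupSketch` is a hypothetical helper, and it assumes sorted, non-negative millisecond timestamps and windows of the form [k*interval, (k+1)*interval). Under those assumptions, two samples that are closer than one interval but sit in adjacent windows do not require de-duplication.

```go
package main

import "fmt"

// needsDedupSketch reports whether any two adjacent timestamps fall into the
// same dedupInterval window. Hypothetical helper for illustration only.
// Assumptions: timestamps are in milliseconds, sorted ascending, non-negative;
// windows are [k*dedupInterval, (k+1)*dedupInterval).
func needsDedupSketch(timestamps []int64, dedupInterval int64) bool {
	if len(timestamps) < 2 || dedupInterval <= 0 {
		return false
	}
	prevWindow := timestamps[0] / dedupInterval
	for _, ts := range timestamps[1:] {
		window := ts / dedupInterval
		if window == prevWindow {
			// Two samples share a window, so one of them is a duplicate.
			return true
		}
		prevWindow = window
	}
	return false
}

func main() {
	// 900 and 1100 are only 200ms apart, but they sit in different 1000ms windows.
	fmt.Println(needsDedupSketch([]int64{900, 1100}, 1000))
	// 100 and 900 share the window [0, 1000).
	fmt.Println(needsDedupSketch([]int64{100, 900}, 1000))
}
```

Comparing window indexes rather than distances between samples is what separates the two cases in this sketch.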
kreedom
fb967ae6c8
happy fmt
2020-04-26 14:16:32 +03:00
Aliaksandr Valialkin
d7c1ff8b0c
lib/storage: improve deduplication algorithm
...
Now it leaves only the first data point on each `-dedup.minScrapeInterval` interval.
Previously it could leave two data points on the interval. This could lead to unexpected results
for `histogram_quantile(phi, sum(rate(buckets)) by (le))` query.
2020-04-26 13:10:02 +03:00
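The behaviour described in this commit (keep only the first data point on each interval) might look roughly like the Go sketch below. It is an illustration under stated assumptions, not the lib/storage implementation: `dedupKeepFirst` is a hypothetical name, timestamps are assumed to be sorted, non-negative milliseconds, and intervals are assumed to be aligned to multiples of the interval length.

```go
package main

import "fmt"

// dedupKeepFirst returns copies of timestamps and values that contain only
// the first sample of every dedupInterval-aligned window.
// Hypothetical helper; assumes sorted, non-negative millisecond timestamps.
func dedupKeepFirst(timestamps []int64, values []float64, dedupInterval int64) ([]int64, []float64) {
	if dedupInterval <= 0 || len(timestamps) == 0 {
		return timestamps, values
	}
	var dstT []int64
	var dstV []float64
	// Samples with a timestamp below nextBoundary belong to a window that
	// already has its first sample and are dropped.
	nextBoundary := timestamps[0]
	for i, ts := range timestamps {
		if ts < nextBoundary {
			continue
		}
		dstT = append(dstT, ts)
		dstV = append(dstV, values[i])
		// Move the boundary to the start of the window following ts.
		nextBoundary = ts - ts%dedupInterval + dedupInterval
	}
	return dstT, dstV
}

func main() {
	ts := []int64{0, 400, 1000, 1300, 1999, 2000}
	vs := []float64{1, 2, 3, 4, 5, 6}
	outT, outV := dedupKeepFirst(ts, vs, 1000)
	fmt.Println(outT, outV)
}
```

With a 1000ms interval, the sketch keeps one sample per window, so a query never sees two points inside the same interval.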
Aliaksandr Valialkin
ded0c0d3c7
lib/storage: correctly handle -dedup.minScrapeInterval values smaller than 8ms
...
Such small values may be used for removing samples with duplicate timestamps.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/409 for details.
2020-04-10 16:36:41 +03:00
Aliaksandr Valialkin
c4acd20d2a
lib/storage: remove duplicate data points on 7/8*minScrapeInterval interval instead of 1/2*minScrapeInterval
...
This should reduce storage usage and should improve deduplication accuracy
2020-04-01 15:48:48 +03:00
Aliaksandr Valialkin
18af31a4c2
all: properly split vm_deduplicated_samples_total among cluster components
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/345
2020-02-27 23:48:07 +02:00
Aliaksandr Valialkin
d21cb43e48
lib/storage: add vm_ prefix to deduplicated_samples_total metric to be consistent with other metrics
2020-02-21 19:33:59 +02:00
Aliaksandr Valialkin
e210cd9da1
lib/storage: move -dedup.minScrapeInterval flag outside lib/storage, so it doesn't show up in vminsert in cluster version
2020-02-10 13:09:51 +02:00
Aliaksandr Valialkin
bd4698bb7a
lib/storage: do not deduplicate blocks with less than 32 samples during merge
...
This should improve deduplication accuracy for blocks with higher number of samples.
2020-02-04 18:41:54 +02:00
Aliaksandr Valialkin
42864bb52f
all: do not clash flag description with back-quoted flag types
...
See https://golang.org/pkg/flag/#PrintDefaults for more details.
2020-02-04 15:46:52 +02:00
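For context on the clash mentioned in this commit: Go's `flag` package treats the first back-quoted word in a flag description as the name of the flag's value in help output. The sketch below shows that behaviour through `flag.UnquoteUsage`; the helper `usageName` and the example descriptions are hypothetical and are not taken from the VictoriaMetrics code.

```go
package main

import (
	"flag"
	"fmt"
)

// usageName registers a string flag with the given description on a
// throwaway FlagSet and returns the value name that the flag package
// would print for it in help output. Hypothetical helper.
func usageName(usage string) string {
	fs := flag.NewFlagSet("demo", flag.ContinueOnError)
	fs.String("example", "", usage)
	name, _ := flag.UnquoteUsage(fs.Lookup("example"))
	return name
}

func main() {
	// A back-quoted word replaces the default type name in the help text.
	fmt.Println(usageName("Path to `file` with data"))
	// Without back-quotes the name falls back to the flag's type.
	fmt.Println(usageName("Path to file with data"))
}
```

A description that back-quotes another flag name or an example value would therefore be printed in place of the type, which is the kind of clash the commit subject refers to.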
Aliaksandr Valialkin
c3d86eef96
all: add -dedup.minScrapeInterval command-line flag for data de-duplication
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/86
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/278
2020-01-31 01:16:57 +02:00