github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-01 14:47:38 +00:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	e0d0348f36	lib/storage: add missing reset for tagFilter.matchesEmptyValue on tagFilter.Init	2020-04-01 17:42:44 +03:00
Aliaksandr Valialkin	c4acd20d2a	lib/storage: remove duplicate data points on 7/8*minScrapeInterval interval instead of 1/2*minScrapeInterval. This should reduce storage usage and improve deduplication accuracy.	2020-04-01 15:48:48 +03:00
Aliaksandr Valialkin	b699c46046	lib/storage: handle errors returned from `TagFilters.Add` when cloning TagFilters with negative filter	2020-03-31 16:18:02 +03:00
Aliaksandr Valialkin	972713bd79	lib/storage: add fast path for the previous indexdb search if it doesn't contain per-day inverted index yet	2020-03-31 12:51:21 +03:00
Aliaksandr Valialkin	5d99ca6cfc	lib/storage: optimize per-day inverted index search for tag filters matching a big number of time series. Sort tag filters in ascending order of the number of matching time series in order to apply the most specific filters first. Fall back to metricName search for filters matching a big number of time series (usually these are negative filters or regexp filters).	2020-03-31 00:48:35 +03:00
Aliaksandr Valialkin	318326c309	lib/storage: properly handle `{label=~"foo|"}` filters as Prometheus does. Such filters must match all the time series with `label="foo"` plus all the time series without `label`. Previously only time series with `label="foo"` were matched. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/395	2020-03-31 00:48:18 +03:00
Aliaksandr Valialkin	f3e0c55ea1	lib/storage: serialize snapshot creation process with mutex. This guarantees that the snapshot contains all the recently added data from inmemory buffers when multiple concurrent calls to Storage.CreateSnapshot are performed.	2020-03-24 22:27:05 +02:00
Aliaksandr Valialkin	df91d2d91f	lib/storage: remove obsolete code	2020-03-13 22:48:17 +02:00
Aliaksandr Valialkin	18af31a4c2	all: properly split `vm_deduplicated_samples_total` among cluster components. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/345	2020-02-27 23:48:07 +02:00
Aliaksandr Valialkin	d21cb43e48	lib/storage: add vm_ prefix to `deduplicated_samples_total` metric to be consistent with other metrics	2020-02-21 19:33:59 +02:00
Aliaksandr Valialkin	ce15cecae4	lib/storage: typo fix	2020-02-16 15:53:44 +02:00
Aliaksandr Valialkin	32e153e834	lib/storage: prevent clobbering of non-nil lastError in Storage.add	2020-02-16 15:51:26 +02:00
Aliaksandr Valialkin	eceaf13e5e	lib/{storage,mergeset}: use time.Ticker instead of time.Timer where appropriate. It turned out that time.Timer was used in places where time.Ticker must be used instead. This could result in blocked goroutines as in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/316 .	2020-02-13 13:10:07 +02:00
Aliaksandr Valialkin	e210cd9da1	lib/storage: move `-dedup.minScrapeInterval` flag outside lib/storage, so it doesn't show up in `vminsert` in cluster version	2020-02-10 13:09:51 +02:00
Aliaksandr Valialkin	bd4698bb7a	lib/storage: do not deduplicate blocks with fewer than 32 samples during merge. This should improve deduplication accuracy for blocks with a higher number of samples.	2020-02-04 18:41:54 +02:00
Aliaksandr Valialkin	42864bb52f	all: do not clash flag description with back-quoted flag types. See https://golang.org/pkg/flag/#PrintDefaults for more details.	2020-02-04 15:46:52 +02:00
Aliaksandr Valialkin	c3d86eef96	all: add `-dedup.minScrapeInterval` command-line flag for data de-duplication. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/86 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/278	2020-01-31 01:16:57 +02:00
Aliaksandr Valialkin	2152f6f0cd	lib/storage: re-use indexSearch inside Storage.prefetchMetricNames	2020-01-31 01:16:53 +02:00
Aliaksandr Valialkin	ad8af629bb	all: rename ReadAt* to MustReadAt* in order not to clash with io.ReaderAt	2020-01-30 15:08:58 +02:00
Aliaksandr Valialkin	d68546aa4a	lib/storage: pre-fetch metricNames for the found metricIDs in Search.Init. This should speed up the Search.NextMetricBlock loop for a big number of found time series.	2020-01-30 15:08:51 +02:00
Aliaksandr Valialkin	680080887d	all: consistently log durations in seconds with millisecond precision. This should improve log readability.	2020-01-22 18:28:27 +02:00
Aliaksandr Valialkin	6665f10e7b	lib/{mergeset,storage}: properly update `lastAccessTime` in index and data block cache entries	2020-01-20 14:59:47 +02:00
Aliaksandr Valialkin	3748fb24b6	lib/storage: skip recovering timestamps order for lossless compression (PrecisionBits=64)	2020-01-18 00:09:33 +02:00
Aliaksandr Valialkin	f9289b804a	lib/storage: reduce memory allocations when merging metricID sets	2020-01-17 22:10:44 +02:00
Aliaksandr Valialkin	605d588ba6	lib/uint64set: reduce memory usage in Union, Intersect and Subtract methods. Iterate items with the newly added Set.ForEach method instead of allocating a `[]uint64` slice for all the items before the iteration.	2020-01-15 12:12:49 +02:00
Aliaksandr Valialkin	893b62c682	lib/{mergeset,storage}: fix uint64 counters alignment for 32-bit architectures (GOARCH=386, GOARCH=arm)	2020-01-14 22:47:04 +02:00
Aliaksandr Valialkin	7830c10eb2	lib/{storage,mergeset}: gradually remove stale entries from block cache and index caches. This should reduce memory usage in the long run when old blocks and indexes aren't accessed anymore.	2020-01-14 21:38:44 +02:00
Aliaksandr Valialkin	fc71602039	lib/storage: limit maxRawRowsPerPartition to 500K for any number of rawRowsShardsPerPartition. This should reduce write amplification for high ingestion rate on multi-CPU systems.	2020-01-04 23:57:31 +02:00
Aliaksandr Valialkin	1825893eef	lib/storage: scale ingestion performance by sharding rawRows on systems with more than 8 CPU cores	2019-12-19 18:18:29 +02:00
Aliaksandr Valialkin	97f70ccda7	lib/storage: optimize bulk import performance when multiple data points are inserted for the same time series. This should speed up `/api/v1/import` and make it more scalable on multi-core systems.	2019-12-19 18:18:29 +02:00
Aliaksandr Valialkin	0ed9258545	lib/{mergeset,storage}: log info message when both source and destination part paths from txn are missing during startup. This is an expected condition after unclean shutdown (OOM, hard reset, `kill -9`) on NFS disk.	2019-12-09 15:44:53 +02:00
Aliaksandr Valialkin	72345eb5bd	lib/{mergeset,storage}: make sure pending transaction deletions are finished before and after `runTransactions` call. `runTransactions` call issues async deletions for transaction files. The previously issued transaction deletions can race with the next call to `runTransactions`. Prevent this by waiting until all the pending transaction deletions are finished at the beginning of `runTransactions`. Also make sure that all the pending transaction deletions are finished before returning from `runTransactions`.	2019-12-04 21:40:30 +02:00
Aliaksandr Valialkin	a247236f61	lib/storage: fall back to global inverted index if a filter matches too many time series in per-day index. Previously this resulted in an error message. The query may succeed via search in the global index.	2019-12-03 14:48:31 +02:00
Aliaksandr Valialkin	54741ee578	lib/storage: fix printing tag filters in TagFilters.String	2019-12-03 14:25:13 +02:00
Aliaksandr Valialkin	efbc83a13e	lib/storage: print `__name__` instead of empty string in user-visible tag filters	2019-12-03 14:18:28 +02:00
Aliaksandr Valialkin	f52874dab4	lib/storage: optimize regexp filter search	2019-12-03 00:43:12 +02:00
Aliaksandr Valialkin	638a5cbb16	lib/{mergeset,storage}: remove transaction files only after the mentioned dirs are really removed. This should fix the issue on NFS when incompletely removed dirs may be left after unclean shutdown (OOM, kill -9, hard reset, etc.), while the corresponding transaction files are already removed. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162	2019-12-02 21:36:31 +02:00
Aliaksandr Valialkin	20812008a7	lib/storage: remove metricID with missing metricID->metricName entry. The metricID->metricName entry can be missing in the indexdb after unclean shutdown when only a part of entries for new time series is written into indexdb. Recover from such a situation by removing the broken metricID. A new metricID will be automatically created for the time series with the given metricName when a new data point arrives for it.	2019-12-02 20:46:44 +02:00
Aliaksandr Valialkin	62a915f2b2	lib/storage: protect from time drift during indexdb rotation. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/248	2019-12-02 14:44:42 +02:00
Aliaksandr Valialkin	70b8191fab	lib/storage: generate more human-friendly result in TagFilters.String	2019-12-02 13:52:22 +02:00
Aliaksandr Valialkin	da98703748	app/vmselect/promql: optimize binary search over big number of samples during rollup calculations	2019-11-25 14:01:46 +02:00
Aliaksandr Valialkin	7a4635f853	all: remove the remaining mentions of cluster version	2019-11-21 23:18:22 +02:00
Aliaksandr Valialkin	f652c0f40f	lib/storage: move non-matching tag filters to the top in matchTagFilters. This should reduce the amount of useless work needed for matching the next metricNames.	2019-11-21 21:35:13 +02:00
Aliaksandr Valialkin	b8cde6cce1	lib/storage: speed up time series search for queries with multiple filters. Use optimized specialized binary search for uint64 metricIDs instead of generic sort.Search.	2019-11-21 18:43:17 +02:00
Aliaksandr Valialkin	5c1e4143e9	lib/storage: verify the number of returned metricIDs in BenchmarkHeadPostingForMatchers	2019-11-20 15:39:28 +02:00
Aliaksandr Valialkin	b6f22a62cb	lib/storage: increase the number of created time series in BenchmarkHeadPostingForMatchers in order to be on par with Prometheus. The previous commit was accidentally creating a 10x smaller number of time series than Prometheus and this led to invalid benchmark results. The updated benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 6194893 -97.73% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 10781372 -92.19% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 10632834 -92.11% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 10679975 -94.55% BenchmarkHeadPostingForMatchers/i=~"." 7962582919 100118510 -98.74% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 154955671 -97.96% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 258003769 -77.42% BenchmarkHeadPostingForMatchers/i!="" 9964150263 159783895 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".",j="foo" 216995884 10937895 -94.96% BenchmarkHeadPostingForMatchers/n="1",i=~".",i!="2",j="foo" 202541348 10990027 -94.57% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 87004349 -82.11% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 53342793 -84.79% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 54256156 -85.76% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 21823279 -75.62% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 46671359 -87.70% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.",j="foo" 424563825 53915842 -87.30% VictoriaMetrics uses 1GB of RAM during the benchmark (vs 3.5GB of RAM for Prometheus).	2019-11-18 19:50:58 +02:00
Aliaksandr Valialkin	8a0dfc6220	lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus. See the corresponding benchmark in Prometheus - `23c0299d85/tsdb/head_bench_test.go (L52)`. The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since this benchmark did not exist yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~"." 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.	2019-11-18 18:45:06 +02:00
Aliaksandr Valialkin	2ab4cea5e5	lib/storage: always start using per-day inverted index on the next day after its creation. The current day could miss entries for already stopped time series before enabling per-day index. This fixes the issue when queries return empty results during the first hour after upgrading to v1.29.*	2019-11-16 12:11:25 +02:00
Aliaksandr Valialkin	119dfd01bb	lib/storage: add `vm_cache_size_bytes{type="storage/hour_metric_ids"}` metric	2019-11-13 20:24:21 +02:00
Aliaksandr Valialkin	86a1cd700b	lib/storage: remove inmemory index for recent hour, since it uses too much memory. Production workload shows that the index requires ~4KB of RAM per active time series. This is too much for a high number of active time series, so let's delete this index. Now the queries should fall back to the index for the current day instead of the index for the recent hour. The query performance for the current day index should be good enough given the 100M rows/sec scan speed per CPU core.	2019-11-13 17:58:07 +02:00


178 commits