github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-11-21 14:44:00 +00:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	9d8fdff6c5	lib/storage: reuse timestamp blocks for adjacent metric blocks with identical timestamps This should reduce disk space usage when scraping targets containing metrics with identical names such as `node_cpu_seconds_total`, histograms, quantiles, etc. Expose `vm_timestamps_blocks_merged_total` and `vm_timestamps_bytes_saved_total` metrics for monitoring the effectiveness of timestamp blocks merging.	2020-09-09 23:59:32 +03:00
Aliaksandr Valialkin	582c74cd93	lib/storage: mention tag filters used in the query that led to error message This should improve detecting invalid or heavy queries that lead to errors.	2020-08-10 13:36:49 +03:00
Aliaksandr Valialkin	f3d33e23c9	app/vmstorage: improve error logging when the request times out	2020-08-10 13:23:26 +03:00
Aliaksandr Valialkin	84fd8af6d3	lib/storage: slow down concurrent searches when the number of concurrent inserts reaches the limit This should improve data ingestion performance when heavy searches are executed See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648 See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/618	2020-08-07 08:49:40 +03:00
Aliaksandr Valialkin	9043a509a3	lib/storage: properly check timeouts and pace limits Previously they were checked on every iteration for small number of iterations	2020-08-07 08:40:37 +03:00
Aliaksandr Valialkin	ad730d8a17	lib/storage: optimize prefetching metric names for the given metricIDs	2020-08-06 16:53:10 +03:00
Aliaksandr Valialkin	8f16388428	lib/storage: limit the number of concurrent calls to storage.searchTSIDs to GOMAXPROCS*2 This should limit the maximum memory usage and reduce CPU thrashing on vmstorage when multiple heavy queries are executed. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648	2020-08-05 18:30:07 +03:00
Aliaksandr Valialkin	922d9aadf2	lib/storage: properly update `vm_slow_row_inserts_total` metric when importing multiple data points per time series at once Previously the `vm_slow_row_inserts_total` metric may be incremented multiple times for different data points per a single time series, while only a single increment is needed when inserting the first data point for this time series.	2020-07-30 16:17:24 +03:00
Aliaksandr Valialkin	039c9d2441	lib/storage: respect `-search.maxQueryDuration` when searching for time series in inverted index Previously the time spent on inverted index search could exceed the configured `-search.maxQueryDuration`. This commit stops searching in inverted index on query timeout.	2020-07-23 21:21:42 +03:00
Aliaksandr Valialkin	2a45871823	lib/storage: add more fine-grained pace limiting for search	2020-07-23 19:26:08 +03:00
Aliaksandr Valialkin	6f05c4d351	lib/storage: improve prioritizing of data ingestion over querying Prioritize also small merges over big merges. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648	2020-07-23 13:23:36 +03:00
Aliaksandr Valialkin	e4303d3d21	lib/storage: prevent possible race condition when all the goroutines exit Storage.AddRows before other goroutines are blocked on searchTSIDsCond inside Storage.searchTSIDs This condition may occur after the following sequence of events: 1) A goroutine enters the loop body when len(addRowsConcurrencyCh) == cap(addRowsConcurrencyCh) inside Storage.searchTSIDs. 2) All the goroutines return from Storage.AddRows. 3) The goroutine from step 1 blocks on searchTSIDsCond.Wait() inside the loop body. The goroutine remains blocked until the next call to Storage.AddRows, which calls searchTSIDsCond.Signal(). This may take indefinite time.	2020-07-22 21:52:34 +03:00
Aliaksandr Valialkin	d3442b40b2	lib/uint64set: optimize adding items to the set via Set.AddMulti	2020-07-21 20:56:59 +03:00
Aliaksandr Valialkin	e1107fec10	lib/storage: reset `MetricName->TSID` cache after marking metricIDs as deleted This is a follow-up commit after `12b16077c4`, which didn't reset the `tsidCache` in all the required places. This could result in indefinite errors like: missing metricName by metricID ...; this could be the case after unclean shutdown; deleting the metricID, so it could be re-created next time Fix this by resetting the cache inside deleteMetricIDs function.	2020-07-14 14:06:32 +03:00
Aliaksandr Valialkin	cb92113632	lib/storage: limit the maximum concurrency for data ingestion to GOMAXPROCS Previously the concurrency has been limited to GOMAXPROCS*2. This made little sense, since every call to Storage.AddRows is bound to CPU, so the maximum ingestion bandwidth is achieved when the number of concurrent calls to Storage.AddRows is limited to the number of CPUs, i.e. to GOMAXPROCS.	2020-07-08 17:32:18 +03:00
Aliaksandr Valialkin	32b9fb58b8	lib/storage: clarify `out of retention period` error message by mentioning `-retentionPeriod` command-line flag	2020-07-08 13:54:26 +03:00
Aliaksandr Valialkin	12b16077c4	lib/storage: reset MetricName->TSID cache after deleting time series This should prevent from adding new data points to deleted time series without the need to check for the deleted time series. This improves ingestion performance a bit when the `deleted time series ids` aka `dmis` set contains big number of time series. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/596 Based on the idea from @n4mine at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/604	2020-07-06 22:01:08 +03:00
Aliaksandr Valialkin	6daa5f7500	lib/storage: prioritize data ingestion over heavy queries Heavy queries could result in the lack of CPU resources for processing the current data ingestion stream. Prevent this by delaying queries' execution until free resources are available for data ingestion. Expose `vm_search_delays_total` metric, which may be used for alerting when there are not enough CPU resources for data ingestion and/or for executing heavy queries. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291	2020-07-05 19:42:05 +03:00
Aliaksandr Valialkin	d5dddb0953	all: use %w instead of %s for wrapping errors in `fmt.Errorf` This will simplify examining the returned errors such as httpserver.ErrorWithStatusCode. See https://blog.golang.org/go1.13-errors for details.	2020-06-30 23:05:11 +03:00
Aliaksandr Valialkin	b19ca3eb5f	lib/storage: do not increment `vm_slow_metric_name_loads_total` counter for metric_ids which shouldn't be prefetched, since this may mislead users	2020-05-16 10:21:17 +03:00
Aliaksandr Valialkin	82ffbcb9a6	app/vmstorage: add `vm_slow_metric_name_loads_total` metric, which could be used as an indicator when more RAM is needed for improving query performance	2020-05-15 14:11:45 +03:00
Aliaksandr Valialkin	82ccdfaa91	app/vmstorage: add `vm_slow_row_inserts_total` and `vm_slow_per_day_index_inserts_total` metrics for determining whether VictoriaMetrics requires more RAM for the current number of active time series	2020-05-15 13:44:32 +03:00
Aliaksandr Valialkin	4fc33163c4	lib/storage: optimize ingestion performance for new time series	2020-05-15 13:24:37 +03:00
Aliaksandr Valialkin	8b32e7c3a0	lib/storage: reduce indentation in Storage.add	2020-05-15 13:24:37 +03:00
Aliaksandr Valialkin	1573ececb2	lib/storage: return the first error instead of the last error, since the first error usually points to the root cause	2020-05-15 13:24:37 +03:00
Aliaksandr Valialkin	0afd48d2ee	lib: extract common code for returning fast unix timestamp into lib/fasttime	2020-05-14 23:02:07 +03:00
Aliaksandr Valialkin	dbd0c552d5	lib/storage: gradually pre-populate per-day inverted index for the next day This should prevent from CPU usage spikes at 00:00 UTC every day when inverted index for new day must be quickly created for all the active time series. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/430	2020-05-12 12:13:05 +03:00
Aliaksandr Valialkin	364db13c9c	app/vmselect: add `/api/v1/status/tsdb` page with useful stats for locating root cause for high cardinality issues See https://prometheus.io/docs/prometheus/latest/querying/api/#tsdb-stats Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/425 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/268	2020-04-22 22:03:43 +03:00
Aliaksandr Valialkin	f3e0c55ea1	lib/storage: serialize snapshot creation process with mutex This guarantees that the snapshot contains all the recently added data from inmemory buffers when multiple concurrent calls to Storage.CreateSnapshot are performed.	2020-03-24 22:27:05 +02:00
Aliaksandr Valialkin	18af31a4c2	all: properly split `vm_deduplicated_samples_total` among cluster components Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/345	2020-02-27 23:48:07 +02:00
Aliaksandr Valialkin	ce15cecae4	lib/storage: typo fix	2020-02-16 15:53:44 +02:00
Aliaksandr Valialkin	32e153e834	lib/storage: prevent from clobbering non-nil lastError in Storage.add	2020-02-16 15:51:26 +02:00
Aliaksandr Valialkin	eceaf13e5e	lib/{storage,mergeset}: use time.Ticker instead of time.Timer where appropriate It turned out that time.Timer was used in places where time.Ticker must be used instead. This could result in blocked goroutines as in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/316 .	2020-02-13 13:10:07 +02:00
Aliaksandr Valialkin	2152f6f0cd	lib/storage: re-use indexSearch inside Storage.prefetchMetricNames	2020-01-31 01:16:53 +02:00
Aliaksandr Valialkin	d68546aa4a	lib/storage: pre-fetch metricNames for the found metricIDs in Search.Init This should speed up Search.NextMetricBlock loop for big number of found time series.	2020-01-30 15:08:51 +02:00
Aliaksandr Valialkin	680080887d	all: consistently log durations in seconds with millisecond precision This should improve logs readability	2020-01-22 18:28:27 +02:00
Aliaksandr Valialkin	605d588ba6	lib/uint64set: reduce memory usage in Union, Intersect and Subtract methods Iterate items with newly added Set.ForEach method instead of allocating `[]uint64` slice for all the items before the iteration.	2020-01-15 12:12:49 +02:00
Aliaksandr Valialkin	97f70ccda7	lib/storage: optimize bulk import performance when multiple data points are inserted for the same time series This should speed up `/api/v1/import` and make it more scalable on multi-core systems.	2019-12-19 18:18:29 +02:00
Aliaksandr Valialkin	62a915f2b2	lib/storage: protect from time drift during indexdb rotation Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/248	2019-12-02 14:44:42 +02:00
Aliaksandr Valialkin	7a4635f853	all: remove the remaining mentions of cluster version	2019-11-21 23:18:22 +02:00
Aliaksandr Valialkin	119dfd01bb	lib/storage: add `vm_cache_size_bytes{type="storage/hour_metric_ids"}` metric	2019-11-13 20:24:21 +02:00
Aliaksandr Valialkin	86a1cd700b	lib/storage: remove inmemory index for recent hour, since it uses too much memory Production workload shows that the index requires ~4Kb of RAM per active time series. This is too much for high number of active time series, so let's delete this index. Now the queries should fall back to the index for the current day instead of the index for the recent hour. The query performance for the current day index should be good enough given the 100M rows/sec scan speed per CPU core.	2019-11-13 17:58:07 +02:00
Aliaksandr Valialkin	c57eb0ff83	lib/storage: add `-disableRecentHourIndex` flag for disabling inmemory index for recent hour This may be useful for saving RAM on high number of time series aka high cardinality	2019-11-13 15:02:51 +02:00
Aliaksandr Valialkin	ca259864e2	lib/storage: return back inmemory inverted index for recent hour Issues fixed: - Slow startup times. Now the index is loaded from cache during start. - High memory usage related to superfluous index copies every 10 seconds.	2019-11-13 13:11:04 +02:00
Aliaksandr Valialkin	01bb3c06c7	lib/storage: remove inmemory inverted index for recent hours Production load with >10M active time series showed it could slow down VictoriaMetrics startup times and could eat all the memory leading to OOM. Remove inmemory inverted index for recent hours until thorough testing on production data shows it works OK.	2019-11-13 10:45:53 +02:00
Oleg Kovalov	b4f44befa3	fix misspelled words (#229)	2019-11-12 00:16:42 +02:00
Aliaksandr Valialkin	8e8f98f712	lib/storage: add tests for dateMetricIDCache	2019-11-11 13:21:57 +02:00
Aliaksandr Valialkin	c342f5e37e	lib/storage: eliminate data race when updating lastSyncTime in dateMetricIDCache.Has	2019-11-10 22:04:01 +02:00
Aliaksandr Valialkin	ee7765b10d	lib/storage: implement per-day inverted index	2019-11-10 00:02:46 +02:00
Aliaksandr Valialkin	5810ba57c2	lib/storage: use specialized cache for (date, metricID) entries This improves ingestion performance.	2019-11-09 23:06:11 +02:00

77 commits