github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-01 14:47:38 +00:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	ed1b394a1a	app/vmstorage: expose `vm_indexdb_items_added_total` and `vm_indexdb_items_added_size_bytes_total` counters at `/metrics` page These counters can be used for monitoring the rate of addition of new entries in indexdb (aka inverted index). See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2471	2022-04-21 13:19:42 +03:00
Nikolay	4cf6219e07	lib/{storage,regexpcache}: replaces regexpCacheMap with LRU cache (#2293 ) * lib/{storage,regexpcache}: replaces regexpCacheMap with LRU cache It should decrease memory usage for regexp caching with storing cacheEntry by pointer - golang map should be able to effectivly shrink it's size original issue with this case - unexpected map grows and storage OOM Apply suggestions from code review Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> Adds missing metrics for regexp cache and regexpPrefixes cache * wip * wip Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-03-26 12:57:27 +02:00
Roman Khavronenko	bd7837d524	lib: allow to configure cache size by type (#2206 ) * lib: allow to configure cache size by type https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1940 Signed-off-by: hagen1778 <roman@victoriametrics.com> * Apply suggestions from code review * wip Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-02-21 13:55:51 +02:00
Roman Khavronenko	d107f86fbc	lib/index: reduce read/write load after indexDB rotation (#2177 ) * lib/index: reduce read/write load after indexDB rotation IndexDB in VM is responsible for storing TSID - ID's used for identifying time series. The index is stored on disk and used by both ingestion and read path. IndexDB is stored separately to data parts and is global for all stored data. It can't be deleted partially as VM deletes data parts. Instead, indexDB is rotated once in `retention` interval. The rotation procedure means that `current` indexDB becomes `previous`, and new freshly created indexDB struct becomes `current`. So in any time, VM holds indexDB for current and previous retention periods. When time series is ingested or queried, VM checks if its TSID is present in `current` indexDB. If it is missing, it checks the `previous` indexDB. If TSID was found, it gets copied to the `current` indexDB. In this way `current` indexDB stores only series which were active during the retention period. To improve indexDB lookups, VM uses a cache layer called `tsidCache`. Both write and read path consult `tsidCache` and on miss the relad lookup happens. When rotation happens, VM resets the `tsidCache`. This is needed for ingestion path to trigger `current` indexDB re-population. Since index re-population requires additional resources, every index rotation event may cause some extra load on CPU and disk. While it may be unnoticeable for most of the cases, for systems with very high number of unique series each rotation may lead to performance degradation for some period of time. This PR makes an attempt to smooth out resource usage after the rotation. The changes are following: 1. `tsidCache` is no longer reset after the rotation; 2. Instead, each entry in `tsidCache` gains a notion of indexDB to which they belong; 3. On ingestion path after the rotation we check if requested TSID was found in `tsidCache`. Then we have 3 branches: 3.1 Fast path. It was found, and belongs to the `current` indexDB. Return TSID. 3.2 Slow path. It wasn't found, so we generate it from scratch, add to `current` indexDB, add it to `tsidCache`. 3.3 Smooth path. It was found but does not belong to the `current` indexDB. In this case, we add it to the `current` indexDB with some probability. The probability is based on time passed since the last rotation with some threshold. The more time has passed since rotation the higher is chance to re-populate `current` indexDB. The default re-population interval in this PR is set to `1h`, during which entries from `previous` index supposed to slowly re-populate `current` index. The new metric `vm_timeseries_repopulated_total` was added to identify how many TSIDs were moved from `previous` indexDB to the `current` indexDB. This metric supposed to grow only during the first `1h` after the last rotation. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 Signed-off-by: hagen1778 <roman@victoriametrics.com> * wip * wip Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-02-12 00:34:44 +02:00
Aliaksandr Valialkin	6ae584b9b3	lib/{mergeset,storage}: properly limit cache sizes for indexdb Previously these caches could exceed limits set via `-memory.allowedPercent` and/or `-memory.allowedBytes`, since limits were set independently per each data part. If the number of data parts was big, then limits could be exceeded, which could result to out of memory errors. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007	2022-01-20 18:45:03 +02:00
Aliaksandr Valialkin	cdfe854c9b	lib/storage: explicitly pass dedupInterval to DeduplicateSamples() and deduplicateSamplesDuringMerge() This improves the code readability and debuggability, since the output of these functions stops depending on global state.	2021-12-14 20:52:29 +02:00
Aliaksandr Valialkin	ab4be24397	app/vmstorage: export vm_cache_size_max_bytes metrics for determining capacity of various caches The vm_cache_size_max_bytes metric can be used for determining caches which reach their capacity via the following query: vm_cache_size_bytes / vm_cache_size_max_bytes > 0.9	2021-12-02 10:30:01 +02:00
Aliaksandr Valialkin	4fb19fe34b	all: consistently return `application/json` content-type without `charset=utf-8` The `application/json` content-type has utf-8 encoding by default. See https://stackoverflow.com/questions/9254891/what-does-content-type-application-json-charset-utf-8-really-mean Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/897	2021-11-09 18:07:22 +02:00
Aliaksandr Valialkin	4fddcf4c83	app/{vminsert,vmstorage}: follow-up after `a171916ef5` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269	2021-10-08 14:09:51 +03:00
Nikolay	a171916ef5	Adds read-only mode for vmstorage node (#1680 ) * adds read-only mode for vmstorage https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269 * changes order a bit * moves isFreeDiskLimitReached var to storage struct renames functions to be consistent change protoparser api - with optional storage limit check for given openned storage * renames freeSpaceLimit to ReadOnly	2021-10-08 12:52:56 +03:00
Aliaksandr Valialkin	c473d8ffe1	li/storage: re-use the per-day inverted index search code for searching in global index This allows removing a big pile of outdated code for global index search. This may help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1486	2021-07-30 10:28:20 +03:00
Aliaksandr Valialkin	22c6e64bbc	lib/storage: consistency renaming: tagCache -> tagFiltersCache This improves code readability	2021-07-06 11:03:30 +03:00
Aliaksandr Valialkin	44855f0c9b	app/{vmselect,vmstorage}: clarify the description for `-dedup.minScrapeInterval` command-line flag Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1426	2021-07-02 15:06:41 +03:00
Aliaksandr Valialkin	165a9f9200	app/vmstorage: add ability to limit series cardinality via `-storage.maxHourlySeries` and `-storage.maxDailySeries` command-line flags	2021-05-20 15:31:57 +03:00
Aliaksandr Valialkin	4a5f45c77e	app/vminsert: add support for data ingestion via other vminsert nodes	2021-05-08 19:53:45 +03:00
Aliaksandr Valialkin	6dc5d3b357	all: rename https://victoriametrics.github.io to https://docs.victoriametrics.com	2021-04-20 20:20:01 +03:00
Aliaksandr Valialkin	4028d692f5	app: do not process non-GET requests on at `/` handler	2021-04-02 22:56:38 +03:00
Aliaksandr Valialkin	512addc608	app/{vminsert,vmagent}: add `-sortLabels` command-line option for sorting time series labels before ingesting them in the storage This option can be useful when samples for the same time series are ingested with distinct order of labels. For example, metric{k1="v1",k2="v2"} and metric{k2="v2",k1="v1"}.	2021-03-31 23:27:21 +03:00
Aliaksandr Valialkin	ae1c653d55	lib/storage: reduce memory usage when ingesting samples for the same time series with distinct order of labels	2021-03-31 21:22:40 +03:00
Aliaksandr Valialkin	d074326970	app/vmstorage: add `-logNewSeries` command-line flag for determining the source of series churn rate	2021-03-15 22:40:28 +02:00
Aliaksandr Valialkin	83da939947	app/vmstorage: export vm_composite_filter_success_conversions_total and vm_composite_filter_missing_conversions_total metrics	2021-02-17 19:13:49 +02:00
Aliaksandr Valialkin	08f21d8761	app/vmstorage: export vm_composite_index_min_timestamp metric	2021-02-10 17:14:00 +02:00
Aliaksandr Valialkin	e8ee9fa7fe	app/vmstorage: export missing `vm_cache_size_bytes` metrics for indexdb and data caches	2021-02-09 00:49:58 +02:00
Aliaksandr Valialkin	d5a2b120e9	app/vmstorage: disable final merge by default, since it may result in high disk IO and CPU usage without measurable benefits such as increased query performance and reduced disk space usage	2021-01-08 00:12:12 +02:00
Aliaksandr Valialkin	a2eb451de4	app/{vmagent,vminsert}: follow-up for `ce8c2dd1f1`: return `/targets` page in HTML when requested via web browser	2020-12-14 14:13:01 +02:00
Aliaksandr Valialkin	1a237c6903	all: properly handle CPU limits set on the host system/container This can reduce memory usage on systems with enabled CPU limits. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/946	2020-12-08 21:07:03 +02:00
Aliaksandr Valialkin	bdac2171f1	all: do not print usage info for all the flags when incorrect command-line flag is passed This should improve usability for VictoriaMetrics apps that have big number of command-line flags, i.e. all the apps.	2020-12-03 21:46:19 +02:00
Aliaksandr Valialkin	7ceaf4ba8f	all: consistently return text-based HTTP responses with charset=utf-8 This is a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/897	2020-11-13 10:30:21 +02:00
immerrr again	1ec1a9f27f	app/vmstorage: add "/internal/force_flush" endpoint (#893 )	2020-11-11 14:46:37 +02:00
Aliaksandr Valialkin	0db7c2b500	app/vmstorage: support for `-retentionPeriod` smaller than one month Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/173 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/17	2020-10-20 14:42:46 +03:00
Aliaksandr Valialkin	d2e917d1cb	app/vmstorage: add `vm_rows_added_to_storage_total` metric, which shows the total number of rows added to storage since app start	2020-10-09 13:36:17 +03:00
Aliaksandr Valialkin	b51fa16177	app/vmstorage: add `-finalMergeDelay` command-line flag for configuring the delay before final merge for per-month partitions after no new data is ingested to it	2020-10-07 17:42:31 +03:00
Aliaksandr Valialkin	abfd3a8fab	app/{vminsert,vmselect,vmstorage}: add a link to https://victoriametrics.github.io/Cluster-VictoriaMetrics.html from main page of every cluster component	2020-10-06 15:30:07 +03:00
Aliaksandr Valialkin	fd7dd5064a	lib/storage: code cleanup after `10f2eedee0` Remove the code that uses metricIDs caches for the current and the previous hour during metricIDs search, since this code became unused after implementing per-day inverted index almost a year ago. While at it, fix a bug, which could prevent from finding time series with names containing dots (aka Graphite-like names such as `foo.bar.baz`).	2020-10-01 19:12:04 +03:00
Aliaksandr Valialkin	536aa8779a	app/vmstorage: rename `vm_{big\|small}_merge_need_free_disk_space` to `vm_merge_need_free_disk_space` This simplifies alerting.	2020-09-29 22:53:33 +03:00
Aliaksandr Valialkin	097a4c10dd	app/vmstorage: add metrics for determining whether background merges need additional disk space to complete These metrics are: * vm_small_merge_need_free_disk_space * vm_big_merge_need_free_disk_space Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/686	2020-09-29 21:47:47 +03:00
Aliaksandr Valialkin	2ee0dc27a6	app/vmstorage: parallelize data processing obtained from a single connection from vminsert Previously vmstorage could use only a single CPU core for data processing from a single connection from vminsert. Now all the CPU cores can be used for data processing from a single connection from vminsert. This should improve the maximum data ingestion performance for a single vminsert->vmstorage connection.	2020-09-28 21:41:16 +03:00
Aliaksandr Valialkin	9b15b11f74	app/vmstorage: added `-forceMergeAuthKey` command-line flag for protecting `/internal/force_merge` endpoint	2020-09-17 14:24:20 +03:00
Aliaksandr Valialkin	d96858b921	lib/storage: add `/internal/force_merge` handler for running forced compactions on historical per-month partitions This may be useful for freeing up storage space after time series deletion. See https://victoriametrics.github.io/#force-merge for more details. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/686	2020-09-17 12:20:56 +03:00
Aliaksandr Valialkin	f5cb213ef9	lib/storage: reuse timestamp blocks for adjancent metric blocks with identical timestamps This should reduce disk space usage when scraping targets containing metrics with identical names such as `node_cpu_seconds_total`, histograms, quantiles, etc. Expose `vm_timestamps_blocks_merged_total` and `vm_timestamps_bytes_saved_total` metrics for monitoring the effectiveness of timestamp blocks merging.	2020-09-09 23:59:21 +03:00
Aliaksandr Valialkin	6721e47ae9	app: respect CPU limits set via cgroups Update GOMAXPROCS to limits set via cgroups. This should reduce CPU trashing and reduce memory usage for cases when VictoriaMetrics components run in containers with CPU limits. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/685	2020-08-11 23:01:03 +03:00
Aliaksandr Valialkin	a455930ab4	app/vmstorage: rename `vm_cache_size_entries{type="storage/prefetchedMetricIDs"}` to `vm_cache_entries{type="storage/prefetchedMetricIDs"}` to be consistent with other `vm_cache_entries` metrics	2020-08-06 16:34:18 +03:00
Aliaksandr Valialkin	a3e91c593b	lib/storage: limit the number of concurrent calls to storage.searchTSIDs to GOMAXPROCS*2 This should limit the maximum memory usage and reduce CPU trashing on vmstorage when multiple heavy queries are executed. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648	2020-08-05 18:27:21 +03:00
Aliaksandr Valialkin	29bbab0ec9	lib/storage: remove prioritizing of merging small parts over merging big parts, since it doesn't work as expected The prioritizing could lead to big merge starvation, which could end up in too big number of parts that must be merged into big parts. Multiple big merges may be initiated after the migration from v1.39.0 or v1.39.1. It is OK - these merges should be finished soon, which should return CPU and disk IO usage to normal levels. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/618	2020-07-30 20:02:22 +03:00
Aliaksandr Valialkin	b8303afcd8	lib/storage: improve prioritizing of data ingestion over querying Prioritize also small merges over big merges. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648	2020-07-23 01:40:38 +03:00
Aliaksandr Valialkin	31ef39e8da	lib/httpserver: log remote address in error message from `httpserver.Errorf` This should improve detection of the root cause of errors. Thanks to Anant for the idea.	2020-07-20 14:06:29 +03:00
Aliaksandr Valialkin	0bff96fe4b	lib/storage: prioritize data ingestion over heavy queries Heavy queries could result in the lack of CPU resources for processing the current data ingestion stream. Prevent this by delaying queries' execution until free resources are available for data ingestion. Expose `vm_search_delays_total` metric, which may be used in for alerting when there is no enough CPU resources for data ingestion and/or for executing heavy queries. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291	2020-07-05 19:44:04 +03:00
Aliaksandr Valialkin	d962568e93	all: use %w instead of %s for wrapping errors in `fmt.Errorf` This will simplify examining the returned errors such as httpserver.ErrorWithStatusCode . See https://blog.golang.org/go1.13-errors for details.	2020-06-30 23:33:46 +03:00
Aliaksandr Valialkin	2784015a4d	all: print `--help` output to stdout instead of stderr This is easier to grep and pipe	2020-05-16 12:03:06 +03:00
Aliaksandr Valialkin	1e5c1d7eaa	app/vmstorage: add `vm_slow_metric_name_loads_total` metric, which could be used as an indicator when more RAM is needed for improving query performance	2020-05-15 14:12:24 +03:00

1 2

87 commits