github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-11-21 14:44:00 +00:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	6685f6ce7c	lib/storage: move series registration in caches from createAllIndexesForMetricName into a separate function - putSeriesToCache This makes the code more clear and easier to read This is a follow-up for `7094fa38bc`	2023-07-13 23:13:23 -07:00
Aliaksandr Valialkin	0c49552849	lib/mergeset: skip common prefix in binarySearchKey() function. This should improve performance a bit when the search is performed among items with a long common prefix	2023-07-13 22:04:59 -07:00
Aliaksandr Valialkin	3dacdcb707	lib/storage: optimize BenchmarkIndexDBGetTSIDs() - Sort MetricName tags only once before the benchmark loop. - Obtain indexSearch per each benchmark loop in order to give a chance for background merge for the recently created parts	2023-07-13 21:56:53 -07:00
Aliaksandr Valialkin	443661a5da	lib/storage: properly free up resources from newTestStorage() by calling stopTestStorage()	2023-07-13 17:13:24 -07:00
Aliaksandr Valialkin	7094fa38bc	lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping. Previously all the newly ingested time series were registered in the global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of the metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into the `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses an in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If the capacity of this cache is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from the on-disk `MetricName -> TSID` index in the following cases: - If the `storage/tsid` cache capacity isn't enough for active time series. In this case either increase the memory available to VictoriaMetrics or reduce the number of active time series ingested into it. - If a new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult the on-disk `MetricName -> TSID` index, since it doesn't know that the index has no corresponding entry either. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. 
Reading the data from the `MetricName -> TSID` index is slow, so inserts that lead to reading this index are counted as slow inserts, and they can be monitored via the `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, i.e. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in the `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy a significant share of available memory for storing recently accessed blocks of the `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from the global `MetricName -> TSID` index to a per-day index. This significantly reduces the amount of data that needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when a new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series that do not change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . 
At the same time the change fixes the issue that could result in lost access to time series which stop receiving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 . The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 . This is a follow-up for `1f28b46ae9`	2023-07-13 16:07:30 -07:00
Aliaksandr Valialkin	3b50b94f7a	lib/storage: fix possible test failure in TestStorageAddRowsConcurrent The number of parts in the snapshot partition may be zero if concurrent goroutine just started creating new partition, but didn't put data into it yet when the current goroutine made a snapshot.	2023-07-13 15:03:45 -07:00
Aliaksandr Valialkin	4ba19f6b32	lib/mergeset: simplify flushInmemoryParts() a bit	2023-07-13 12:33:30 -07:00
Aliaksandr Valialkin	a79e53d82a	lib/logstorage: fix TestValuesEncoder() on 32-bit architectures	2023-07-13 11:27:13 -07:00
Dmytro Kozlov	79c42814cf	lib/logstorage: fix panic (#4620)	2023-07-13 09:53:41 +02:00
Zakhar Bessarab	51a9cc9783	docs: make `httpAuth.` flags description less ambiguous (#4588) docs: make `httpAuth.` flags description less ambiguous Currently, it may confuse users whether `httpAuth.` flags are used by HTTP client or server configuration (see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4586 for example). Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs: fix a typo Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-07-07 13:50:13 +02:00
Aliaksandr Valialkin	152ca00fb8	docs/CHANGELOG.md: clarify description for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4336 bugfix This is a follow-up for `5eb5df96e2`	2023-07-06 17:09:03 -07:00
Aliaksandr Valialkin	8a07621a0c	lib/promscrape: disable support for service discovery and metrics scrape via http2. Reasons for disabling http2: - http2 is used very rarely compared to http for Prometheus metrics exposition and service discovery - http2 is much harder to debug than http - http2 has a very bad security record because of its complexity - see https://portswigger.net/research/http2 . VictoriaMetrics components are compiled with the nethttpomithttp2 tag because of these issues. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4283 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4274 This is a follow-up for `72c3cd47eb`	2023-07-06 16:03:37 -07:00
Alexander Marshalov	af53c7cc78	fix removing storage data dir before restoring from backup (#598) * fix removing storage data dir before restoring from backup Signed-off-by: Alexander Marshalov <_@marshalov.org> * fix review comment Signed-off-by: Alexander Marshalov <_@marshalov.org> * fix review comment Signed-off-by: Alexander Marshalov <_@marshalov.org> * fixes after merge with `enterprise-single-node` branch Signed-off-by: Alexander Marshalov <_@marshalov.org> --------- Signed-off-by: Alexander Marshalov <_@marshalov.org>	2023-07-06 14:16:18 -07:00
Aliaksandr Valialkin	3286ca3318	lib/backup/actions: remove misleading comment about the default value for Concurrency field	2023-07-06 14:07:08 -07:00
Aliaksandr Valialkin	792860db10	lib/promscrape/discoveryutils: re-use checkRedirect function for both client and blockingClient Also document follow_redirects option at https://docs.victoriametrics.com/sd_configs.html#http-api-client-options This is a follow-up for `b3d0ff463a` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4282	2023-07-06 10:51:33 -07:00
Alexander Marshalov	fc67d94e86	vmbackupmanager bugfixes: (#577) - error on running with empty -dst dir and without -runOnStart - error on restoring with backup, created before v1.90.0	2023-07-05 22:07:15 -07:00
Aliaksandr Valialkin	3c5623ce7f	lib/logstorage: go fmt	2023-07-04 14:13:14 -07:00
Aliaksandr Valialkin	6d35d21f60	lib/logstorage: fix `make test-pure` tests	2023-07-04 13:14:30 -07:00
Aliaksandr Valialkin	d1dd25122a	lib/httputils: fix test after `b49d04b3dc` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4459	2023-07-04 09:40:12 -07:00
Haleygo	5fc0ee43d4	fix parse for invalid partial RFC3339 format (#4539) The validation was needed for covering corner cases when storage is tested with data from 1970. This resulted in unexpected search results, as the year was parsed incorrectly from the given timestamp. Co-authored-by: hagen1778 <roman@victoriametrics.com>	2023-07-03 13:11:49 +02:00
Alexander Marshalov	1cc06e39cd	show backup progress percentage in vmbackup log during backup uploading and restoring progress percentage in vmrestore log during backup downloading (#4460) (#4530) Signed-off-by: Alexander Marshalov <_@marshalov.org>	2023-06-28 14:44:45 +02:00
Aliaksandr Valialkin	83aa78dfb4	app/vlstorage: export vl_active_merges and vl_merges_total metrics	2023-06-21 20:58:57 -07:00
Aliaksandr Valialkin	dde9ceed07	app/vlinsert/jsonline: code prettifying	2023-06-21 19:39:22 -07:00
Aliaksandr Valialkin	7346bb4f03	app/vlselect/logsql: sort query results by _time if their summary size doesn't exceed -select.maxSortBufferSize	2023-06-21 01:11:25 -07:00
Aliaksandr Valialkin	00c3dbd15d	app/victoria-logs: add ability to debug data ingestion by passing `debug` query arg to data ingestion API	2023-06-20 20:02:46 -07:00
Aliaksandr Valialkin	87b66db47d	app/victoria-logs: initial code release	2023-06-19 22:55:12 -07:00
Aliaksandr Valialkin	aeac39cfd1	lib/storage: do not create flock.lock files at partition directories, since it is created at the Storage level	2023-06-19 22:48:37 -07:00
Aliaksandr Valialkin	0f01eea4e9	lib/netutil: ignore artificial timeout generated by net/http.Server. This prevents inflating the vm_tcplistener_read_timeouts_total counter	2023-06-19 22:46:40 -07:00
Aliaksandr Valialkin	298aab3f54	lib/mergeset: do not create flock.lock file at mergeset table, since it is created at the lib/storage.Storage level	2023-06-19 22:45:31 -07:00
Aliaksandr Valialkin	371182f299	lib/fs: add ReaderAt.Path() function This function is going to be used in VictoriaLogs	2023-06-19 22:42:27 -07:00
Aliaksandr Valialkin	497ec3f3e6	lib/encoding: add MarshalBool/UnmarshalBool and GetUint32s/PutUint32s functions These functions are going to be used by VictoriaLogs	2023-06-19 22:40:55 -07:00
Aliaksandr Valialkin	3409317a67	lib/cgroup: add SetGOGC() function This function is going to be used by VictoriaLogs	2023-06-19 22:39:00 -07:00
Aliaksandr Valialkin	c1bed35b39	lib/bytesutil: substitute parentheses with slashes in ByteBuffer.Path() output, so it can be passed to path manipulating functions This is needed for the upcoming VictoriaLogs	2023-06-19 22:37:26 -07:00
Aliaksandr Valialkin	78eaa056c0	app/vmselect: move common http functionality from app/vmselect/searchutils to lib/httputils While at it, move app/vmselect/bufferedwriter to lib/bufferedwriter, since it is going to be used in VictoriaLogs	2023-06-19 22:34:20 -07:00
Aliaksandr Valialkin	b49d04b3dc	lib/promutils.ParseTime(): add support for timestamps in milliseconds See https://stackoverflow.com/questions/76437098/how-to-handle-time-unit-and-step-while-ingesting-or-querying-in-victoriametrics/76438405 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4459	2023-06-19 22:25:04 -07:00
Nikolay	5eb5df96e2	lib/storage: creates parts.json on start-up if it doesn't exist. (#4450) * lib/storage: creates parts.json on start-up if it doesn't exist. It fixes migrations from versions below v1.90.0. Previously parts.json was created only after a successful merge. But if the merge was interrupted for some reason (OOM or shutdown), parts.json wasn't created and partitions left after the interrupted merge weren't properly deleted, since VM cannot check whether they must be removed or not. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4336 * Apply suggestions from code review Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> * Update lib/storage/partition.go Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> --------- Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2023-06-15 11:19:22 +02:00
Roman Khavronenko	f50f35a8e0	lib/storage: add comment for how `mustBeDeleted` field should be used (#4454) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-06-15 11:17:45 +02:00
Roman Khavronenko	f71cc99a8c	lib/mergeset: add comment for how `mustBeDeleted` field should be used (#4449) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-06-14 18:13:16 +02:00
Alexander Marshalov	40d12be607	fixed service name detection for consulagent service discovery in case of a difference in service name and service id (#4390) (#4439) Signed-off-by: Alexander Marshalov <_@marshalov.org>	2023-06-12 16:16:43 +02:00
Roman Khavronenko	dfe53a36fc	lib/promscrape/discoveryutils: properly check for net.ErrClosed (#4426) This error may be wrapped in another error, and should normally be tested using `errors.Is(err, net.ErrClosed)`. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-06-09 09:26:33 +02:00
Roman Khavronenko	3305a6901c	app/vmagent: mention `enable_http2` in changelog (#4403) Follow-up after `72c3cd47eb` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-06-05 16:31:58 +02:00
Haleygo	72c3cd47eb	vmagent: scrape config support enable_http2 (#4295) app/vmagent: support `enable_http2` in scrape config This change adds HTTP2 support for scrape config and improves compatibility with Prometheus config. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4283	2023-06-05 15:56:49 +02:00
Nikolay	f263031fe9	app/vmauth: properly handle LOCAL proxy protocol command (#4373) app/vmauth: properly handle LOCAL proxy protocol command It is required for handling health checks from load balancers https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3335	2023-05-31 15:37:59 +02:00
Haleygo	b3d0ff463a	vmagent: support follow_redirects on SD level (#4286) * vmagent: support follow_redirects on SD level * fix follow_redirects on sd level https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4282	2023-05-26 09:39:45 +02:00
Aliaksandr Valialkin	1f2f74e70e	lib/promrelabel: use monospace font at textarea for writing relabel configs on /metric-relabel-debug and /target-relabel-debug pages This simplifies visual inspection of indentation in yaml configs	2023-05-18 20:48:41 -07:00
Aliaksandr Valialkin	1f28b46ae9	lib/storage: revert the migration from global to per-day index for (MetricName -> TSID) This reverts the following commits: - `e0e16a2d36` - `2ce02a7fe6` The reason for revert: the updated logic breaks assumptions made when fixing https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 . For example, if a time series stops receiving new samples during the first day after the indexdb rotation, there are chances that the time series won't be registered in the new indexdb. This is OK until the next indexdb rotation, since the time series is registered in the previous indexdb, so it can be found during queries. But the time series will become invisible for search after the next indexdb rotation, while its data is still there. There is also an incompletely solved issue with the increased CPU and disk IO resource usage just after the indexdb rotation. There was an attempt to fix it, but it didn't fix it in full, while introducing the issue mentioned above. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 TODO: find a solution that simultaneously solves the following issues: - increased memory usage for setups with high churn rate and long retention (e.g. what the reverted commit does) - increased CPU and disk IO usage during indexdb rotation ( https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 ) - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 Possible solution - to create the new indexdb one hour before the indexdb rotation and to gradually pre-populate it with the needed index data during the last hour before indexdb rotation. Then the new indexdb will contain all the needed data just after the rotation, so it won't trigger increased CPU and disk IO.	2023-05-18 11:30:49 -07:00
Haleygo	1531d757ea	fix lint check	2023-05-17 13:51:36 +02:00
Aliaksandr Valialkin	e0e16a2d36	lib/storage: follow-up after `2ce02a7fe6` - Document the change at docs/CHANGELOG.md - Clarify comments for non-trivial code touched by the commit - Improve the logic behind maybeCreateIndexes(): - Correctly create per-day indexes if the indexdb rotation is performed during the first hour or the last hour of the day by UTC. Previously there was a possibility of missing index entries on that day. - Increase the duration for creating new indexes in the current indexdb for up to 22 hours after indexdb rotation. This should reduce the increased resource usage after indexdb rotation. It is safe to postpone index creation for the current day until the last hour of the current day after indexdb rotation by UTC, since the corresponding (date, ...) entries exist in the previous indexdb. - Search for TSID by (date, MetricName) in both the current and the previous indexdb. Previously the search was performed only in the current indexdb. This could lead to excess creation of per-day indexes for the current day just after indexdb rotation. - Search for (date, metricID) entries in both the current and the previous indexdb. Previously the search was performed only in the current indexdb. This could lead to excess creation of per-day indexes for the current day just after indexdb rotation.	2023-05-16 23:19:27 -07:00
Roman Khavronenko	2ce02a7fe6	lib/storage: introduce per-day MetricName=>TSID index (#4252) The new index substitutes the global MetricName=>TSID index used for locating TSIDs on the ingestion path. For installations with high ingestion and churn rate, the global MetricName=>TSID index can grow enormously, making index lookups too expensive. This also results in bigger than expected cache growth for indexdb blocks. The new per-day index is supposed to be much smaller and more efficient. This should improve ingestion speed and reliability during re-routings in cluster. The negative outcome could be increased disk usage, since the per-day index is more expensive compared to the global index. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-05-16 15:46:42 -07:00
Aliaksandr Valialkin	278278af95	lib/storage: reduce the unimportant logging during Storage start / stop This should improve the visibility of potentially important logs	2023-05-16 15:14:21 -07:00
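
The `7094fa38bc` entry above describes an ingestion path where an in-memory cache is checked first and, on a miss, only the index for the current day is consulted. The Go sketch below illustrates that flow under heavy simplification. It is not the VictoriaMetrics implementation: `toyIndex`, `dayKey`, `getOrCreateTSID` and the plain maps standing in for the `storage/tsid` cache and the on-disk index are hypothetical names chosen for illustration only.

```go
package main

import "fmt"

// dayKey is a hypothetical composite key for the per-day index.
type dayKey struct {
	date uint64 // day number, e.g. days since the Unix epoch in UTC
	name string // canonical metric name (metric name plus sorted labels)
}

// toyIndex is an illustrative stand-in, not the real lib/storage types.
type toyIndex struct {
	cache       map[string]uint64 // plays the role of the storage/tsid cache
	perDay      map[dayKey]uint64 // plays the role of the on-disk per-day index
	nextTSID    uint64
	slowLookups int // lookups that had to go past the in-memory cache
}

func newToyIndex() *toyIndex {
	return &toyIndex{
		cache:  make(map[string]uint64),
		perDay: make(map[dayKey]uint64),
	}
}

// getOrCreateTSID returns the series id for the given day and metric name.
func (ix *toyIndex) getOrCreateTSID(date uint64, name string) uint64 {
	k := dayKey{date: date, name: name}
	if tsid, ok := ix.cache[name]; ok {
		// Fast path: the series is active. Register it for this day if needed.
		// (Assumption: a real system would track this without touching disk.)
		if _, registered := ix.perDay[k]; !registered {
			ix.perDay[k] = tsid
		}
		return tsid
	}
	// Cache miss: consult only the current day's index, not a global one.
	ix.slowLookups++
	tsid, ok := ix.perDay[k]
	if !ok {
		// The series is new for this day: allocate an id and register it.
		ix.nextTSID++
		tsid = ix.nextTSID
		ix.perDay[k] = tsid
	}
	ix.cache[name] = tsid
	return tsid
}

func main() {
	ix := newToyIndex()
	const series = `cpu_usage{host="a"}`
	a := ix.getOrCreateTSID(100, series) // new series: slow lookup
	b := ix.getOrCreateTSID(100, series) // same day: cache hit
	c := ix.getOrCreateTSID(101, series) // next day: cache hit, new per-day entry
	fmt.Println(a == b, b == c, len(ix.perDay), ix.slowLookups) // true true 2 1
}
```

In this toy run a static series costs one slow lookup but ends up with two per-day entries after two days, which mirrors the disk-size trade-off the commit message mentions for workloads without high churn rate.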

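
The `dfe53a36fc` entry above notes that `net.ErrClosed` may be wrapped in another error and should be tested with `errors.Is(err, net.ErrClosed)`. A minimal Go sketch of why a direct comparison is not enough follows. The helpers `isClosedConn` and `wrapClosed`, the variable `errOther` and the error text are hypothetical and are not taken from lib/promscrape/discoveryutils.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// errOther is an unrelated error used for contrast.
var errOther = errors.New("some unrelated error")

// isClosedConn reports whether err is net.ErrClosed or wraps it at any depth.
func isClosedConn(err error) bool {
	return errors.Is(err, net.ErrClosed)
}

// wrapClosed builds a doubly wrapped error, similar in shape to what
// network code tends to return: fmt wrapper -> *net.OpError -> net.ErrClosed.
func wrapClosed() error {
	opErr := &net.OpError{Op: "read", Net: "tcp", Err: net.ErrClosed}
	return fmt.Errorf("cannot read discovery response: %w", opErr)
}

func main() {
	err := wrapClosed()
	fmt.Println(err == net.ErrClosed)   // false: direct comparison misses wrapped errors
	fmt.Println(isClosedConn(err))      // true: errors.Is walks the Unwrap chain
	fmt.Println(isClosedConn(errOther)) // false
}
```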

2001 commits