github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-11-21 14:44:00 +00:00

Author	SHA1	Message	Date
Roman Khavronenko	4b8088e377	lib/storage: properly check for `storage/prefetchedMetricIDs` cache expiration deadline (#5607 ) Before, this cache was limited only by size. Cache invalidation by time happens with jitter to prevent thundering herd problem. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-15 10:03:06 +01:00
Aliaksandr Valialkin	d2c94a0663	lib/prompbmarshal: switch to github.com/VictoriaMetrics/easyproto	2024-01-14 23:04:45 +02:00
Aliaksandr Valialkin	c005245741	lib/prompb: switch to github.com/VictoriaMetrics/easyproto	2024-01-14 22:46:06 +02:00
Aliaksandr Valialkin	f2229c2e42	lib/prompb: change type of Label.Name and Label.Value from []byte to string This makes it more consistent with lib/prompbmarshal.Label	2024-01-14 22:33:21 +02:00
Aliaksandr Valialkin	f405384c8c	lib/protoparser/datadogv2: simplify code for parsing protobuf messages after `0597718435` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5094 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4451	2024-01-14 21:43:01 +02:00
Aliaksandr Valialkin	dd25049858	lib/protoparser/opentelemetry: use github.com/VictoriaMetrics/easyproto for protobuf message unmarshaling and marshaling This reduces VictoriaMetrics binary size by 100KB. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2570 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2424	2024-01-14 21:19:03 +02:00
Aliaksandr Valialkin	0597718435	lib/protoparser/datadogv2: add support for reading protobuf-encoded requests at /api/v2/series endpoint Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4451 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5094	2024-01-14 21:09:05 +02:00
Artem Navoiev	d374595e31	docs: mention staleNaN handling during deduplication See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5587 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-11 11:53:58 +01:00
hagen1778	91ccea236f	app/all: follow-up after `84d710beab` https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5548 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-09 13:34:54 +01:00
zhdd99	fe2d9f6646	lib/pushmetrics: fix a panic caused by pushing metrics during the graceful shutdown process of vmstorage nodes. (#5549 ) Co-authored-by: zhangdongdong <zhangdongdong@kuaishou.com> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-09 13:24:34 +01:00
Aliaksandr Valialkin	7575f5c501	lib/protoparser/datadogv2: take into account source_type_name field, since it contains useful value such as kubernetes, docker, system, etc.	2023-12-21 23:05:41 +02:00
Aliaksandr Valialkin	b4ba8d0d76	lib/protoparser: add missing /datadog/ prefix to the /api/v2/series path in the description for -datadog.maxInsertRequestSize command-line flag	2023-12-21 21:04:53 +02:00
Aliaksandr Valialkin	fb90a56de2	app/{vminsert,vmagent}: preliminary support for /api/v2/series ingestion from new versions of DataDog Agent This commit adds only JSON support - https://docs.datadoghq.com/api/latest/metrics/#submit-metrics , while recent versions of DataDog Agent send data to /api/v2/series in undocumented Protobuf format. The support for this format will be added later. Thanks to @AndrewChubatiuk for the initial implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5094 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4451	2023-12-21 20:50:55 +02:00
Aliaksandr Valialkin	01f9edda64	lib/promauth: add more context to errors returned by Options.NewConfig() in order to simplify troubleshooting	2023-12-20 21:58:12 +02:00
Aliaksandr Valialkin	160cc9debd	app/{vmagent,vmalert}: add the ability to set OAuth2 endpoint params via the corresponding *.oauth2.endpointParams command-line flags This is a follow-up for `5ebd5a0d7b` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5427	2023-12-20 21:35:28 +02:00
Morgan	5ebd5a0d7b	Expose OAuth2 Endpoint Parameters to cli (#5427 ) The user may wish to control the endpoint parameters, for instance to set the audience when requesting an access token. Exposing the parameters as a map allows for additional use cases without requiring modification.	2023-12-20 20:16:43 +02:00
Nikolay	7cfde237ec	lib/awsapi: properly assume role with webIdentity token (#5495 ) * lib/awsapi: properly assume role with webIdentity token introduce new irsaRoleArn param for config. It's only needed for authorization with webIdentity token. First credentials obtained with irsa role and the next sts assume call for an actual roleArn made with those credentials. Common use case for it - cross AWS accounts authorization https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3822 * wip --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-12-20 19:05:39 +02:00
Aliaksandr Valialkin	5a88bc973f	all: use Gauge instead of Counter for `*_config_last_reload_successful` metrics This allows exposing the correct TYPE metadata for these labels when the app runs with -metrics.exposeMetadata command-line flag. See https://github.com/VictoriaMetrics/metrics/pull/61#issuecomment-1860085508 for more details. This is follow-up for `326a77c697`	2023-12-20 14:23:42 +02:00
Aliaksandr Valialkin	326a77c697	all: add -metrics.exposeMetadata command-line flag, which can be used for adding TYPE and HELP metadata for metrics exposed at /metrics page This may be needed for systems, which require this metadata such as Google Cloud Managed Prometheus. See https://cloud.google.com/stackdriver/docs/managed-prometheus/troubleshooting#missing-metric-type	2023-12-19 03:20:40 +02:00
Aliaksandr Valialkin	4b529562ce	lib/pushmetrics: add -pushmetrics.header and -pushmetrics.disableCompression command-line flags	2023-12-17 19:56:46 +02:00
Aliaksandr Valialkin	0379a0eb82	lib/protoparser/opentelemetry: allow ingesting metrics without resource labels Some clients may ingest samples via OpenTelemetry protocol without Resource labels. Previously VictoriaMetrics was silently dropping such samples. The commit `317834f876` added vm_protoparser_rows_dropped_total{type="opentelemetry",reason="resource_not_set"} counter for tracking of such dropped samples. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5459 It is better from usability PoV to accept such samples instead of dropping them and incrementing the corresponding counter.	2023-12-17 19:12:58 +02:00
Zakhar Bessarab	317834f876	lib/protoparser/opentelemetry: add metric to track skipped rows without resource (#5459 ) Currently, it is impossible to understand why metrics are not ingested when resource is not set by OTEL exporter. Adding a metric should simplify debugging. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2023-12-15 11:16:25 +01:00
Aliaksandr Valialkin	72dbd24b22	lib/fs: remove unused IsEmptyDir() This function became unused after the commit `43b24164ef` The unused function has been found with deadcode tool - https://go.dev/blog/deadcode	2023-12-14 19:38:53 +02:00
Aliaksandr Valialkin	0f91f83639	app/vmselect: add support for vmstorage groups with independent -replicationFactor per group Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5197 See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#vmstorage-groups-at-vmselect Thanks to @zekker6 for the initial pull request at https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/pull/718	2023-12-13 00:14:45 +02:00
hagen1778	e0fc5ef140	lib/promscrape: cosmetic changes after `e373bb84d5` * fix typos in docs * add `shard-` prefix to generated links when `-promscrape.cluster.memberURLTemplate` is enabled Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-12-12 11:28:18 +01:00
Aliaksandr Valialkin	51df2248f0	vendor: run `make vendor-update`	2023-12-11 10:48:36 +02:00
Aliaksandr Valialkin	042267541f	app/vmauth: add support for `hot standby` mode via `first_available` load balancing policy vmauth in `hot standby` mode sends requests to the first url_prefix while it is available. If the first url_prefix becomes unavailable, then vmauth falls back to the next url_prefix. This allows building highly available setup as described at https://docs.victoriametrics.com/vmauth.html#high-availability Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4893 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4792	2023-12-08 23:31:07 +02:00
Aliaksandr Valialkin	b05e1512d4	lib/promscrape: add a warning when the /service-discovery page contains an incomplete list of dropped targets	2023-12-08 19:03:51 +02:00
noodles2hg	8efe694160	lib/streamaggr/streamaggr.go: fix link in error message (#5439 )	2023-12-08 16:55:05 +03:00
Aliaksandr Valialkin	e373bb84d5	lib/promscrape: add `-promscrape.cluster.memberURLTemplate` command-line flag for creating direct links to vmagent instances at /service-discovery page See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4018#issuecomment-1843811569	2023-12-07 16:04:21 +02:00
Aliaksandr Valialkin	7cb8ed8271	lib/promscrape: show -promscrape.cluster.memberNum values for vmagent instances, which scrape the given dropped target at /service-discovery page The /service-discovery page contains the list of all the discovered targets after the commit `487f6380d0` on all the vmagent instances in cluster mode ( https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets ). This commit improves debuggability of targets in cluster mode by providing a list of -promscrape.cluster.memberNum values per each target at /service-discovery page, which has been dropped because of sharding, e.g. if this target is scraped by other vmagent instances in the cluster. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5389 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4018	2023-12-07 00:05:32 +02:00
Aliaksandr Valialkin	67468a0c46	lib/promscrape: show `never scraped` message for never scraped targets at /targets page	2023-12-06 22:33:39 +02:00
Aliaksandr Valialkin	65bc460323	lib/promscrape: follow-up for `97373b7786` Substitute O(N^2) algorithm for exposing the `vm_promscrape_scrape_pool_targets` metric with O(N) algorithm, where N is the number of scrape jobs. The previous algorithm could slow down /metrics exposition significantly when -promscrape.config contains thousands of scrape jobs. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5311 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5335	2023-12-06 17:35:50 +02:00
Hui Wang	97373b7786	vmagent: add `vm_promscrape_scrape_pool_targets` for scrape jobs like… (#5335 ) * vmagent: export `vm_promscrape_scrape_pool_targets` metric to track the number of targets that each scrape_job discovers * add extra panel for new metric	2023-12-06 15:44:39 +08:00
Aliaksandr Valialkin	06c73df55a	Revert "add datadog /api/v2/series and /api/beta/sketches support (#5094 )" This reverts commit `543f218fe9`. Reason for revert: https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5094#issuecomment-1839789080	2023-12-05 02:26:22 +02:00
Aliaksandr Valialkin	bc550e22d7	Revert "lib/protoparser/datadog: follow-up after 543f218fe96574b9b2189c8350bb09afa349e3bb" This reverts commit `98d0f81f21`. Reason for revert: see https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5094#issuecomment-1839789080	2023-12-05 02:19:29 +02:00
Aliaksandr Valialkin	0160435802	app/vmagent: code cleanup for Kafka and Google PubSub consumers / producers - Add links to relevant docs into descriptions for every -kafka.* and -gcp.pubsub.* command-line flags. - Wait until message processing goroutines are stopped before returning from gcppubsub.Stop(). - Prevent from multiple calls to Init() without Stop(). - Drop message if tenantID cannot be parsed properly. - Take into account tenantID for all the supported message formats. - Support gzip-compressed messages for graphite format. - Use exponential backoff sleep when the message cannot be pushed to remote storage systems because of disabled on-disk persistence - https://docs.victoriametrics.com/vmagent.html#disabling-on-disk-persistence - Unblock from sleep as soon as Stop() is called. Previously the sleep could take up to 2 seconds after Stop() is called. - Remove unused globalCtx and initContext from app/vmagent/remotewrite/gcppubsub - Mention Google PubSub support at docs/enterprise.md - Make Google PubSub docs more clear at docs/vmagent.md This is a follow-up for commits 115245924a5f096c5a3383d6cc8e8b6fbd421984 and e6eab781ce42285a6a1750dc01eba6801dd35516 . Updates https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/pull/717 Updates https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/pull/713	2023-12-04 22:46:28 +02:00
Aliaksandr Valialkin	f5c4fcc250	lib/backup: consistently use path.Join() when constructing paths for s3, gs and azblob E.g. replace `fs.Dir + filePath` with `path.Join(fs.Dir, filePath)` The fs.Dir is guaranteed to end with slash - see Init() functions. The filePath may start with slash. If it starts with slash, then `fs.Dir + filePath` constructs an incorrect path with double slashes. path.Join() properly substitutes duplicate slashes with a single slash in this case. While at it, also substitute incorrect usage of filepath.Join() with path.Join() for constructing paths to object storage systems, which expect forward slashes in paths. filepath.Join() substitutes forward slashes with backslashes on Windows, so this may break creating or managing backups from Windows. This is a follow-up for 0399367be602b577baf6a872ca81bf0f99ba401b Updates https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/pull/719	2023-12-04 10:34:39 +02:00
Aliaksandr Valialkin	487f6380d0	lib/promscrape: show dropped targets because of sharding at /service-discovery page Previously the /service-discovery page didn't show targets dropped because of sharding ( https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets ). Show also the reason why every target is dropped at /service-discovery page. This should improve debugging why particular targets are dropped. While at it, do not remove dropped targets from the list at /service-discovery page until the total number of targets exceeds the limit passed to -promscrape.maxDroppedTargets . Previously the list was cleaned up every 10 minutes from the entries, which weren't updated for the last minute. This could complicate debugging of dropped targets. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5389	2023-12-01 16:48:48 +02:00
Aliaksandr Valialkin	ac65c6b178	lib/promrelabel: add `keep_if_contains` and `drop_if_contains` relabeling actions	2023-11-29 12:22:43 +02:00
Nikolay	41f7940f97	lib/streamaggr: properly reference slice with labels (#5406 ) * lib/streamaggr: properly reference slice with labels by limiting slice capacity. It must fix issues with slice modification: in case of append, a new slice will be allocated instead of modifying the referenced slice https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5402 * Reduce memory allocations when output_relabel_configs adds new labels to output samples --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-11-29 10:03:04 +02:00
hagen1778	98d0f81f21	lib/protoparser/datadog: follow-up after `543f218fe9` * prevent /api/v1 from panic on parsing rows * add tests for Extract function for v1 and v2 api's * separate request types in different pools to prevent different objects mixing * add changelog line `543f218fe9` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-11-28 15:04:15 +01:00
Andrii Chubatiuk	543f218fe9	add datadog /api/v2/series and /api/beta/sketches support (#5094 ) Co-authored-by: Andrew Chubatiuk <andrew.chubatiuk@motional.com> Co-authored-by: Nikolay <https://github.com/f41gh7> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2023-11-28 14:52:29 +01:00
Aliaksandr Valialkin	5034aa0773	app/vmagent: follow-up for `090cb2c9de` - Add Try* prefix to functions, which return bool result in order to improve readability and reduce the probability of missing check for the result returned from these functions. - Call the adjustSampleValues() only once on input samples. Previously it was called on every attempt to flush data to persistent queue. - Properly restore the initial state of WriteRequest passed to tryPushWriteRequest() before returning from this function after unsuccessful push to persistent queue. Previously a part of WriteRequest samples may be lost in such case. - Add -remoteWrite.dropSamplesOnOverload command-line flag, which can be used for dropping incoming samples instead of returning 429 Too Many Requests error to the client when -remoteWrite.disableOnDiskQueue is set and the remote storage cannot keep up with the data ingestion rate. - Add vmagent_remotewrite_samples_dropped_total metric, which counts the number of dropped samples. - Add vmagent_remotewrite_push_failures_total metric, which counts the number of unsuccessful attempts to push data to persistent queue when -remoteWrite.disableOnDiskQueue is set. - Remove vmagent_remotewrite_aggregation_metrics_dropped_total and vm_promscrape_push_samples_dropped_total metrics, because they are replaced with vmagent_remotewrite_samples_dropped_total metric. - Update 'Disabling on-disk persistence' docs at docs/vmagent.md - Update stale comments in the code Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5088 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110	2023-11-25 12:09:44 +02:00
Nikolay	090cb2c9de	app/vmagent: allow disabling on-disk persistence (#5088 ) * app/vmagent: allow disabling on-disk queue Previously, it wasn't possible to build a data processing pipeline with a chain of vmagents. When remoteWrite for the last vmagent in the chain wasn't accessible, it persisted data only while it had enough disk capacity. If the disk queue was full, it started to silently drop ingested metrics. New flags allow disabling on-disk persistence and immediately returning an error if remoteWrite is not accessible anymore. It blocks any writes and notifies the client that data ingestion isn't possible. Main use case for this feature - use external queue such as kafka for data persistence. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110 * adds test, updates readme * apply review suggestions * update docs for vmagent * makes linter happy --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-11-24 13:42:11 +01:00
Roman Khavronenko	0cf55ded34	lib/protoparser: decrease `import.maxLineLen` from 100MB to 10MB (#5364 ) Tests showed that importing a single line with 70MB size takes 5.3GiB RSS memory for VictoriaMetrics single-node. In the scenario when user exports and imports data from one VM to another, it could possibly lead to OOM exception for destination VM. Importing a single line with 16MB size takes 1.3GiB RSS memory. Hence, the limit for `import.maxLineLen` was decreased from 100MB to 10MB to improve reliability of VictoriaMetrics during imports. Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-11-24 12:53:04 +02:00
hagen1778	d493da562e	lib/storage: fix typo Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-11-21 11:20:43 +01:00
hagen1778	e96b4410a1	lib/storage: fix typo Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-11-21 10:52:53 +01:00
Hui Wang	ae3107153c	lib/protoparser/promremotewrite: fall back to zstd decoding if Snappy-decoding fails (#5344 ) This case is possible after the following steps: 1. vmagent successfully performed handshake with the -remoteWrite.url and the remote storage supports zstd-compressed data. 2. remote storage became unavailable or slow to ingest data, vmagent compressed the collected data into blocks with zstd and puts these blocks to persistent queue on disk. 3. vmagent restarts and the remote storage is unavailable during the handshake, then vmagent falls back to Snappy compression. 4. vmagent starts sending zstd-compressed data from persistent queue to the remote storage, while falsely advertizing it sends Snappy-compressed data. 5. The remote storage receives zstd-compressed data and fails unpacking it with Snappy. The solution is the same as `12cd32fd75`, just fall back to zstd decompression if Snappy decompression fails.	2023-11-17 15:51:09 +01:00
Aliaksandr Valialkin	d9a7dea9a1	lib/querytracer: add missing blank comment line after `3121d76bee`	2023-11-15 16:10:43 +01:00
Aliaksandr Valialkin	3076c1f400	lib/ingestserver: properly log the number of closed connections Previously there was off-by-one error, which resulted in logging len(conns-1) connections instead of len(conns) Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4922	2023-11-14 21:53:24 +01:00
Nikolay	3121d76bee	lib/querytracer: makes package concurrent safe to use (#5322 ) * lib/querytracer: makes package concurrent safe to use it must fix various issues with concurrent code usage. Especially, when it's not reasonable to wait for all goroutines to be finished * wip --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-11-14 20:59:08 +01:00
Aliaksandr Valialkin	cb106bdf39	lib/logger: increase default -loggerMaxArgLen command-line flag value from 500 to 1000 The 500 chars limit for the maximum arg lengths during logging appeared to be too low for some cases	2023-11-14 19:52:27 +01:00
Aliaksandr Valialkin	f9bd265249	lib/ingestserver: typo fix after `f7834767c1` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4922	2023-11-14 03:26:26 +01:00
Zakhar Bessarab	37997abd14	vmcluster: re-routing enhancement (#5293 ) * app/vmstorage: close vminsert connections gradually before stopping storage Implements graceful shutdown approach suggested here - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4922#issuecomment-1768146878 Test results for this can be found here - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4922#issuecomment-1790640274 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * app/vmstorage: update graceful shutdown logic - close connections from vminsert in deterministic order - update flag description - lower default timeout to 25 seconds. 25 seconds value was chosen because the lowest default value used in default configuration deployments is 30s (default value in Kubernetes and ansible-playbooks). Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs/cluster: add information about re-routing enhancement during restart Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs/changelog: add entry for new command-line flag Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * {app/vmstorage,lib/ingestserver}: address review feedback Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs/cluster: add note to update workload scheduler timeout Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * wip --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-11-14 01:03:44 +01:00
Aliaksandr Valialkin	cef7a39ba3	lib/logstorage: always check the previous indexBlockHeader for blocks with matching tenantID and/or streamID The previous indexBlockHeader may contain blocks for the matching tenantID and/or streamID, so it must be scanned unconditionally during the search. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5295 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4856 This is a follow-up for `89dcbc2fe7`	2023-11-13 23:13:53 +01:00
XLONG96	89dcbc2fe7	lib/logstorage: fix streamID and tenantID search (#4856 ) (#5295 )	2023-11-13 23:09:39 +01:00
Aliaksandr Valialkin	0feaeca3c1	lib/protoparser/promremotewrite: fall back to Snappy decoding if zstd decoding fails This case is possible after the following steps: 1. vmagent tries to perform handshake with the -remoteWrite.url in order to determine whether the remote storage supports zstd-compressed data. 2. The remote storage is unavailable during the handshake. In this case vmagent falls back to Snappy compression for the data sent to the remote storage. 3. vmagent compresses the collected data into blocks with Snappy and puts these blocks to persistent queue on disk. 4. The remote storage becomes available. 5. vmagent restarts, performs the handshake with the remote storage and detects that it supports zstd-compressed data. 6. vmagent starts sending Snappy-compressed data from persistent queue to the remote storage, while falsely advertizing it sends zstd-compressed data. 7. The remote storage receives Snappy-compressed data and fails unpacking it with zstd. The solution is to just fall back to Snappy decompression if zstd decompression fails. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5301	2023-11-13 21:19:08 +01:00
Aliaksandr Valialkin	8af56ea2ed	lib/htmlcomponents: use relative links for the top page and for favicon.ico This allows hiding VictoriaMetrics components behind proxies with arbitrary path prefixes. For example, vmagent HTTP handlers can be served via /vmagent/ path prefix: - http://proxy/vmagent/targets - http://proxy/vmagent/service-discovery The path prefix can be arbitrary. For example, below are vmagent urls for /tenantID/vmagent/ path prefix: - http://proxy/tenantID/vmagent/targets - http://proxy/tenantID/vmagent/service-discovery While at it, consistently serve favicon.ico from any path directory. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5306 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5307	2023-11-13 20:29:05 +01:00
Aliaksandr Valialkin	cf23dc6480	all: cleanup: remove `// +build ...` lines, since they are no longer needed after Go1.17, and the minimum supported Go version for VictoriaMetrics source code is Go1.20	2023-11-13 19:12:51 +01:00
Aliaksandr Valialkin	3e93fa61ad	lib/regexutil: properly handle alternate regexps surrounded by .+ or .* Previously the following regexps were improperly handled: .+foo\|bar.+ .foo\|bar. This could lead to unexpected regexp match results. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5297 Thanks to @Haleygo for the initial attempt to fix the issue at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5308	2023-11-13 18:23:38 +01:00
Aliaksandr Valialkin	6340911d38	lib/stringsutil: add tests for LimitStringLen() function	2023-11-13 10:32:33 +01:00
Dmytro Kozlov	4722b70c89	lib/stringsutil: fix failing test (#5313 ) We have a failing test on the master branch. ``` --- FAIL: TestFormatLogMessage (0.00s) logger_test.go:24: unexpected result; got "foo: abcde, \"foo bar baz\", xx" want "foo: a..e, \"f..z\", xx" ``` It failed because `LimitStringLen` always returns the input string unchanged when maxLen <= 4, while the test limits maxLen to 4: ``` f("foo: %s, %q, %s", []interface{}{"abcde", fmt.Errorf("foo bar baz"), "xx"}, 4, `foo: a..e, "f..z", xx`) ```	2023-11-13 09:51:49 +01:00
Aliaksandr Valialkin	230230cf0b	lib/logger: add `-loggerMaxArgLen` command-line flag for fine-tuning the maximum length of logged args	2023-11-11 12:30:08 +01:00
Aliaksandr Valialkin	010dc15d16	lib/blockcache: do not cache entries, which were attempted to be accessed 1 or 2 times Previously entries which were accessed only 1 time weren't cached. It turned out that some rarely executed heavy queries may read an indexdb block twice in a row instead of once. There is no need to cache such a block then. This change should eliminate cache size spikes for indexdb/dataBlocks when such heavy queries are executed. Expose -blockcache.missesBeforeCaching command-line flag, which can be used for fine-tuning the number of cache misses needed before storing the block in the cache.	2023-11-10 22:28:03 +01:00
Aliaksandr Valialkin	d407d13e7b	Makefile: update golangci-lint version from v1.54.2 to v1.55.1 See https://github.com/golangci/golangci-lint/releases/tag/v1.55.1	2023-11-10 20:23:48 +01:00
Aliaksandr Valialkin	815fda8995	docs: update -help output after recent changes to VictoriaMetrics components	2023-11-02 20:27:10 +01:00
Aliaksandr Valialkin	65db6609eb	docs/CHANGELOG.md: update the description of the optimization for SLO/SLI-like queries according to latest changes See commits `4497a08e3d` and `92826b0b4a`	2023-11-02 20:05:05 +01:00
Aliaksandr Valialkin	714af89b13	lib/httpserver: follow-up for `0638bbe69c` - Replace spaces with underscores in the `reason` label value for the vm_http_request_errors_total metric in order be consistent with Prometheus-like naming - Clarify the description for the change at docs/CHANGELOG.md Updates https://github.com/victoriaMetrics/victoriaMetrics/issues/4590 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5166	2023-10-31 18:52:39 +01:00
Aliaksandr Valialkin	98699f203b	lib/persistentqueue: properly re-create flock.lock file inside directory if persistent queue is broken. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5249 Thanks to @Sniper91 for the bugreport and initial fix at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5233	2023-10-31 18:38:32 +01:00
Aliaksandr Valialkin	efb6ac27c2	lib/httpserver: call Request.Header() only once instead of calling it each time a new request header is set This is a follow-up for `ad839aa492`	2023-10-31 18:38:32 +01:00
Aliaksandr Valialkin	7ac49162c6	lib/storage: follow-up for `29cebd82fb` Use atomic.CompareAndSwapUint32() instead of atomic.LoadUint32() followed by atomic.StoreUint32(). This makes the code more clear. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5159	2023-10-31 16:08:54 +01:00
venkatbvc	0638bbe69c	vmauth: add counter metrics for auth successes and failures (#5166 ) New labels `reason="wrong basic auth creds"` and `reason="wrong auth key"` were added to metric `vm_http_request_errors_total` to help identify auth errors. https://github.com/victoriaMetrics/victoriaMetrics/issues/4590 Co-authored-by: Rao, B V Chalapathi <b_v_chalapathi.rao@nokia.com> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2023-10-31 12:48:02 +01:00
Dima Lazerka	ad839aa492	lib/httpserver: add flags to specify HSTS / Frame-Options / CSP headers for httpserver (#5111 ) support `Strict-Transport-Security`, `Content-Security-Policy` and `X-Frame-Options` HTTP headers in all VictoriaMetrics components. The values for headers can be specified by users via the following flags: `-http.header.hsts`, `-http.header.csp` and `-http.header.frameOptions`. Co-authored-by: hagen1778 <roman@victoriametrics.com>	2023-10-30 11:33:38 +01:00
Roman Khavronenko	29cebd82fb	lib/storage: log warning about RO mode only on state change (#5191 ) Before, vmstorage would log the same message each second producing excessive amount of logs. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5159 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-30 10:52:57 +01:00
Aliaksandr Valialkin	613b545dfd	lib/promscrape/discovery/kubernetes: propagate possible errors at newAPIWatcher() to the caller This allows substituting FATAL panics with recoverable runtime errors such as missing or invalid TLS CA file and/or missing/invalid /var/run/secrets/kubernetes.io/serviceaccount/namespace file. Now these errors are logged instead of PANIC'ing, so they can be fixed by updating the corresponding files without the need to restart vmagent. This is a follow-up for `90427abc65` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5243	2023-10-27 20:24:46 +02:00
Hui Wang	90427abc65	lib/promscrape/discovery/kubernetes: avoid possible panic if given caFile under kubernetes.SDConfig.HTTPClientConfig does not exist (#5243 ) follow up `d5a599badc`	2023-10-27 20:20:22 +02:00
Aliaksandr Valialkin	632d788b63	lib/promscrape/discovery/kubernetes: stop all the url watchers, which belong to a particular groupWatcher, at once Previously url watchers for pod, service and node objects could be mistakenly closed when service discovery was set up only for endpoints and endpointslice roles, since watchers for these roles may start pod, service and node url watchers with nil apiWatcher passed to groupWatcher.startWatchersForRole(). Now all the url watchers, which belong to a particular groupWatcher, are stopped at once when this groupWatcher has no apiWatcher subscribers. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5216 The issue has been introduced in v1.93.5 when addressing https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4850	2023-10-27 13:51:35 +02:00
Hui Wang	7c90ce39cb	do not print redundant error logs when failed to scrape consul or no… (#5239 ) * do not print redundant error logs when failed to scrape consul or nomad target prometheus performs the same because it uses consul lib which just drops the error(`1806bcb38c/api/api.go (L1134)`)	2023-10-27 13:31:55 +08:00
Aliaksandr Valialkin	cdbc06a639	lib/promscrape: do not add a suggestion for enabling TCP6 in error message when the dial address is TCPv4	2023-10-25 17:57:56 -07:00
Dima Lazerka	8b41b506c2	Revert "lib/promscrape: do not add a suggestion for enabling TCP6 in error message when the dial address is TCPv4" It broke CI (lint) This reverts commit `5464376d16`.	2023-10-25 16:24:31 -07:00
Aliaksandr Valialkin	5464376d16	lib/promscrape: do not add a suggestion for enabling TCP6 in error message when the dial address is TCPv4	2023-10-26 00:29:51 +02:00
Aliaksandr Valialkin	ac933cc423	lib/promscrape: properly track the number of updated service discovery routines inside Config.mustRestart() This is a follow-up for `d5a599badc`	2023-10-26 00:06:29 +02:00
Aliaksandr Valialkin	612dcf231a	lib/promauth: typo fix in the error message after `d5a599badc`: obtaine -> obtain	2023-10-25 23:38:00 +02:00
Aliaksandr Valialkin	d5a599badc	lib/promauth: follow-up for `e16d3f5639` - Make sure that invalid/missing TLS CA file or TLS client certificate files at vmagent startup don't prevent from processing the corresponding scrape targets after the file becomes correct, without the need to restart vmagent. Previously scrape targets with invalid TLS CA file or TLS client certificate files were permanently dropped after the first attempt to initialize them, and they didn't appear until the next vmagent reload or the next change in other places of the loaded scrape configs. - Make sure that TLS CA is properly re-loaded from file after it changes without the need to restart vmagent. Previously the old TLS CA was used until vmagent restart. - Properly handle errors during http request creation for the second attempt to send data to remote system at vmagent and vmalert. Previously failed request creation could result in nil pointer dereferencing, since the returned request is nil on error. - Add more context to the logged error during AWS sigv4 request signing before sending the data to -remoteWrite.url at vmagent. Previously it could miss details on the source of the request. - Do not create a new HTTP client per second when generating OAuth2 token needed to put in Authorization header of every http request issued by vmagent during service discovery or target scraping. Re-use the HTTP client instead until the corresponding scrape config changes. - Cache error at lib/promauth.Config.GetAuthHeader() in the same way as the auth header is cached, e.g. the error is cached for a second now. This should reduce load on CPU and OAuth2 server when auth header cannot be obtained because of temporary error. - Share tls.Config.GetClientCertificate function among multiple scrape targets with the same tls_config. Cache the loaded certificate and the error for one second. This should significantly reduce CPU load when scraping big number of targets with the same tls_config. 
- Allow loading TLS certificates from HTTP and HTTPs urls by specifying these urls at `tls_config->cert_file` and `tls_config->key_file`. - Improve test coverage at lib/promauth - Skip unreachable or invalid files specified at `scrape_config_files` during vmagent startup, since these files may become valid later. Previously vmagent was exiting in this case. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4959	2023-10-25 23:19:37 +02:00
Aliaksandr Valialkin	c22e3e7b1d	lib/promscrape/discovery/kubernetes/kubeconfig_test.go: make TestParseKubeConfigSuccess test code easier to follow	2023-10-25 23:17:18 +02:00
Aliaksandr Valialkin	eed5206376	lib/promauth: properly parse string contents for ca, cert and key fields at tls_config Previously yaml parser wasn't accepting string values for these fields, because it was mistakenly expecting a list of uint8 values instead.	2023-10-25 23:12:21 +02:00
Aliaksandr Valialkin	4afcb2a689	lib/promscrape: move duplicate code from functions, which collect ScrapeWork lists for distinct SD types into Config.getScrapeWorkGeneric() This removes more than 200 lines of duplicate code	2023-10-25 23:03:40 +02:00
Aliaksandr Valialkin	42dd71bb63	all: consistently use %w instead of %s when error is passed to fmt.Errorf() This allows consistently using errors.Is() for verifying whether the given error wraps some other known error.	2023-10-25 21:24:03 +02:00
Aliaksandr Valialkin	305c96e384	lib/workingsetcache: fix outdated comments for Load() and New() functions	2023-10-25 21:04:20 +02:00
Alexander Marshalov	33484d3365	lib/streamaggr: respect `streamAgg.dropInput` with empty stream aggr config (#5213 ) https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5207	2023-10-20 15:55:58 +02:00
hagen1778	fd2d07ba33	lib/storage: follow-up after `188cfe3a85` `188cfe3a85` See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5159 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-17 15:45:14 +02:00
Ilya Trefilov	188cfe3a85	lib/storage: do not create tsid if metric contains stale marker(#5069 ) (#5174 ) https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5069	2023-10-17 15:30:58 +02:00
Hui Wang	e16d3f5639	fix inconsistent behaviors with prometheus when scraping (#5153 ) * fix inconsistent behaviors with prometheus when scraping 1. address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4959. skip job with wrong syntax in `scrape_configs` with error logs instead of exiting; 2. show error messages on vmagent /targets ui if there are wrong auth configs in `scrape_configs`, previously will print error logs and do scrape without auth header; 3. don't send requests if there are wrong auth configs in: 1. vmagent remoteWrite; 2. vmalert datasource/remoteRead/remoteWrite/notifier. * add changelogs * address review comments * fix ut	2023-10-17 17:58:19 +08:00
Aliaksandr Valialkin	b8c267075e	lib/promscrape: add a link to https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets in descriptions for -promscrape.cluster.* command-line flags This should help users figuring out the purpose of -promscrape.cluster.* command-line flags	2023-10-16 14:46:22 +02:00
Aliaksandr Valialkin	fc98b62760	lib/promutils, app/vmalert-tool/unittest: move promutils.Duration.ParseTime() to app/vmalert-tool/unittest.durationToTime() The ParseTime() function looks strange, since it converts relative duration to absolute time since Unix Epoch. In most scenarios such a conversion is used by mistake. It is better not to expose such a function for public use and to hide it inside the package where it is needed, e.g. inside app/vmalert-tool/unittest. This is a follow-up for `dc28196237` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2945 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4789	2023-10-16 14:19:31 +02:00
Alexander Marshalov	b248413a07	fixed error when creating a full backup using the `-origin` flag (#5180 ) * fixed error when creating a full backup using the `-origin` flag (#5144) * Update docs/CHANGELOG.md --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-10-16 12:02:51 +02:00
Haleygo	8b6ccad41d	fix ingesting stale point, follow up `fe8cc573d1` (#5179 )	2023-10-16 09:05:37 +02:00
Aliaksandr Valialkin	2c334ed953	app/{vmagent,vminsert}: follow-up for NewRelic data ingestion protocol support This is a follow-up for `f60c08a7bd` Changes: - Make sure all the urls related to NewRelic protocol start from /newrelic . Previously some urls were started from /api/v1/newrelic - Remove /api/v1 part from NewRelic urls, since it makes no sense - Remove automatic transformation from CamelCase to snake_case for NewRelic labels and metric names, since it may complicate the transition from NewRelic to VictoriaMetrics. Preserve all the metric names and label names, so users could query metrics and labels by the same names which are used in NewRelic. The automatic transformation from CamelCase to snake_case can be added later as a special action for relabeling rules if needed. - Properly update per-tenant data ingestion stats at app/vmagent/newrelic/request_handler.go . Previously it was always zero. - Fix NewRelic urls in vmagent when multitenant data ingestion is enabled. Previously they were mistakenly started from `/`. - Document NewRelic data ingestion url at docs/Cluster-VictoriaMetrics.md - Remove superfluous memory allocations at lib/protoparser/newrelic - Improve tests at lib/protoparser/newrelic/* Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3520 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4712	2023-10-16 00:25:25 +02:00
hagen1778	fe8cc573d1	docs: remove extra `/` in the end of the link Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-14 07:43:40 +02:00
Haleygo	dc28196237	vmalert-tool: implement unittest (#4789 ) 1. split package rule under /app/vmalert, expose needed objects 2. add vmalert-tool with unittest subcmd https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2945	2023-10-13 13:54:33 +02:00
Zakhar Bessarab	2fc7e9f47e	lib/backup: add `-deleteAllObjectVersions` command-line flag (#5147 ) New flag enforces removal of all versions of the object in remote object storage. See: - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5121 - https://docs.victoriametrics.com/vmbackup.html#permanent-deletion-of-objects-in-s3-compatible-storages	2023-10-10 14:13:23 +02:00
Dmytro Kozlov	f60c08a7bd	app/(vminsert|vmagent): add support for new relic infrastructure agent (#4712 ) Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2023-10-05 14:39:51 +02:00
Aliaksandr Valialkin	75dd7b30ba	lib/filestream: add `-filestream.disableFadvise` command-line flag for unconditional disabling of `fadvise` syscall This may be needed in rare cases when performing backups on systems with big number of CPU cores and big value passed to -concurrency command-line flag. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5120	2023-10-04 16:19:46 +02:00
Zakhar Bessarab	b296c8e95a	lib/logstorage: fix free space check (#5113 ) Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-10-03 12:39:41 +02:00
Roman Khavronenko	a4bd73ec7e	lib/promscrape: make concurrency control optional (#5073 ) * lib/promscrape: make concurrency control optional Before, `-maxConcurrentInserts` was limiting all calls to `promscrape.Parse` function: during ingestion and scraping. This behavior is incorrect. Cmd-line flag `-maxConcurrentInserts` should have effect only on ingestion. Since both pipelines use the same `promscrape.Parse` function, we extend it to make concurrency limiter optional. So caller can decide whether concurrency should be limited or not. This commit makes `c53b5788b4` obsolete. Signed-off-by: hagen1778 <roman@victoriametrics.com> * Revert "dashboards: move `Concurrent inserts` panel to Troubleshooting section" This reverts commit `c53b5788b4`. --------- Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-02 21:32:11 +02:00
Aliaksandr Valialkin	859977d591	Revert "lib/promscrape: add metric `vm_promscrape_scrapes_skipped_total` (#5074 )" This reverts commit `74301cdbf5`. Reason for revert: vmagent already provides better approach for detecting slow scrape targets via the following query: scrape_duration_seconds / scrape_timeout_seconds > 1 This query depends on automatically generated per-target metrics. See https://docs.victoriametrics.com/vmagent.html#automatically-generated-metrics for more details. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5074	2023-10-02 20:59:56 +02:00
Aliaksandr Valialkin	8dce4eb189	lib/logstorage: follow-up for `94627113db` - Move uniqueFields from rows to blockStreamMerger struct. This allows localizing all the references to uniqueFields inside blockStreamMerger.mustWriteBlock(), which should improve readability and maintainability of the code. - Remove logging of the event when blocks cannot be merged because they contain more than maxColumnsPerBlock, since the provided logging didn't provide the solution for the issue with too many columns. I couldn't figure out the proper solution, which could be helpful for end user, so decided to remove the logging until we find the solution. This commit also contains the following additional changes: - It truncates field names longer than 128 chars during logs ingestion. This should prevent from ingesting bogus field names. This also should prevent from too big columnsHeader blocks, which could negatively affect search query performance, since columnsHeader is read on every scan of the corresponding data block. - It limits the maximum length of const column value to 256. Longer values are stored in ordinary columns. This helps limiting the size of columnsHeader blocks and improving search query performance by avoiding reading too long const columns on every scan of the corresponding data block. - It deduplicates columns with identical names during data ingestion and background merging. Previously it was possible to pass columns with duplicate names to block.mustInitFromRows(), and they were stored as is in the block. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4762 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4969	2023-10-02 19:19:08 +02:00
Roman Khavronenko	74301cdbf5	lib/promscrape: add metric `vm_promscrape_scrapes_skipped_total` (#5074 ) * lib/promscrape: add metric `vm_promscrape_scrapes_skipped_total` add metric `vm_promscrape_scrapes_skipped_total`to show whether vmagent skips the scrapes. This could happen if vmagent is overloaded or target is responding too slow for configured `scrape_interval`. The follow-up commit should add a corresponding alerting rule and panel to vmagent dashboard. Signed-off-by: hagen1778 <roman@victoriametrics.com> * deployment/docker: add `TooManyScrapeSkips` alerting rule for vmagent Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards: add panels `Scrape duration 0.99 quantile` and `Skipped scrapes` to vmagent dashboard Signed-off-by: hagen1778 <roman@victoriametrics.com> --------- Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-02 17:12:12 +02:00
Aliaksandr Valialkin	7b33a27874	lib/logstorage: follow-up for `8a23d08c21` - Compare the actual free disk space to the value provided via -storage.minFreeDiskSpaceBytes directly inside the Storage.IsReadOnly(). This should work fast in most cases. This simplifies the logic at lib/storage. - Do not take into account -storage.minFreeDiskSpaceBytes during background merges, since it results in uncontrolled growth of small parts when the free disk space approaches -storage.minFreeDiskSpaceBytes. The background merge logic uses another mechanism for determining whether there is enough disk space for the merge - it reserves the needed disk space before the merge and releases it after the merge. This prevents from out of disk space errors during background merge. - Properly handle corner cases for flushing in-memory data to disk when the storage enters read-only mode. This is better than losing the in-memory data. - Return back Storage.MustAddRows() instead of Storage.AddRows(), since the only case when AddRows() can return error is when the storage is in read-only mode. This case must be handled by the caller by calling Storage.IsReadOnly() before adding rows to the storage. This simplifies the code a bit, since the caller of Storage.MustAddRows() shouldn't handle errors returned by Storage.AddRows(). - Properly store parsed logs to Storage if parts of the request contain invalid log lines. Previously the parsed logs could be lost in this case. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4737 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4945	2023-10-02 16:52:23 +02:00
Aliaksandr Valialkin	10d9214980	lib/logstorage: run up to GOMAXPROCS flushers of old in-memory parts to disk One flusher isn't enough under high data ingestion rate. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4775	2023-10-02 16:20:59 +02:00
Aliaksandr Valialkin	da9ef90277	lib/logstorage: assist merging in-memory parts at data ingestion path if their number starts exceeding maxInmemoryPartsPerPartition This is a follow-up for `9310e9f584` , which removed data ingestion pacing. This can result in uncontrolled growth of in-memory parts under high data ingestion rate, which, in turn, can result in unbounded RAM usage, OOM crashes and slow query performance. While at it, consistently reset isInMerge field for parts passed to mergeParts() before returning from this function. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4775 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4828	2023-10-02 08:24:58 +02:00
Aliaksandr Valialkin	d41841c0c9	lib/{mergeset,storage}: consistently reset isInMerge field in parts passed to mergeParts() before returning from the function While at it consistently check that the isInMerge field is set in all the parts passed to mergeParts()	2023-10-02 08:05:29 +02:00
Aliaksandr Valialkin	3ca6fea858	lib/{mergeset,storage}: perform at most one assisted merge per each call to addRows/addItems This should reduce tail latency during data ingestion. This shouldn't slow down data ingestion in the worst case, since assisted merges are spread among distinct addRows/addItems calls after this change.	2023-10-01 22:19:46 +02:00
Zakhar Bessarab	94627113db	lib/logstorage: prevent from panic during background merge (#4969 ) * lib/logstorage: prevent from panic during background merge Fixes panic during background merge when resulting block would contain more columns than maxColumnsPerBlock. Buffered data will be flushed and replaced by the next block. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4762 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/logstorage: clarify field description and comment Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-09-29 11:58:20 +02:00
Zakhar Bessarab	8a23d08c21	lib/logstorage: switch to read-only mode when running out of disk space (#4945 ) * lib/logstorage: switch to read-only mode when running out of disk space Added support of `--storage.minFreeDiskSpaceBytes` command-line flag to allow graceful handling of running out of disk space at `--storageDataPath`. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4737 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/logstorage: fix error handling logic during merge Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/logstorage: fix log level Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2023-09-29 11:55:38 +02:00
Zakhar Bessarab	9310e9f584	lib/logstorage/datadb: remove parts merge cond (#4828 ) It was added in order to limit number of goroutines performing assisted merges during ingestion. It turned out that blocking ingestion goroutines lowers ingestion performance and limits overall ingestion to around 40k items per second because of lock contention. Removing parts merge sync.Cond allows removing lock contention at write path and significantly improves write performance. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4775 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-09-29 11:50:14 +02:00
Aliaksandr Valialkin	223ef96198	lib/storage: remove unused atomicSetBool function after `717c53af27`	2023-09-25 17:37:24 +02:00
Aliaksandr Valialkin	15dfd94f3b	lib/storage: make it clear that the number of big merge workers always equals to 4 See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4915#issuecomment-1733922830	2023-09-25 17:15:45 +02:00
Aliaksandr Valialkin	717c53af27	lib/storage: stop exposing vm_merge_need_free_disk_space metric This metric confuses users and has no useful information. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/686#issuecomment-1733844128	2023-09-25 16:52:39 +02:00
Zakhar Bessarab	8d99c12a7d	lib/promscrape/discovery/kubernetes: suppress context.Cancelled error in logs (#5048 ) lib/promscrape/discovery/kubernetes: suppress context.Cancelled error in logs It is possible that context.Cancelled will appear after k8s watcher was closed due to reload (see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4850). Logging an error misinforms user and looks like vmagent discovery will stop working even though this does not affect discovery. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-09-22 13:01:33 +02:00
Aliaksandr Valialkin	3140ef7261	lib/storage: log fatal error inside searchMetricName() instead of propagating it to the caller This simplifies the code a bit at searchMetricName() and searchMetricNameWithCache() call sites This is a result of investigating https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4972	2023-09-22 11:41:06 +02:00
Zakhar Bessarab	760cdcec68	lib/backup: fix issue with inconsistent copying of appliedRetention.txt (#5027 ) * lib/backup: fix issue with inconsistent copying of appliedRetention.txt appliedRetention.txt can be modified in place, so it should be always copied just the same as parts.json Updates: https://github.com/victoriaMetrics/victoriaMetrics/issues/5005 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs: add changelog entry for appliedRetention.txt copying fix Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-09-21 11:25:19 +02:00
Zakhar Bessarab	bea3431ed1	lib/storage/partition: add check to ensure parts exist on disk (#5017 ) * lib/storage/partition: add check to ensure parts exist on disk If part exists in parts.json but is missing on disk there will be a misleading error similar to "unexpected number of substrings in the part name". This change forces verification of part existence and throws a correct error in case it is missing on disk. Such issue can be result of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5005 or disk corruption. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/partition: use filepath.Join instead of string concatenation Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/partition: add action points for error message Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * all: add a check for missing part in lib/mergeset and lib/logstorage --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-09-19 11:17:41 +02:00
Aliaksandr Valialkin	30a645cd82	lib/promscrape/discovery/kubernetes: follow-up after `03fece44e0` - Properly update vm_promscrape_discovery_kubernetes_url_watchers and vm_promscrape_discovery_kubernetes_group_watchers metrics after config changes - Properly stop goroutine responsible for recreating scrapeWorks after the corresponding urlWatcher is stopped - Log the event when urlWatcher is stopped in order to simplify debugging Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4850 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4861	2023-09-18 23:23:45 +02:00
Aliaksandr Valialkin	03fece44e0	lib/promscrape/discovery/kubernetes: wait for 10 seconds before checking whether the urlWatcher must be stopped This should prevent from excess urlWatcher churn on config reload, since it leads to removal of all the apiWatchers before creating new apiWatchers. So, every config reload would lead to stopping of all the previous urlWatchers and starting new urlWatchers. The new logic gives 10 seconds for config reload before stopping unused urlWatchers. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4850 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4861	2023-09-18 17:45:12 +02:00
Aliaksandr Valialkin	76af32d869	lib/promscrape/discovery/kubernetes: follow-up after `eeb862f3ff` - Move the bugfix description to the correct place in docs/CHANGELOG.md - Prevent from logging of 'context canceled' errors after the url watcher is stopped, since these errors are expected and may confuse users. - Remove unused urlWatcher.refCount field. - Remove unused urlWatcher.close() method. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4850	2023-09-18 17:06:39 +02:00
Aliaksandr Valialkin	4d01bc6d52	lib/backup: properly copy parts.json files inside indexdb directory additional to data directory This is a follow-up for `264ffe3fa1` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5005 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5006	2023-09-18 16:16:50 +02:00
Aliaksandr Valialkin	f93a7b8457	lib/backup/common: consistently use canonical path with / directory separators at Part.Path Previously Part.Path could contain `\` directory separators on Windows OS, which could result in incorrect filepaths generation when making backups at object storage. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4704 This is a follow-up for `f2df8ad480`	2023-09-18 16:15:34 +02:00
Zakhar Bessarab	eeb862f3ff	lib/promscrape/discovery/kubernetes: fix leaking api watcher (#4861 ) * lib/promscrape/discovery/kubernetes: fix leaking api watcher goroutine which was polling k8s API had no execution control. This led to leaking goroutines during config reload. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4850 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/promscrape/discovery/kubernetes: use reference counting for urlWatcher cleanup Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/promscrape/discovery/kubernetes: remove waitgroup sync for goroutines polling API server This is unnecessary since context is cancelled and new requests will not be sent. Also, using waitgroup will increase time required to perform reload which might result in missed scrapes. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/promscrape/discovery/kubernetes: clarify comment Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * Apply suggestions from code review * lib/promscrape/discovery/kubernetes: address review feedback Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2023-09-15 19:40:13 +02:00
faceair	b6ad581b45	lib/storage: remove ForceMergeAllParts internal loop (#4999 ) Signed-off-by: faceair <git@faceair.me>	2023-09-15 19:04:54 +02:00
Zakhar Bessarab	264ffe3fa1	lib/backup: force copying of parts.json (#5006 ) * lib/backup: force copying of parts.json Copying of parts.json is required because `part.key()` comparison can create same key value for files with different contents. This will result in inconsistent backup being created or restored. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5005 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/backup: ensure parts.json is only copied once Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2023-09-15 19:04:38 +02:00
Aliaksandr Valialkin	a09c680170	lib/storage: handle fatal errors inside indexSearch.getTSIDByMetricID() instead of returning them to the caller This simplifies the code a bit at caller side	2023-09-15 11:55:42 +02:00
Aliaksandr Valialkin	9de440c803	lib/logger: increase the maximum log arg size from 200 to 500 The 200 chars limit turned out to be too small for typical log messages emitted by VictoriaMetrics components This is a follow-up for `87fea7d8ac`	2023-09-07 16:11:08 +02:00
Aliaksandr Valialkin	87fea7d8ac	lib/logger: limit the maximum arg length, which can be emitted to log lines This should prevent from emitting too long lines when too long args are passed to logger.* functions. For example, too long MetricsQL queries or too long data samples.	2023-09-07 15:22:46 +02:00
Aliaksandr Valialkin	24d61bf193	lib/flagutil: add Duration.Milliseconds() convenience function after `0c7d46d637` This function is a faster replacement for Duration.Duration().Milliseconds() call	2023-09-03 10:56:44 +02:00
Dima Lazerka	0c7d46d637	flagutil: Make .Msecs private (#4906 ) * Introduce flagutil.Duration To avoid conversion bugs * Fix tests * Clarify documentation re. month=31 days * Add fasttime.UnixTime() to obtain time.Time The goal is to refactor out the last usage of `.Msecs`. * Use fasttime for time.Now() * wip - Remove fasttime.UnixTime(), since it doesn't improve code readability and maintainability - Run `make docs-sync` for syncing changes from README.md to docs/ folder - Make lib/flagutil.Duration.Msec private - Rename msecsPerMonth const to msecsPer31Days in order to be consistent with retention31Days --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-09-03 10:33:37 +02:00
Aliaksandr Valialkin	edee262ecc	Makefile: update golangci-lint from v1.51.2 to v1.54.2 See https://github.com/golangci/golangci-lint/releases/tag/v1.54.2	2023-09-01 10:16:42 +02:00
Dima Lazerka	e0e856d2e7	Add flagutil.Duration to avoid conversion bugs (#4835 ) * Introduce flagutil.Duration To avoid conversion bugs * Fix tests * Comment why not .Seconds()	2023-09-01 09:27:51 +02:00
Nikolay	00685b627f	lib/promscrape/k8s_sd: set resourceVersion to 0 by default for watch … (#4901 ) * lib/promscrape/k8s_sd: set resourceVersion to 0 by default for watch requests This should reduce load on Kubernetes etcd servers, since requests without resourceVersion force a cache sync between the Kubernetes API server and etcd. More info at https://kubernetes.io/docs/reference/using-api/api-concepts/#semantics-for-watch https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4855 * wip --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-08-30 16:03:41 +02:00
Aliaksandr Valialkin	1bba4c5118	lib/auth: add NewTokenPossibleMultitenant() for parsing auth token, which can be multitenant Disallow parsing multitenant token at auth.NewToken(). Use auth.NewTokenPossibleMultitenant() at vminsert only. All the other callers should call auth.NewToken(), since they do not support multitenant token. This is a follow-up for `f0c06b428e` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4910	2023-08-30 14:17:55 +02:00
Aliaksandr Valialkin	039b8667c4	lib/proxy: consistently use gopkg.in/yaml.v2 across all the code	2023-08-29 13:12:38 +02:00
Aliaksandr Valialkin	317a273c6d	lib/logstorage: eliminate data race when clearing s.ptwHot after deleting the corresponding partition The previous code could result in the following data race: 1. The s.ptwHot partition is marked to be deleted 2. ptw.decRef() is called on it 3. ptw.pt is set to nil 4. s.ptwHot.pt is accessed from concurrent goroutine, which leads to panic. The change clears s.ptwHot under s.partitionsLock in order to prevent from the data race. This is a follow-up for `8d50032dd6` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4895	2023-08-29 11:09:55 +02:00
crossoverJie	8d50032dd6	lib/logstorage: Set ptwHot to nil when the partition pointed by ptwHot is dropped (#4902 )	2023-08-29 11:01:19 +02:00
hagen1778	5d848363f0	lib/promscrape: follow-up after `eabcfc9bcd` `-promscrape.cluster.membersCount` by default should be `1`, like every single vmagent is a cluster of one member on its own. The change additionally validates that user can't set `-promscrape.cluster.membersCount` to value lower than `1`. `eabcfc9bcd` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-08-29 10:04:57 +02:00
Haleygo	eabcfc9bcd	fix clusterMembersCount check (#4900 )	2023-08-29 15:58:24 +08:00
crossoverJie	cde5029bce	lib/logstorage: add nil check for ptwHot.pt (#4896 )	2023-08-27 01:24:26 +02:00
Zakhar Bessarab	6e8611f301	lib/promscrape/client: sync timeout for HostClient and http.Client (#4889 ) Initially, stream parse mode was reading data from response and parsing it on the fly. This was causing longer delay to read the whole response and required increasing timeout value to allow data processing while reading. So that `908e35affd` increased timeout value to fix this. But after `74c00a8762` response in stream parse mode is saved into memory and then parsed eliminating necessity of having timeout value higher than for usual scrape. Updates: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4847 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-08-25 15:47:11 +02:00
Aliaksandr Valialkin	f1c2508243	lib/promscrape: add -promscrape.cluster.memberLabel command-line flag This flag allows specifying an additional label to add to all the scraped metrics. The flag must contain label name to add. The label value will be equal to -promscrape.cluster.memberNum. This functionality can help when there is a need to differentiate metrics scraped by distinct vmagent instances in the cluster according to https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4247 See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4247#issuecomment-1692279393	2023-08-24 22:03:54 +02:00
hagen1778	4ebe8bb1d5	app/vmagent: follow-up after `6788704152` https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4884 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-08-24 11:36:42 +02:00
Zakhar Bessarab	6788704152	lib/promscrape/client: make User-Agent consistent between fasthttp and native client (#4886 ) User agent was not set for native client which resulted in using one provided by Golang. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4884 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-08-24 11:31:13 +02:00
Nikolay	c5aac34b68	lib/storage: properly calculate nextRotationTimestamp (#4874 ) Because of a typo, unix millis was used instead of unix for current timestamp calculation https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4873	2023-08-23 13:22:53 +02:00
Dmytro Kozlov	b7d07e5acf	lib/protoparser: handle unexpected EOF error when parsing lines in prometheus exposition format (#4851 ) Previously only io.EOF was handled, and io.ErrUnexpectedEOF was ignored, but it may happen if the client interrupts the connection. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4817	2023-08-18 08:55:42 +02:00
Aliaksandr Valialkin	cd9f86afe1	lib/envflag: do not allow unsupported form for boolean command-line flags in the form `-boolFlag value` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4845	2023-08-17 13:26:53 +02:00
Aliaksandr Valialkin	fdae53a75b	lib/promrelabel: properly replace `:` char with `_` in metric names when -usePromCompatibleNaming command-line flag is set This addresses https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3113#issuecomment-1275077071 comment from @johnseekins	2023-08-14 16:14:42 +02:00
Aliaksandr Valialkin	63e3571e8c	lib/promrelabel: stop emitting DEBUG log lines when parsing `if` expressions These lines were accidentally left in the commit `62651570bb` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4635	2023-08-14 15:24:31 +02:00
Aliaksandr Valialkin	ac6c40e896	all: refer to https://docs.victoriametrics.com/#resource-usage-limits in the error message about -search.max* limit Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4827	2023-08-14 01:57:34 -07:00
Aliaksandr Valialkin	a0f695f5de	app/vmbackup: add ability to make server-side copying of existing backups	2023-08-13 17:24:24 -07:00
Nikolay	d144e39592	lib/protoparser/opentelemetry: fixes panic (#4821 ) Opentelemetry format allows histograms with non-counter buckets. In this case it makes no sense to add buckets into database and save only counter with _count suffix. It could be used as gauge. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4814 Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-08-12 05:09:18 -07:00
Nikolay	f111ddb862	lib/promscrape: adds validation for proxy_url scheme (#4823 ) * lib/promscrape: adds validation for proxy_url scheme adds tests https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4811 * Update lib/proxy/proxy.go * Update lib/proxy/proxy.go --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-08-12 05:03:08 -07:00
Aliaksandr Valialkin	d7067c46d0	lib/flagutil: add defaultValue arg to NewArray{Int,Bytes,Duration} functions The defaultValue is printed in the flag description when passing -help to the app. This is a follow-up for `aef31f201a` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4776	2023-08-12 04:19:05 -07:00
Zakhar Bessarab	1bd7637fe1	lib/promrelabel: fix relabeling if clause (#4816 ) * lib/promrelabel: fix relabeling if clause being applied to labels outside of current context Relabeling is applied to each metric row separately, but in order to lower amount of memory allocations it is reusing labels. Functions which are working on current metric row labels are supposed to use only current metric labels by using provided offset, but if clause matcher was using the whole labels set instead of local metrics. This led to invalid relabeling results such as one described here: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4806 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs/CHANGELOG.md: document the bugfix Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1998 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4806 --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-08-11 06:37:48 -07:00
Aliaksandr Valialkin	be5c4818f5	lib/httpserver: properly quote the returned address from GetQuotedRemoteAddr() for requests with X-Forwarded-For header Make sure that the quoted address can be used as JSON string. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4676#issuecomment-1663203424 This is a follow up for `252643d100` and `ac0b7e0421` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4676	2023-08-11 05:19:50 -07:00
Zakhar Bessarab	f2df8ad480	vmbackupmanager: fixes for windows compatibility (#641 ) * app/vmbackupmanager/storage: fix path join for windows See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4704 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/backup: fixes for windows support - close dir before running os.RemoveAll. Windows FS does not allow to delete directory before all handles will be closed. - add path "normalization" for local FS to use the same format of paths for both *unix and Windows See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4704 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-08-11 02:56:11 -07:00
Aliaksandr Valialkin	f35d27aa2b	app/vlstorage: expose vl_data_size_bytes metric at /metrics page for tracking the on-disk data size (both indexdb and the data itself)	2023-07-31 07:56:53 -07:00
Aliaksandr Valialkin	d18ff993e6	lib/promscrape: add a comment why `honor_timestamps` is set to false by default This should prevent from returning it back to true in the future Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4697	2023-07-28 21:36:32 -07:00
Aliaksandr Valialkin	e3ef3df938	lib/promscrape: use local scrape timestamp for scraped metrics unless `honor_timestamps: true` is set explicitly This fixes the case with gaps for metrics collected from cadvisor, which exports invalid timestamps, which break staleness detection at VictoriaMetrics side. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4697 , https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4697#issuecomment-1654614799 and https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4697#issuecomment-1656540535 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1773	2023-07-28 21:11:26 -07:00
Aliaksandr Valialkin	9082a84566	lib/storage: update nextRotationTimestamp relative to the timestamp of the indexdb rotation Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4563	2023-07-28 19:48:10 -07:00
Roman Khavronenko	9f1b9b86cc	vmalert: revert unittest feature (#4734 ) * Revert "vmalert: unittest support stale datapoint (#4696)" This reverts commit `0b44df7ec8`. * Revert "docs: specify min version and limitations for vmalert's unit tests" This reverts commit `a24541bd` Signed-off-by: hagen1778 <roman@victoriametrics.com> * Revert "vmalert: init unit test (#4596)" This reverts commit `da60a68d` Signed-off-by: hagen1778 <roman@victoriametrics.com> * docs: mention unittest revert in changelog Signed-off-by: hagen1778 <roman@victoriametrics.com> --------- Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-07-28 10:42:02 +02:00
Aliaksandr Valialkin	3d73640815	lib/promscrape/discovery: close unused HTTP connections to service discovery servers This should prevent from connection leaks See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4724	2023-07-27 14:48:56 -07:00
Nikolay	46ecbbea26	lib/protoparser: adds opentelemetry parser (#2570 ) * lib/protoparser: adds opentelemetry parser app/{vmagent,vminsert}: adds opentelemetry ingestion path Adds ability to ingest data with opentelemetry protocol protobuf and json encoding is supported data converted into prometheus protobuf timeseries each data type has own converter and it may produce multiple timeseries from single datapoint (for summary and histogram). only cumulative aggregationFamily is supported for sum(prometheus counter) and histogram. Apply suggestions from code review Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> updates deps fixes tests wip wip wip wip lib/protoparser/opentelemetry: moves to vtprotobuf generator go mod vendor lib/protoparse/opentelemetry: reduce memory allocations * wip - Remove support for JSON parsing, since it is too fragile and is rarely used in practice. The most clients send OpenTelemetry metrics in protobuf. The JSON parser can be added in the future if needed. - Remove unused code from lib/protoparser/opentelemetry/pb and lib/protoparser/opentelemetry/proto - Do not re-use protobuf message between ParseStream() calls, since there is high chance of high fragmentation of the re-used message because of too complex nested structure of the message. * wip * wip * wip --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-07-27 13:26:45 -07:00
Alexander Marshalov	7e5555f9c7	fixed label values decoding for pushgateway compatibility (#4727 ) Fixed decoding of label values with slash for pushgateway and prometheus golang client compatibility + added some tests. (#4962)	2023-07-27 17:09:28 +02:00
Aliaksandr Valialkin	ad08d9c884	lib/promrelabel: return correct string representation for IfExpression containing a single selector This is a follow-up for `62651570bb` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4635	2023-07-24 19:33:14 -07:00
Aliaksandr Valialkin	62651570bb	lib/promrelabel: add support for a list of series selectors at IfExpression This makes possible specifying a list of series selectors at the following places: - Inside `if` option at relabeling rules - Inside `match` option at stream aggregation rules Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4635	2023-07-24 17:08:52 -07:00
Aliaksandr Valialkin	52c13e9515	lib/streamaggr: follow-up for `736197179e` - Use a byte slice instead of a map for tracking indexes for matching series. This improves performance, since access by slice index is faster than access by map key. - Re-use the byte slice for tracking indexes for matching series. This removes unnecessary memory allocations and improves stream aggregation performance a bit. - Add an ability to return to the previous behaviour by specifying -remoteWrite.streamAggr.dropInput command-line flag. In this case all the input samples are dropped when stream aggregation is enabled. - Backport the new stream aggregation behaviour from vmagent to single-node VictoriaMetrics when -streamAggr.config option is set. - Improve docs regarding this change at docs/CHANGELOG.md - Document the new behavior at docs/stream-aggregation.md Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4243 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4575	2023-07-24 17:05:26 -07:00
Zakhar Bessarab	736197179e	{lib/streamaggr,vmagent/remotewrite}: breaking change for keepInput flag (#4575 ) * {lib/streamaggr,vmagent/remotewrite}: breaking change for keepInput flag Changes default behaviour of keepInput flag to write series which did not match any aggregators to the remote write. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4243 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * Update app/vmagent/remotewrite/remotewrite.go Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-07-24 16:33:30 -07:00
Aliaksandr Valialkin	01a2859f43	lib/streamaggr: skip de-duplication for series, which do not match the configured aggregation rules Previously all the incoming samples were de-duplicated, even if their series doesn't match aggregation rule filters. This could result in increased CPU usage. Now the de-duplication isn't applied to samples for series, which do not match aggregation rule filters. Such samples are just ignored.	2023-07-22 16:42:34 -07:00
Nikolay	544fba6826	lib/storage: pre-create timeseries before indexDB rotation (#4652 ) * lib/storage: pre-create timeseries before indexDB rotation during an hour before indexDB rotation start creating records at the next indexDB it must improve performance during switch for the next indexDB and remove ingestion issues. Since there is no need for creation new index records for timeseries already ingested into current indexDB https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4563 * lib/storage: further work on indexdb rotation optimization - Document the change at docs/CHANGELOG.md - Move back various caches from indexDB to Storage. This makes the change less intrusive. The dateMetricIDCache now takes into account indexDB generation, so it stores (date, metricID) entries for both the current and the next indexDB. - Consolidate the code responsible for idbNext pre-filling into prefillNextIndexDB() function. This improves code readability and maintainability a bit. - Rewrite and simplify the code responsible for calculating the next retention timestamp. Add various tests for corner cases of this code. - Remove indexdb pre-filling from RegisterMetricNames() function, since this function is rarely called. It is OK to add indexdb entries on demand in this function. This simplifies the code. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 * docs/CHANGELOG.md: refer to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4563 --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-07-22 15:20:21 -07:00
Aliaksandr Valialkin	9763e2295b	lib/streamaggr: follow up for `70773f53d7` - Round staleness_interval durations to the upper number of seconds. This should prevent from under-calculations for fractional staleness intervals. - Rename stalenessInterval field at *AggrState structs into stalenessSecs, since it holds seconds. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4667	2023-07-20 21:44:24 -07:00
Aliaksandr Valialkin	4cae725edf	lib/encoding/zstd: switch back from atomic.Pointer to atomic.Value for map[...]... The map[...]... is already a pointer type, so atomic.Pointer[map[...]...] results in double pointer. This is a follow-up for `140e7b6b74`	2023-07-20 20:56:11 -07:00
Aliaksandr Valialkin	49bd2905fa	lib/promscrape: follow-up after `6aa50ca954` - Improve docs - Hide `debug relabeling` column when -promscrape.dropOriginalLabels command-line flag is set - Inline the code from the added template functions, since the code is harder to follow with the template functions, especially when these functions have misleading names. Also, these functions are used only in one place, e.g. they do not reduce the amounts of code. - Hide `click to show original labels` title at `labels` column when original labels aren't available. - Show the reason why original labels aren't available at /service-discovery page. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4597	2023-07-20 19:14:33 -07:00
Aliaksandr Valialkin	f548adce0b	app/vlinsert/loki: follow-up after `09df5b66fd` - Parse protobuf if Content-Type isn't set to `application/json` - this behavior is documented at https://grafana.com/docs/loki/latest/api/#push-log-entries-to-loki - Properly handle gzip'ped JSON requests. The `gzip` header must be read from `Content-Encoding` instead of `Content-Type` header - Properly flush all the parsed logs with the explicit call to vlstorage.MustAddRows() at the end of query handler - Check JSON field types more strictly. - Allow parsing Loki timestamp as floating-point number. Such a timestamp can be generated by some clients, which store timestamps in float64 instead of int64. - Optimize parsing of Loki labels in Prometheus text exposition format. - Simplify tests. - Remove lib/slicesutil, since there are no more users for it. - Update docs with missing info and fix various typos. For example, it should be enough to have `instance` and `job` labels as stream fields in most Loki setups. - Allow empty or missing timestamps in the ingested logs. The current timestamp at VictoriaLogs side is then used for the ingested logs. This simplifies debugging and testing of the provided HTTP-based data ingestion APIs. The remaining MAJOR issue, which needs to be addressed: victoria-logs binary size increased from 13MB to 22MB after adding support for Loki data ingestion protocol at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4482 . This is because of shitty protobuf dependencies. They must be replaced with another protobuf implementation similar to the one used at lib/prompb or lib/prompbmarshal .	2023-07-20 16:48:21 -07:00
Alexander Marshalov	70773f53d7	allow configuring staleness interval in stream aggregation (#4667 ) (#4670 ) --------- Signed-off-by: Alexander Marshalov <_@marshalov.org> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2023-07-20 16:07:33 +02:00
Haleygo	da60a68d09	vmalert: init unit test (#4596 ) vmalert: support unit tests See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2945 --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2023-07-20 15:07:10 +02:00
Dmytro Kozlov	6aa50ca954	app/vmagent: fix creating target id if `--promscrape.dropOriginalLabels` flag was used (#4616 ) * app/vmagent: fix creating target id if `--promscrape.dropOriginalLabels` flag was used * app/vmagent: hide links if OriginalLabels was dropped * app/vmagent: update CHANGELOG.md and added information to the docs * app/vmagent: fix comments	2023-07-20 10:13:39 +02:00
Zakhar Bessarab	09df5b66fd	app/vlinsert: add support of loki push protocol (#4482 ) * app/vlinsert: add support of loki push protocol - implemented loki push protocol for both Protobuf and JSON formats - added examples in documentation - added example docker-compose Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * app/vlinsert: move protobuf metric into its own file Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * deployment/docker/victorialogs/promtail: update reference to docker image Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * deployment/docker/victorialogs/promtail: make volume name unique Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * app/vlinsert/loki: add license reference Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * deployment/docker/victorialogs/promtail: fix volume name Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs/VictoriaLogs/data-ingestion: add stream fields for loki JSON ingestion example Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * app/vlinsert/loki: move entities to places where those are used Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * app/vlinsert/loki: refactor to use common components - use CommonParameters from insertutils - stop ingestion after first error similar to elasticsearch and jsonline Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * app/vlinsert/loki: address review feedback - add missing logstorage.PutLogRows calls - refactor tenant ID parsing to use common function - reduce number of allocations for parsing by reusing logfields slices - add tests and benchmarks for requests processing funcs Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-07-20 10:10:55 +02:00
Aliaksandr Valialkin	140e7b6b74	all: replace atomic.Value with atomic.Pointer[T] This eliminates the need in .(*T) casting for results obtained from Load() Leave atomic.Value for map, since atomic.Pointer[map[...]...] makes double pointer to map, because map is already a pointer type.	2023-07-19 17:42:06 -07:00
Roman Khavronenko	c32a01c52e	docs: follow-up after `aec4b5db81` (#4638 ) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-07-19 10:10:51 +02:00
Aliaksandr Valialkin	163572ea97	lib/logstorage: `go fmt` after `a8000b74c5`	2023-07-18 16:04:51 -07:00
Aliaksandr Valialkin	a8000b74c5	lib/logstorage: properly encode `"offset"` search word just after _time filter	2023-07-18 16:00:06 -07:00
Aliaksandr Valialkin	ed00b03ecb	lib/logstorage: add ability to specify offset for the selected _time filter The following syntax is supported: _time:filter offset off For example: - _time:5m offset 1h - 5-minute duration one hour before the current time - _time:2023 offset 2w - 2023 year with the 2 weeks offset in the past	2023-07-17 19:07:42 -07:00
Aliaksandr Valialkin	118b093bdd	lib/logstorage: log the -retentionPeriod and -futureRetention values when the ingested log entry has timestamp outside the configured retention This should simplify debugging	2023-07-17 19:07:41 -07:00
Aliaksandr Valialkin	bdfb80668d	lib/logstorage: support for short form of _time:(now-duration, now] filter: _time:duration	2023-07-17 19:07:40 -07:00
Aliaksandr Valialkin	3bf58326e7	lib/logstorage: LogsQL: replace exact_prefix("...") with exact("...") This makes LogsQL queries more consistent with i("...") and i("...") syntax	2023-07-17 19:07:40 -07:00
Aliaksandr Valialkin	8815080030	app/vmselect/promql: add the ability to copy all the labels from `one` side of group_left()/group_right() operation This is performed by specifying `*` inside group_left()/group_right(). Also allow specifying prefix for the copied labels via `group_left(...) prefix "..."` and `group_right(...) prefix "..."` syntax. For example, the following query adds all the namespace-related labels to pod info, and prefixes all the copied label names with "ns_" prefix: kube_pod_info on(namespace) group_left(*) prefix "ns_" kube_namespace_labels This resolves the following StackOverflow questions: - https://stackoverflow.com/questions/76661818/how-to-add-namespace-labels-to-pod-labels-in-prometheus - https://stackoverflow.com/questions/76653997/how-can-i-make-a-new-copy-of-kube-namespace-labels-metric-with-a-different-name	2023-07-17 19:07:39 -07:00
Aliaksandr Valialkin	4cb024d8a3	all: add support for `or` filters in series selectors This commit adds ability to select series matching distinct filters via a single series selector. For example, the following selector selects series with either {env="prod",job="a"} or {env="dev",job="b"} labels: {env="prod",job="a" or env="dev",job="b"} The `or` filter is supported in all the VictoriaMetrics tools now. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3997 Uses https://github.com/VictoriaMetrics/metricsql/pull/14	2023-07-16 00:06:33 -07:00
Aliaksandr Valialkin	6685f6ce7c	lib/storage: move series registration in caches from createAllIndexesForMetricName into a separate function - putSeriesToCache This makes the code more clear and easier to read This is a follow-up for `7094fa38bc`	2023-07-13 23:13:23 -07:00
Aliaksandr Valialkin	0c49552849	lib/mergeset: skip common prefix in binarySearchKey() function This should improve performance a bit when the search is performed among items with long common prefix	2023-07-13 22:04:59 -07:00
Aliaksandr Valialkin	3dacdcb707	lib/storage: optimize BenchmarkIndexDBGetTSIDs() - Sort MetricName tags only once before the benchmark loop. - Obtain indexSearch per each benchmark loop in order to give a chance for background merge for the recently created parts	2023-07-13 21:56:53 -07:00
Aliaksandr Valialkin	443661a5da	lib/storage: properly free up resources from newTestStorage() by calling stopTestStorage()	2023-07-13 17:13:24 -07:00
Aliaksandr Valialkin	7094fa38bc	lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. 
Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do not change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . 
At the same time the change fixes the issue, which could result in lost access to time series, which stop receiving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for `1f28b46ae9`	2023-07-13 16:07:30 -07:00
Aliaksandr Valialkin	3b50b94f7a	lib/storage: fix possible test failure in TestStorageAddRowsConcurrent The number of parts in the snapshot partition may be zero if concurrent goroutine just started creating new partition, but didn't put data into it yet when the current goroutine made a snapshot.	2023-07-13 15:03:45 -07:00
Aliaksandr Valialkin	4ba19f6b32	lib/mergeset: simplify flushInmemoryParts() a bit	2023-07-13 12:33:30 -07:00
Aliaksandr Valialkin	a79e53d82a	lib/logstorage: fix TestValuesEncoder() on 32-bit architectures	2023-07-13 11:27:13 -07:00
Dmytro Kozlov	79c42814cf	lib/logstorage: fix panic (#4620 )	2023-07-13 09:53:41 +02:00
Zakhar Bessarab	51a9cc9783	docs: make `httpAuth.` flags description less ambiguous (#4588 ) docs: make `httpAuth.` flags description less ambiguous Currently, it may confuse users whether `httpAuth.` flags are used by HTTP client or server configuration(see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4586 for example). Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs: fix a typo Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-07-07 13:50:13 +02:00
Aliaksandr Valialkin	152ca00fb8	docs/CHANGELOG.md: clarify description for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4336 bugfix This is a follow-up for `5eb5df96e2`	2023-07-06 17:09:03 -07:00
Aliaksandr Valialkin	8a07621a0c	lib/promscrape: disable support for service discovery and metrics scrape via http2 Reasons for disabling http2: - http2 is used very rarely comparing to http for Prometheus metrics exposition and service discovery - http2 is much harder to debug than http - http2 has very bad security record because of its complexity - see https://portswigger.net/research/http2 VictoriaMetrics components are compiled with nethttpomithttp2 tag because of these issues. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4283 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4274 This is a follow-up for `72c3cd47eb`	2023-07-06 16:03:37 -07:00
Alexander Marshalov	af53c7cc78	fix removing storage data dir before restoring from backup (#598 ) * fix removing storage data dir before restoring from backup Signed-off-by: Alexander Marshalov <_@marshalov.org> * fix review comment Signed-off-by: Alexander Marshalov <_@marshalov.org> * fix review comment Signed-off-by: Alexander Marshalov <_@marshalov.org> * fixes after merge with `enterprise-single-node` branch Signed-off-by: Alexander Marshalov <_@marshalov.org> --------- Signed-off-by: Alexander Marshalov <_@marshalov.org>	2023-07-06 14:16:18 -07:00
Aliaksandr Valialkin	3286ca3318	lib/backup/actions: remove misleading comment about the default value for Concurrency field	2023-07-06 14:07:08 -07:00
Aliaksandr Valialkin	792860db10	lib/promscrape/discoveryutils: re-use checkRedirect function for both client and blockingClient Also document follow_redirects option at https://docs.victoriametrics.com/sd_configs.html#http-api-client-options This is a follow-up for `b3d0ff463a` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4282	2023-07-06 10:51:33 -07:00
Alexander Marshalov	fc67d94e86	vmbackupmanager bugfixes: (#577 ) - error on running with empty -dst dir and without -runOnStart - error on restoring with backup, created before v1.90.0	2023-07-05 22:07:15 -07:00
Aliaksandr Valialkin	3c5623ce7f	lib/logstorage: go fmt	2023-07-04 14:13:14 -07:00
Aliaksandr Valialkin	6d35d21f60	lib/logstorage: fix `make test-pure` tests	2023-07-04 13:14:30 -07:00
Aliaksandr Valialkin	d1dd25122a	lib/httputils: fix test after `b49d04b3dc` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4459	2023-07-04 09:40:12 -07:00
Haleygo	5fc0ee43d4	fix parse for invalid partial RFC3339 format (#4539 ) The validation was needed for covering corner cases when storage is tested with data from 1970. This resulted into unexpected search results, as year was parsed incorrectly from the given timestamp. Co-authored-by: hagen1778 <roman@victoriametrics.com>	2023-07-03 13:11:49 +02:00
Alexander Marshalov	1cc06e39cd	show backup progress percentage in vmbackup log during backup uploading and restoring progress percentage in vmrestore log during backup downloading (#4460 ) (#4530 ) Signed-off-by: Alexander Marshalov <_@marshalov.org>	2023-06-28 14:44:45 +02:00
Aliaksandr Valialkin	83aa78dfb4	app/vlstorage: export vl_active_merges and vl_merges_total metrics	2023-06-21 20:58:57 -07:00
Aliaksandr Valialkin	dde9ceed07	app/vlinsert/jsonline: code prettifying	2023-06-21 19:39:22 -07:00
Aliaksandr Valialkin	7346bb4f03	app/vlselect/logsql: sort query results by _time if their summary size doesn't exceed -select.maxSortBufferSize	2023-06-21 01:11:25 -07:00
Aliaksandr Valialkin	00c3dbd15d	app/victoria-logs: add ability to debug data ingestion by passing `debug` query arg to data ingestion API	2023-06-20 20:02:46 -07:00
Aliaksandr Valialkin	87b66db47d	app/victoria-logs: initial code release	2023-06-19 22:55:12 -07:00
Aliaksandr Valialkin	aeac39cfd1	lib/storage: do not create flock.lock files at partition directories, since it is created at the Storage level	2023-06-19 22:48:37 -07:00
Aliaksandr Valialkin	0f01eea4e9	lib/netutil: ignore artificial timeout generated by net/http.Server This prevents inflating the vm_tcplistener_read_timeouts_total counter	2023-06-19 22:46:40 -07:00
Aliaksandr Valialkin	298aab3f54	lib/mergeset: do not create flock.lock file at mergeset table, since it is created at the lib/storage.Storage level	2023-06-19 22:45:31 -07:00
Aliaksandr Valialkin	371182f299	lib/fs: add ReaderAt.Path() function This function is going to be used in VictoriaLogs	2023-06-19 22:42:27 -07:00
Aliaksandr Valialkin	497ec3f3e6	lib/encoding: add MarshalBool/UnmarshalBool and GetUint32s/PutUint32s functions These functions are going to be used by VictoriaLogs	2023-06-19 22:40:55 -07:00
Aliaksandr Valialkin	3409317a67	lib/cgroup: add SetGOGC() function This function is going to be used by VictoriaLogs	2023-06-19 22:39:00 -07:00
Aliaksandr Valialkin	c1bed35b39	lib/bytesutil: substitute parentheses with slashes in ByteBuffer.Path() output, so it can be passed to path manipulating functions This is needed for the upcoming VictoriaLogs	2023-06-19 22:37:26 -07:00
Aliaksandr Valialkin	78eaa056c0	app/vmselect: move common http functionality from app/vmselect/searchutils to lib/httputils While at it, move app/vmselect/bufferedwriter to lib/bufferedwriter, since it is going to be used in VictoriaLogs	2023-06-19 22:34:20 -07:00
Aliaksandr Valialkin	b49d04b3dc	lib/promutils.ParseTime(): add support for timestamps in milliseconds See https://stackoverflow.com/questions/76437098/how-to-handle-time-unit-and-step-while-ingesting-or-querying-in-victoriametrics/76438405 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4459	2023-06-19 22:25:04 -07:00
Nikolay	5eb5df96e2	lib/storage: creates parts.json on start-up if it does not exist. (#4450 ) * lib/storage: creates parts.json on start-up if it does not exist. It fixes migrations from versions below v1.90.0. Previously parts.json was created only after successful merge. But if merge was interrupted for some reason (OOM or shutdown), parts.json wasn't created and partitions left after interrupted merge weren't properly deleted, since VM cannot check if they must be removed or not. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4336 * Apply suggestions from code review Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> * Update lib/storage/partition.go Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> --------- Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2023-06-15 11:19:22 +02:00
Roman Khavronenko	f50f35a8e0	lib/storage: add comment for how `mustBeDeleted` field should be used (#4454 ) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-06-15 11:17:45 +02:00
Roman Khavronenko	f71cc99a8c	lib/mergeset: add comment for how `mustBeDeleted` field should be used (#4449 ) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-06-14 18:13:16 +02:00
Alexander Marshalov	40d12be607	fixed service name detection for consulagent service discovery in case of a difference in service name and service id (#4390 ) (#4439 ) Signed-off-by: Alexander Marshalov <_@marshalov.org>	2023-06-12 16:16:43 +02:00
Roman Khavronenko	dfe53a36fc	lib/promscrape/discoveryutils: properly check for net.ErrClosed (#4426 ) This error may be wrapped in another error, and should normally be tested using `errors.Is(err, net.ErrClosed)`. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-06-09 09:26:33 +02:00
Roman Khavronenko	3305a6901c	app/vmagent: mention `enable_http2` in changelog (#4403 ) Follow-up after `72c3cd47eb` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-06-05 16:31:58 +02:00
Haleygo	72c3cd47eb	vmagent:scrape config support enable_http2 (#4295 ) app/vmagent: support `enable_http2` in scrape config This change adds HTTP2 support for scrape config and improves compatibility with Prometheus config. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4283	2023-06-05 15:56:49 +02:00
Nikolay	f263031fe9	app/vmauth: properly handle LOCAL proxy protocol command (#4373 ) app/vmauth: properly handle LOCAL proxy protocol command It is required for handling health checks from load balancers https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3335	2023-05-31 15:37:59 +02:00
Haleygo	b3d0ff463a	vmagent:support follow_redirects on SD level (#4286 ) * vmagent:support follow_redirects on SD level * fix follow_redirects on sd level https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4282	2023-05-26 09:39:45 +02:00
Aliaksandr Valialkin	1f2f74e70e	lib/promrelabel: use monospace font at textarea for writing relabel configs on /metric-relabel-debug and /target-relabel-debug pages This simplifies visual inspection of indentation in yaml configs	2023-05-18 20:48:41 -07:00
Aliaksandr Valialkin	1f28b46ae9	lib/storage: revert the migration from global to per-day index for (MetricName -> TSID) This reverts the following commits: - `e0e16a2d36` - `2ce02a7fe6` The reason for revert: the updated logic breaks assumptions made when fixing https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 . For example, if a time series stop receiving new samples during the first day after the indexdb rotation, there are chances that the time series won't be registered in the new indexdb. This is OK until the next indexdb rotation, since the time series is registered in the previous indexdb, so it can be found during queries. But the time series will become invisible for search after the next indexdb rotation, while its data is still there. There is also incompletely solved issue with the increased CPU and disk IO resource usage just after the indexdb rotation. There was an attempt to fix it, but it didn't fix it in full, while introducing the issue mentioned above. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 TODO: to find out the solution, which simultaneously solves the following issues: - increased memory usage for setups high churn rate and long retention (e.g. what the reverted commit does) - increased CPU and disk IO usage during indexdb rotation ( https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 ) - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 Possible solution - to create the new indexdb in one hour before the indexdb rotation and to gradually pre-populate it with the needed index data during the last hour before indexdb rotation. Then the new indexdb will contain all the needed data just after the rotation, so it won't trigger increased CPU and disk IO.	2023-05-18 11:30:49 -07:00
Haleygo	1531d757ea	fix lint check	2023-05-17 13:51:36 +02:00
Aliaksandr Valialkin	e0e16a2d36	lib/storage: follow-up after `2ce02a7fe6` - Document the change at docs/CHANGELOG.md - Clarify comments for non-trivial code touched by the commit - Improve the logic behind maybeCreateIndexes(): - Correctly create per-day indexes if the indexdb rotation is performed during the first hour or the last hour of the day by UTC. Previously there was a possibility of missing index entries on that day. - Increase the duration for creating new indexes in the current indexdb for up to 22 hours after indexdb rotation. This should reduce the increased resource usage after indexdb rotation. It is safe to postpone index creation for the current day until the last hour of the current day after indexdb rotation by UTC, since the corresponding (date, ...) entries exist in the previous indexdb. - Search for TSID by (date, MetricName) in both the current and the previous indexdb. Previously the search was performed only in the current indexdb. This could lead to excess creation of per-day indexes for the current day just after indexdb rotation. - Search for (date, metricID) entries in both the current and the previous indexdb. Previously the search was performed only in the current indexdb. This could lead to excess creation of per-day indexes for the current day just after indexdb rotation.	2023-05-16 23:19:27 -07:00
Roman Khavronenko	2ce02a7fe6	lib/storage: introduce per-day MetricName=>TSID index (#4252 ) The new index substitutes global MetricName=>TSID index used for locating TSIDs on ingestion path. For installations with high ingestion and churn rate, global MetricName=>TSID index can grow enormously making index lookups too expensive. This also results into bigger than expected cache growth for indexdb blocks. New per-day index supposed to be much smaller and more efficient. This should improve ingestion speed and reliability during re-routings in cluster. The negative outcome could be occupied disk size, since per-day index is more expensive comparing to global index. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-05-16 15:46:42 -07:00
Aliaksandr Valialkin	278278af95	lib/storage: reduce the unimportant logging during Storage start / stop This should improve the visibility of potentially important logs	2023-05-16 15:14:21 -07:00
Aliaksandr Valialkin	d330c7e6fc	lib/mergeset: remove superfluous logging when opening and closing the Table The logged messages had little useful info, while they were polluting log output during VictoriaMetrics start/stop	2023-05-16 15:01:25 -07:00
Aliaksandr Valialkin	3cbc0975f6	lib/mergeset: close and open the table before making snapshots at TestTableCreateSnapshotAt() This gives guarantees that all the in-memory data is written to disk at the snapshot time. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4272 See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4316	2023-05-16 14:55:11 -07:00
Aliaksandr Valialkin	09b403d38a	lib/{mergeset,storage}: make it clear that DebugFlush() doesn't store all the recently ingested data to disk DebugFlush() makes sure that the recently ingested data becomes visible to search. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4272	2023-05-16 11:50:17 -07:00
Alexander Marshalov	3b2dc2b098	backup metadata are written in separate file (#560 ) Signed-off-by: Alexander Marshalov <_@marshalov.org>	2023-05-16 11:24:54 -07:00
Zakhar Bessarab	242050ba94	lib/storage: follow-up after `a50d63c376` (#4289 ) * lib/storage: follow-up after `a50d63c376` - ensure retentionMsecs is rounded to day - remove localTimeOffset in test as localOffset is ignored when using `UnixMilli` Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage: restore retention timezone offset effect on retention deadline Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-05-16 17:14:08 +02:00
Aliaksandr Valialkin	1c47acda11	lib/promutils: add ParseTimeAt() function	2023-05-13 20:12:31 -07:00
Aliaksandr Valialkin	616175b1ce	lib/promutils: properly return error when incorrect Prometheus label names are passed to NewLabelsFromString() Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4284 See also https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4304	2023-05-12 16:52:29 -07:00
Aliaksandr Valialkin	318a87c36f	Revert "lib/promrelabel: show error message if labels not in prometheus exposition format (#4304 )" This reverts commit `193a9c3328`. Reason for revert: the commit doesn't fix the real issue with promutils.NewLabelsFromString() function, which must return error when improperly formatted Prometheus metric with labels is passed to it. See https://github.com/prometheus/docs/blob/main/content/docs/instrumenting/exposition_formats.md#text-format-example E.g. the promutils.NewLabelsFromString() must return error when the following strings are passed to it: - `{foo:"bar"}`, since `:` is disallowed in Prometheus text exposition format. The correct value is `{foo="bar"}` - `{"foo":"bar"}`, since label name shouldn't be quoted. The correct value is `{foo="bar"}`. The reverted commit introduces another set of bugs, which happily accept the following invalid input: - `{foo=~"bar"}` - `{foo!="bar"}` - `{foo!~"bar"}` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4284 See also https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4304	2023-05-12 16:07:37 -07:00
Aliaksandr Valialkin	160453b86c	lib/protoparser/csvimport: properly parse the last empty column in CSV line Do not ignore the last empty column in CSV line. While at it, properly parse CSV columns in single quotes, e.g. `'foo,bar',baz` is parsed as two columns - `foo,bar` and `baz` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4048 See also https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4298	2023-05-12 15:51:41 -07:00
Aliaksandr Valialkin	b7fe7b801c	Revert "lib/protoparser: fix skip csv line when metric can be collect from the line (#4298 )" This reverts commit `410ae99c2e`. Reason for revert: the commit masks the real issue instead of fixing it. The real issue is that the scanner.NextColumn() skips the last column if it is empty. The commit also introduces two bugs: - a panic if all the metric values in CSV line are empty - silent import of CSV lines with too small number of columns Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4048 See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4298	2023-05-12 15:22:27 -07:00
Dmytro Kozlov	193a9c3328	lib/promrelabel: show error message if labels not in prometheus exposition format (#4304 ) lib/promrelabel: show error message if labels not in prometheus exposition format https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4284	2023-05-12 10:42:56 +02:00
Dmytro Kozlov	410ae99c2e	lib/protoparser: fix skip csv line when metric can be collect from the line (#4298 ) * lib/protoparser: fix skip csv line when metric can be collect from the line * lib/protoparser: fix comment	2023-05-12 11:04:16 +03:00
Alexander Marshalov	9855b38da2	fixed error with double slash in vmbackupmanager (#557 ) Signed-off-by: Alexander Marshalov <_@marshalov.org>	2023-05-11 13:38:07 -07:00
Aliaksandr Valialkin	73812c71a5	lib/promutils: properly parse time strings with timezones at ParseTime()	2023-05-11 13:24:00 -07:00
Aliaksandr Valialkin	da037cafc5	lib/bytesutil: `go fmt` after `2ec17bed2c`	2023-05-10 20:29:03 -07:00
Aliaksandr Valialkin	2ec17bed2c	lib/bytesutil: add benchmarks for ToUnsafeString() and ToUnsafeBytes()	2023-05-10 12:59:26 -07:00
Alexander Marshalov	2e494e2375	fixed typos in documentation and commandline flags descriptions (#4275 )	2023-05-10 09:50:41 +02:00
Aliaksandr Valialkin	b9bb64ce55	lib/promscrape/discovery/consulagent: substitute metaPrefix with the `__meta_consulagent_` plaintext string This simplifies future code navigation and search for the specific meta-label starting from __meta_consulagent_* prefix. For example, `grep __meta_consulagent_namespace` finds the exact place where this label is defined. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3953 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4217	2023-05-08 23:40:13 -07:00
Aliaksandr Valialkin	7db647e924	lib/fs: move common code outside arch-specific implementations of mustRemoveDirAtomic() This is a follow-up for `73b6c23271` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-05-08 23:10:20 -07:00
Aliaksandr Valialkin	887555669e	Revert "lib/streamaggr: discard samples with timestamps outside of aggregation interval (#4199 )" This reverts commit `9e99f2f5b3`. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4068 Reason for revert: this breaks valid use cases: - If timestamps aren't specified in the incoming samples on purpose. For example, if stream aggregation is used as StatsD replacement. StatsD protocol has no timestamp concept for incoming samples. See https://github.com/b/statsd_spec - If all the samples must be aggregated, even if they contain stale timestamps. For example, if the stream aggregation produces some counter of some events, it may be better to count all the events even if they were delayed before being ingested into VictoriaMetrics. It is also unclear how to determine whether the sample becomes stale. For example, if the aggregation interval equals to 1h, and the previous aggregation cycle just finished 10 minutes ago, what to do with the newly incoming sample with the timestamp 30 minutes older than the current time? The answer highly depends on the context, so it is unsafe to unconditionally use a single logic for dropping the old samples here.	2023-05-08 16:52:27 -07:00
Aliaksandr Valialkin	74155afb71	docs: clarify docs after `5ee344824f` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4183	2023-05-08 16:11:44 -07:00
Aliaksandr Valialkin	ec3943d14a	app/vmselect: small cleanup after `4f3f9950d0` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3807	2023-05-08 14:57:11 -07:00
Aliaksandr Valialkin	80946f06c2	app/{vmselect,vmctl}: move ParseTime() to lib/promutils Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4091 This is a follow-up for `e2053baf32`	2023-05-08 14:17:57 -07:00
Alexander Marshalov	8225a48b56	fixed `vm_promscrape_config_last_reload_successful` metric value recovery after successful reloading with unchanged content (#4260 ) (#4268 ) Signed-off-by: Alexander Marshalov <_@marshalov.org>	2023-05-08 13:32:51 +02:00
Nikolay	8f4de6fa47	lib/storage: properly update link for entry at dateMetricID cache (#4258 ) Previously, during sync for mutable and immutable cache parts, the link for hotEntry with the current date might not be properly updated. This corrupted the cache for backfilling metrics and increased CPU load.	2023-05-05 21:45:47 -07:00
Zakhar Bessarab	4e71003620	lib/promscrape/discovery/kubernetes: follow-up for `d5e94721db` (#4255 ) - add changelog reference to an author - fix tests - add metadata to match Prometheus behavior Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-05-05 14:41:17 +02:00
Vasilchenko Anton	22e65402af	Add endpoint labels for pod targets discovered from endpoint but has different ports (#4253 ) Signed-off-by: Vasilchenko Anton <vasilchenko-as@yandex.ru>	2023-05-05 15:46:07 +04:00
Zakhar Bessarab	aca256735c	lib/storage: fix indexdb rotation infinite loop (#4249 ) When using `retentionTimezoneOffset` and having local timezone being more than 4 hours different from UTC indexdb retention calculation could return negative value. This caused indexdb rotation to get in loop. Fix calculation of offset to use `retentionTimezoneOffset` value properly and add test to cover all legit timezone configs. See: - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4207 - https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4206 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2023-05-04 17:16:48 +02:00
Alexander Marshalov	56b84140a9	added new consulagent service discovery (#3953 ) (#4217 )	2023-05-04 11:36:21 +02:00
Alexander Marshalov	2eb27ddb22	max value for `memory.allowedPercent` changed from 200 to 100 (#4171 ) (#4251 ) Signed-off-by: Alexander Marshalov <_@marshalov.org>	2023-05-04 11:34:57 +02:00
justcompile	49b77ec01a	squash commits (#4166 )	2023-05-03 10:51:08 +02:00
Nikolay	4786f036de	lib/backup: fixes path generation for windows (#4133 ) replaces custom fsync function with standard Fsync methods for files. fixes pattern matching for parts and properly generate backup path for local fs. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-05-03 10:48:53 +02:00
Nikolay	73b6c23271	lib/fs: do not panic at windows at dir deletion (#4132 ) Windows doesn't allow to remove dir with opened files. Usually it's a case for snapshots, hard cannot be removed if file is opened. With this change, dir will be renamed and properly deleted at the next process start. It's recommended to restart vmstorage/vmsingle for snapshots deletion completion periodically. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-05-03 10:47:02 +02:00
Zakhar Bessarab	bf3b6732bd	lib/promscrape/discovery/kubernetes: add common labels to all ports discovered from endpoints (#4235 ) * lib/promscrape/discovery/kubernetes: add common labels to all ports discovered from endpoints Sets `__meta_kubernetes_endpoints_name` and `__meta_kubernetes_namespace` labels to all ports of pod. Prometheus sets those labels to all ports in pod (`0ab9553611/discovery/kubernetes/endpoints.go (L267C15-L269)`) even if port is not matching any service. See: #4154 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/promscrape/discovery/kubernetes: fix test for updated discovery logic Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-05-03 02:17:33 +02:00
Roman Khavronenko	eb746a4dab	Revert "http server: limit max concurrent requests (#4185 )" (#4215 ) This reverts commit `77f76371` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-04-27 13:02:47 +02:00
Zakhar Bessarab	9e99f2f5b3	lib/streamaggr: discard samples with timestamps outside of aggregation interval (#4199 ) * lib/streamaggr: discard samples with timestamps not matching aggregation interval Samples with timestamps lower than `now - aggregation_interval` are likely to be written via backfilling and should not be used for calculation of aggregation. See #4068 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/streamaggr: make log message more descriptive, fix imports Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-04-27 11:59:49 +02:00
Haleygo	03150c8973	lib/opentsdbhttp: fix a typo preventing from using writeconcurrencylimiter (#4208 )	2023-04-27 09:22:42 +02:00
Nikolay	5ee344824f	lib/promscrape: adds filter for consul_sd_configs: (#4184 ) * lib/promscrape: adds filter for consul_sd_configs: it allows advanced filtering for consul service discovery requests https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4183 * typo fix * removes deprecation mentions since it's not relevant * Update docs/CHANGELOG.md Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> --------- Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2023-04-26 19:16:27 +02:00
Dmytro Kozlov	bc17f4828c	app/vmagent,lib/persistentqueue: show warning message if `--remoteWrite.maxDiskUsagePerURL` flag lower than 500MB (#4196 ) * app/vmagent,lib/persistentqueue: show warning message if `--remoteWrite.maxDiskUsagePerURL` flag lower than 500MB * app/vmagent,lib/persistentqueue: linter fix * app/vmagent,lib/persistentqueue: fix comment	2023-04-26 13:23:01 +03:00
Yury Molodov	4f3f9950d0	vmui: add metric relabel debug (#3889 ) * feat: add metric relabel debug (#3807) * fix: add link to relabeling cookbook * lib/promrelabel: merge, fix conflicts * lib/promrelabel: fix diff * docs/vmui: add metric relabel playground --------- Co-authored-by: dmitryk-dk <kozlovdmitriyy@gmail.com>	2023-04-26 11:53:29 +03:00
Roman Khavronenko	77f76371d0	http server: limit max concurrent requests (#4185 ) * lib/httpserver: introduce `-http.maxConcurrentRequests` command-line flag Introduce `-http.maxConcurrentRequests` command-line flag to protect VM components from resource exhaustion during unexpected spikes of HTTP requests. By default, the new flag's value is set to 0 which means no limits are applied. Signed-off-by: hagen1778 <roman@victoriametrics.com> * lib/httpserver: mention http.maxConcurrentRequests in docs Signed-off-by: hagen1778 <roman@victoriametrics.com> --------- Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-04-24 14:52:06 +02:00
Zakhar Bessarab	472fe3fd03	lib/httpserver: add handler to serve `/robots.txt` and deny search indexing (#4143 ) This handler will instruct search engines that indexing is not allowed for the content exposed to the internet. This should help to address issues like #4128 when instances are exposed to the internet without authentication.	2023-04-18 16:47:26 +04:00
Aliaksandr Valialkin	2a4c48c59d	lib/{mergeset,storage}: make mustReadPartNames() code more clear	2023-04-14 23:16:59 -07:00
Aliaksandr Valialkin	52006149b2	lib/storage: replace OpenStorage() with MustOpenStorage() Callers of OpenStorage() log the returned error and exit. The error logging and exit can be performed inside MustOpenStorage() alongside with printing the stack trace for better debuggability. This simplifies the code at caller side.	2023-04-14 23:02:40 -07:00
Aliaksandr Valialkin	2a2036160d	lib/storage: fix a bug, which prevents from reading pre-v1.90.0 parts The bug has been introduced in `c0b852d50d`	2023-04-14 22:33:08 -07:00
Aliaksandr Valialkin	3727251910	lib/fs: add MustReadDir() function Use fs.MustReadDir() instead of os.ReadDir() across the code in order to reduce the code verbosity. The fs.MustReadDir() logs the error with the directory name and the call stack on error before exit. This information should be enough for debugging the cause of the error.	2023-04-14 22:10:46 -07:00
Aliaksandr Valialkin	60d92894c5	lib/storage: validate rows in partition.AddRows() only during tests	2023-04-14 20:52:36 -07:00
Aliaksandr Valialkin	df619bdff0	all: consistently use fs.MustClose() for closing lock files	2023-04-14 20:14:21 -07:00
Aliaksandr Valialkin	2a3b19e1d2	lib/fs: convert CreateFlockFile to MustCreateFlockFile Callers of CreateFlockFile log the returned err and exit. It is better to log the error inside the MustCreateFlockFile together with the path to the specified directory and the call stack. This simplifies the code at the callers' side while leaving the debuggability at the same level.	2023-04-14 19:50:01 -07:00
Aliaksandr Valialkin	c0b852d50d	lib/{storage,mergeset}: convert InitFromFilePart to MustInitFromFilePart Callers of InitFromFilePart log the error and exit. It is better to log the error with the path to the part and the call stack directly inside the MustInitFromFilePart() function. This simplifies the code at callers' side while leaving the same level of debuggability.	2023-04-14 15:46:12 -07:00
Aliaksandr Valialkin	9183a439c7	lib/filestream: change Create() to MustCreate() Callers of this function log the returned error and exit. It is better logging the error together with the path to the filename and call stack directly inside the function. This simplifies the code at callers' side without reducing the level of debuggability	2023-04-14 15:12:48 -07:00
Aliaksandr Valialkin	5eb163a08a	lib/filestream: transform Open() -> MustOpen() Callers of this function log the returned error and exit. Let's log the error with the path to the filename and call stack inside the function. This simplifies the code at callers' side without reducing the level of debuggability.	2023-04-14 15:03:42 -07:00
Aliaksandr Valialkin	fda1a54343	lib/fs: improve error logging at ReaderAt.MustReadAt() - Add 'BUG:' prefix to error messages related to programming errors aka bugs. - Consistently log the path to the file in all the messages in order to improve debuggability.	2023-04-14 14:51:06 -07:00
Aliaksandr Valialkin	f341b7b3f8	lib/fs: substitute ReadFullData with MustReadData Callers of ReadFullData() log the error and then exit. So let's log the error with the path to the filename and the call stack inside MustReadData(). This simplifies the code at callers' side, while leaving the debuggability at the same level.	2023-04-14 14:39:29 -07:00
Aliaksandr Valialkin	bd6de6406a	lib/fs: improve error logging inside MustWriteData Log the path to file on errors inside MustWriteData(). This improves debuggability of errors, which may occur inside MustWriteData().	2023-04-14 14:32:45 -07:00
Aliaksandr Valialkin	e0595af2bf	lib/{mergeset,storage}: remove isInMerge flag from parts only when they weren't removed yet from the list of active parts This prevents from possible panic during access to pw.p when it is set to nil at partWrapper.decRef() called inside swapSrcWithDstParts()	2023-04-14 00:08:11 -07:00
Aliaksandr Valialkin	9f8209d593	docs/CHANGELOG.md: run at least 4 background mergers on systems with less than 4 CPU cores This reduces the probability of sudden spike in the number of small parts when all the background mergers are busy with big merges.	2023-04-13 23:43:17 -07:00
Aliaksandr Valialkin	550d5c7ea4	lib/{mergeset,storage}: make sure that getFlushToDiskDeadline() takes into account only in-memory parts	2023-04-13 23:43:17 -07:00
Aliaksandr Valialkin	809fbaeaac	lib/fs: add Must prefix to CopyDirectory and CopyFile functions Callers of these functions log the returned error and then exit. Let's log the error with the call stack inside the function itself. This simplifies the code at callers' side, while leaving the same level of debuggability in case of errors.	2023-04-13 23:02:59 -07:00
Aliaksandr Valialkin	780abc3b3b	lib/fs: rename SymlinkRelative to MustSymlinkRelative Callers of this function log the returned error and then exit. Let's log the error with the call stack inside the function itself. This simplifies the code at callers' side, while leaving the same level of debuggability in case of errors.	2023-04-13 22:52:55 -07:00
Aliaksandr Valialkin	5f487ed996	lib/fs: rename HardLinkFiles to MustHardLinkFiles Callers of this function log the returned error and then exit. Let's log the error with the call stack inside the function itself. This simplifies the code at callers' side, while leaving the same level of debuggability in case of errors.	2023-04-13 22:48:07 -07:00
Aliaksandr Valialkin	30425ca81a	lib/fs: rename WriteFileAtomically to MustWriteAtomic Callers of this function log the returned error and exit. So let's just log the error with the given filepath and the call stack inside the function itself and then exit. This simplifies the code at callers' side while leaving the same level of debuggability in case of errors.	2023-04-13 22:41:15 -07:00
Aliaksandr Valialkin	036a7b7365	lib/fs: replace MkdirAllIfNotExist->MustMkdirIfNotExist and MkdirAllFailIfExist->MustMkdirFailIfExist Callers of these functions log the returned error and then exit. The returned error already contains the path to the directory, which failed to be created. So let's just log the error together with the call stack inside these functions. This leaves the debuggability of the returned error at the same level while allowing the code at callers' side to be simplified. While at it, properly use MustMkdirFailIfExist instead of MustMkdirIfNotExist inside inmemoryPart.MustStoreToDisk(). It is expected that inmemoryPart.MustStoreToDisk() must fail if there is already a directory under the given path.	2023-04-13 22:11:59 -07:00
Aliaksandr Valialkin	344209e5e6	lib/fs: rename MustWriteFileAndSync to MustWriteSync in order to improve readability a bit This is a follow-up for `2a8395be05`	2023-04-13 21:43:32 -07:00
Aliaksandr Valialkin	b15c5961ab	lib/{mergeset,storage}: remove unused `path` field from blockStreamWriter This is a follow-up after `42bba64aa7`	2023-04-13 21:39:59 -07:00
Aliaksandr Valialkin	2a8395be05	lib/fs: replace WriteFileAndSync with MustWriteAndSync When WriteFileAndSync fails, then the caller eventually logs the error message and exits. The error message returned by WriteFileAndSync already contains the path to the file, which couldn't be created. This information alongside the call stack is enough for debugging the issue. So just use log.Panicf("FATAL: ...") inside MustWriteAndSync(). This simplifies error handling at caller side a bit.	2023-04-13 21:33:19 -07:00
Aliaksandr Valialkin	25f089de9d	lib/{mergeset,storage}: properly fsync part directory listing after writing in-memory part to disk This is a follow-up after `42bba64aa7` Previously the part directory listing was fsync'ed implicitly inside partHeader.WriteMetadata() by calling fs.WriteFileAtomically(). Now it must be fsync'ed explicitly. There is no need in fsync'ing the parent directory, since it is fsync'ed by the caller when updating parts.json file.	2023-04-13 21:19:04 -07:00
Aliaksandr Valialkin	42bba64aa7	lib/{mergeset,storage}: explicitly fsync the created part directory listing Previously the created part directory listing was fsynced implicitly when storing metadata.json file in it. Also remove superfluous fsync for part directory listing, which was called at blockStreamWriter.MustClose(). After that the metadata.json file is created, so an additional fsync for the directory contents is needed.	2023-04-13 21:03:08 -07:00
Aliaksandr Valialkin	e1211a1187	app/vmstorage: deprecate -bigMergeConcurrency command-line flag Improperly configured -bigMergeConcurrency command-line flag usually leads to uncontrolled growth of unmerged parts, which, in turn, increases CPU usage and query durations. So it is better deprecating this flag. In rare cases -smallMergeConcurrency command-line flag can be used instead for controlling the concurrency of background merges.	2023-04-13 20:40:24 -07:00
Aliaksandr Valialkin	ca54e58c1f	lib/{fs,persistentqueue}: use filepath.Join() instead of concatenating path parts with `/` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4014	2023-04-13 20:13:45 -07:00
Aliaksandr Valialkin	90b876cd1e	app/vmbackupmanager: sync with enterprise-single-node branch after 41a54c775891c87e3d5ed59ff0769c869dd2fe71	2023-04-13 19:29:06 -07:00
Zakhar Bessarab	81f28f0f1f	lib/backup/actions: store metadata(creation and completion time) in backup files (#4117 ) This makes it easier to understand exact point in time which is included in this backup. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-04-12 18:51:27 +02:00
Haleygo	0ad6010c91	fix sort pendingDateMetricsIDs (#4102 )	2023-04-10 10:23:12 -07:00
Dmytro Kozlov	244c18fa38	app/vmctl: add multiple filters defined in `--vm-native-filter-match` flag to discovered metric names (#4063 ) * app/vmctl: add multiple filters defined in `--vm-native-filter-match` flag to discovered metric names * app/vmctl: fix comments * app/vmctl: move function buildMatchWithFilter to the correct place * app/vmctl: update CHANGELOG.md * app/vmctl: fix CI, remove error wrapping * app/vmctl: fix CI, simplify `Set()`	2023-04-06 15:06:52 -07:00
Aliaksandr Valialkin	593c151831	lib/encoding: fix test after `4725549cb2`	2023-04-05 21:38:37 -07:00
Aliaksandr Valialkin	19b189e9b7	lib/storage: use shorter code after `03bde173b7`	2023-04-02 21:35:52 -07:00
faceair	38fc55976e	lib/storage: fix reuse pendingMetricRow (#4049 )	2023-04-02 21:35:50 -07:00
faceair	f3af8331ec	lib/storage: remove unused code (#4050 )	2023-04-02 21:24:42 -07:00
Aliaksandr Valialkin	f638496298	lib/promscrape: do not re-use previously loaded scrape targets on failed attempt to load updated scrape targets at file_sd_configs The logic employed for re-using the previously loaded scrape target was broken initially. The commit `cc0427897c` tried to fix it, but the new logic became too complex and fragile. So it is better to just remove this logic, since the targets from temporarily broken file should be eventually loaded on next attempts every -promscrape.fileSDCheckInterval This also allows removing fragile hacks around __vm_filepath label. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3989	2023-04-02 21:05:28 -07:00
Dmytro Kozlov	cc0427897c	lib/promscrape: fix the problem with scrape work duplicates when file_sd_config can't be read (#4027 ) * lib/promscrape: fix the problem with scrape work duplicates when file_sd_config can't be read * lib/promscrape: clarified comment * lib/promscrape: made better approach to handle a problem with growing []ScrapeWork on each error when loading config lib/promscrape: added CHANGELOG.md * Update docs/CHANGELOG.md --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-04-02 20:26:13 -07:00
Roman Khavronenko	27b958ba8b	lib/storage: check for free disk space before opening tables (#4035 ) * lib/storage: check for free disk space before opening tables We check for free disk space before call to `openTable`, so `Storage` can be set to ReadOnly before mergeWorkers start. Before the change, there was a chance that merges will start even if Storage has to start in ReadOnly mode because of `-storage.minFreeDiskSpaceBytes` limit. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4023 Signed-off-by: hagen1778 <roman@victoriametrics.com> * lib/storage: chore Signed-off-by: hagen1778 <roman@victoriametrics.com> * Update lib/storage/storage.go --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-03-31 23:50:27 -07:00
Aliaksandr Valialkin	4d00107b92	lib/fs: follow-up for `ec45f1bc5f` Properly close response body before checking for the response code. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4034	2023-03-31 22:42:10 -07:00
Aliaksandr Valialkin	d577657fb7	lib/streamaggr: follow-up for `ff72ca14b9` - Make sure that the last successfully loaded config is used on hot-reload failure - Properly cleanup resources occupied by already initialized aggregators when the current aggregator fails to be initialized - Expose distinct vmagent_streamaggr_config_reload* metrics per each -remoteWrite.streamAggr.config This should simplify monitoring and debugging failed reloads - Remove race condition at app/vminsert/common.MustStopStreamAggr when calling sa.MustStop() while sa could be in use at realoadSaConfig() - Remove lib/streamaggr.aggregator.hasState global variable, since it may negatively impact scalability on system with big number of CPU cores at hasState.Store(true) call inside aggregator.Push(). - Remove fine-grained aggregator reload - reload all the aggregators on config change instead. This simplifies the code a bit. The fine-grained aggregator reload may be returned back if there will be demand from real users for it. - Check -relabelConfig and -streamAggr.config files when single-node VictoriaMetrics runs with -dryRun flag - Return back accidentally removed changelog for v1.87.4 at docs/CHANGELOG.md Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3639	2023-03-31 22:30:38 -07:00
Zakhar Bessarab	ec45f1bc5f	lib/fs: verify response code when reading configuration over HTTP (#4036 ) Verifying status code helps to avoid misleading errors caused by attempt to parse unsuccessful response. Related issue: #4034 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-03-30 13:18:00 +02:00
Alexander Marshalov	ff72ca14b9	added hot reload support for stream aggregation configs (#3969 ) (#3970 ) added hot reload support for stream aggregation configs (#3969) Signed-off-by: Alexander Marshalov <_@marshalov.org>	2023-03-29 18:05:58 +02:00
Aliaksandr Valialkin	94cabf29b0	lib/flagutil: ArrayString: support commas inside quoted strings and inside `[]`, `{}` and `()` braces Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3915	2023-03-28 21:22:55 -07:00
Aliaksandr Valialkin	7048a316aa	lib/persistentqueue: typo fix after `aea6df8197`	2023-03-27 20:06:04 -07:00
Aliaksandr Valialkin	aea6df8197	app/vmagent/remotewrite: cosmetic updates after `f3a51e8b1d` - Compare directory names instead of paths to directory when determining which persistent queues must be deleted This is less error-prone solution, since paths to the same directory can differ, which could lead to accidental directory removal for the existing -remoteWrite.url - Log the `removed %d dangling queues` message when at least a single queue has been removed - Consistently use filepath.Join() for creating paths to persistent queues. This is needed for Windows support (see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70 ) - Clarify the description of the change at docs/CHANGELOG.md Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4014	2023-03-27 18:33:07 -07:00
Zakhar Bessarab	f3a51e8b1d	app/vmagent: add `-remoteWrite.removeDanglingQueues` flag (#4017 ) * app/vmagent: add `-remoteWrite.removeDanglingQueues` flag which allows to automatically remove dangling persistent queue contents Related issue: #4014 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * app/vmagent: address review feedback - remove persistent queues files by default - rename `remoteWrite.removeDanglingQueues` to `remoteWrite.keepDanglingQueues` - update docs to reflect changed behaviour Related issue: #4014 * Apply suggestions from code review --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-03-27 18:15:28 -07:00
Aliaksandr Valialkin	5832242b44	app/vmselect/netstorage: reduce the contention at fs.ReaderAt stats collection on systems with big number of CPU cores This optimization is based on the profile provided at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966#issuecomment-1483208419	2023-03-25 16:37:07 -07:00
Aliaksandr Valialkin	c8f2febaa1	lib/storage: consistently use OS-independent separator in file paths This is needed for Windows support, which uses `\` instead of `/` as file separator Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-25 14:33:58 -07:00
Aliaksandr Valialkin	36bbdd7d4b	lib/mergeset: consistently use OS-independent separator in file paths This is needed for Windows support, which uses `\` instead of `/` as file separator Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-25 13:39:41 -07:00
Aliaksandr Valialkin	b14d96618c	all: follow-up after `34634ec357` - Use windows.FlushFileBuffers() instead of windows.Fsync() at streamTracker.adviseDontNeed() for consistency with implementations for other architectures. - Use filepath.Base() instead of filepath.Split(), since the dir part isn't used. This simplifies the code a bit. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-25 11:57:39 -07:00
Nikolay	34634ec357	lib/fs: adds memory map for windows (#3988 ) This is a follow-up for `43b24164ef` * lib/fs: adds memory map for windows it should improve performance for file reading * lib/storage: replace '/' with os specific separator it must fix errors for windows * lib/fs: mention windows fsync support * lib/filestream: adds fdatasync for windows writes Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-25 11:43:19 -07:00
Alexander Marshalov	7c86dcc4fa	allowed using dashes and dots in environment variables names (#4009 ) * allowed using dashes and dots in environment variables names for templating config files with envtemplate (#3999) Signed-off-by: Alexander Marshalov <_@marshalov.org> * Apply suggestions from code review --------- Signed-off-by: Alexander Marshalov <_@marshalov.org> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-03-24 15:43:05 -07:00
Nikolay	a2f716b6cc	lib/netutil: log only parsing errors for proxy-protocol (#3985 ) * lib/netutil: log only parsing errors for proxy-protocol Previously every error was logged. With configured TCP health checks at load-balancer or kubernetes, vmauth spams a lot of false positive error messages into logs * Update docs/CHANGELOG.md Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> * Update lib/netutil/tcplistener.go Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2023-03-21 10:22:39 -07:00
Dmytro Kozlov	e79cd24807	lib/promrelabel: make target url from labels on target relabel page (#3882 ) * lib/promrelabel: make target url from labels on target relabel page * wip --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-03-20 22:07:52 -07:00
Dmytro Kozlov	5c92022cc6	lib/storage: fix collect downsampling metrics (#489 ) * lib/storage: fix downsampling * lib/storage: update logic * lib/storage: fix comments, removed unneeded check	2023-03-19 23:34:46 -07:00
Aliaksandr Valialkin	43b24164ef	all: add Windows build for VictoriaMetrics This commit changes background merge algorithm, so it becomes compatible with Windows file semantics. The previous algorithm for background merge: 1. Merge source parts into a destination part inside tmp directory. 2. Create a file in txn directory with instructions on how to atomically swap source parts with the destination part. 3. Perform instructions from the file. 4. Delete the file with instructions. This algorithm guarantees that either source parts or destination part is visible in the partition after unclean shutdown at any step above, since the remaining files with instructions is replayed on the next restart, after that the remaining contents of the tmp directory is deleted. Unfortunately this algorithm doesn't work under Windows because it disallows removing and moving files, which are in use. So the new algorithm for background merge has been implemented: 1. Merge source parts into a destination part inside the partition directory itself. E.g. now the partition directory may contain both complete and incomplete parts. 2. Atomically update the parts.json file with the new list of parts after the merge, e.g. remove the source parts from the list and add the destination part to the list before storing it to parts.json file. 3. Remove the source parts from disk when they are no longer used. This algorithm guarantees that either source parts or destination part is visible in the partition after unclean shutdown at any step above, since incomplete partitions from step 1 or old source parts from step 3 are removed on the next startup by inspecting parts.json file. This algorithm should work under Windows, since it doesn't remove or move files in use. This algorithm has also the following benefits: - It should work better for NFS. - It fits object storage semantics. 
The new algorithm changes data storage format, so it is impossible to downgrade to the previous versions of VictoriaMetrics after upgrading to this algorithm. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3236 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3821 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-19 01:36:51 -07:00
Aliaksandr Valialkin	6460475e3b	lib/{mergeset,storage}: prevent from long wait time when creating a snapshot under high data ingestion rate Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3551 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3873	2023-03-19 00:15:30 -07:00
Aliaksandr Valialkin	a26c6628fd	lib/{fs,mergeset,storage}: substitute os.Open()+os.File.Readdir() with os.ReadDir() This simplifies code a bit	2023-03-17 21:03:37 -07:00
Zakhar Bessarab	6a5d236245	lib/storage: log original labels set when label value is truncated (#3952 ) lib/storage: log original labels set when label value is truncated	2023-03-14 10:59:40 +01:00
Nikolay	927d9da270	lib/storage: correctly handle io.EOF error for pre-fetched metrics (#3946 ) io.EOF shouldn't be returned from this function. It breaks all search API logic and may result in empty query results.	2023-03-11 23:29:43 -08:00
Nikolay	7a3e16e774	lib/netutil: fixes panic at proxy protocol (#3905 ) it may occur if non proxy protocol message received by tcp server. Listener Accept method must return only non-recoverable errors. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3335	2023-03-07 08:50:18 -08:00

2497 commits