github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-11-21 14:44:00 +00:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	9004bc098e	all: use clear() built-in Go function for clearing []prompbmarshal.TimeSeries and []prompbmarshal.Label slices This makes the code a bit clearer.	2024-04-20 21:00:24 +02:00
Aliaksandr Valialkin	513e69c55e	all: replace old https://docs.victoriametrics.com/sd_configs.html url with the new one - https://docs.victoriametrics.com/sd_configs/	2024-04-18 02:28:26 +02:00
Aliaksandr Valialkin	0211a04a52	all: replace the outdated url https://docs.victoriametrics.com/vmagent.html with the new one - https://docs.victoriametrics.com/vmagent/	2024-04-18 01:32:57 +02:00
Aliaksandr Valialkin	284d99e269	app/vmagent: support for DNS SRV urls at -remoteWrite.url, scrape target urls and service discovery urls Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6053	2024-04-17 20:56:23 +02:00
Aliaksandr Valialkin	a9ec77fee5	lib/promscrape/discovery/consul: typo fix in the comment: enteprise -> enterprise	2024-04-17 12:08:01 +02:00
wanshuangcheng	52a4ae0b28	chore: fix function names in comment (#6076) Signed-off-by: wanshuangcheng <wanshuangcheng@outlook.com>	2024-04-08 15:38:51 +02:00
Aliaksandr Valialkin	ecd782c75e	app/vmagent: follow-up for `b3b29ba6ac` - Automatically reload changed TLS root CA pointed by -remoteWrite.tlsCAFile command-line flag - Automatically reload changed TLS root CA configured via oauth2.tls_config.ca_file option at -promscrape.config - Document the change as a feature instead of a bug at docs/CHANGELOG.md - Simplify the code at lib/promauth, which is responsible for reloading changed TLS root CA files. - Simplify the usage of lib/promauth.Config.NewRoundTripper() - now it accepts the base http.Transport instead of a callback, which can change the internal http.Transport. - Reuse the default tls config if lib/promauth.Config doesn't contain tls-specific configs. This should reduce memory usage a bit when tls isn't used for scraping big number of targets. - Do not re-read TLS root CA files on every processed request. Re-read them once per second. This should reduce CPU usage when scraping big number of targets over https. - Do not store cert.pem and key.pem files in TestTLSConfigWithCertificatesFilesUpdate, since they can be loaded from byte slices via crypto/tls.X509KeyPair(). - Remove obsolete comparisons of string representations for authConfig and proxyAuthConfig at areEqualScrapeConfigs(). Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5725 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5526 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2171	2024-04-04 01:26:38 +03:00
Zakhar Bessarab	80315e07b1	lib/promscrape/config: fix missing timeout for http client (#6063) Follow-up for `b3b29ba6` Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2024-04-04 00:40:48 +03:00
Zakhar Bessarab	da4352fe7b	lib/{promauth,promscrape}: automatically refresh root CA certificates after changes on disk (#5725 ) * lib/{promauth,promscrape}: automatically refresh root CA certificates after changes on disk Added a custom `http.RoundTripper` implementation which checks for root CA content changes and updates `tls.Config` used by `http.RoundTripper` after detecting CA change. Client certificate changes are not tracked by this implementation since `tls.Config` already supports passing certificate dynamically by overriding `tls.Config.GetClientCertificate`. This change implements dynamic reload of root CA only for streaming client used for scraping. Blocking client (`fasthttp.HostClient`) does not support using custom transport so can't use this implementation. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5526 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/promauth/config: update NewRoundTripper API Update API to allow user to update only parameters required for transport. Add warning log when reloading Root CA failed. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/promauth/config: fix mutex acquire logic Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/promauth/config: replace RWMutex with regular mutex to simplify the code - remove additional mutex used for getRootCABytes - require callee to use mutex - replace RWMutex with regular mutex Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/promauth/config: refactor - hold the mutex lock to avoid round tripper being re-created twice - move recreation logic into separate func to simplify the code Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2024-04-04 00:34:43 +03:00
Aliaksandr Valialkin	00f59d6ddf	all: fix golangci-lint(revive) warnings after `0c0ed61ce7` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6001	2024-04-03 03:00:45 +03:00
Aliaksandr Valialkin	c38c45d71f	app/{vminsert,vmagent}: allow using -streamAggr.dedupInterval without -streamAggr.config This allows performing online de-duplication of incoming samples	2024-03-05 00:47:23 +02:00
Aliaksandr Valialkin	d845edc24b	lib: consistently use atomic.* types instead of atomic.* functions See `ea9e2b19a5`	2024-02-24 02:10:04 +02:00
Aliaksandr Valialkin	82e38e1627	lib/promscrape: add support for `enable_compression` option in the same way as Prometheus does Updates https://github.com/prometheus/prometheus/pull/13166 Updates https://github.com/prometheus/prometheus/issues/12319 Do not document enable_compression option at docs/sd_configs.md, since vmagent already supports more clear disable_compression option - see https://docs.victoriametrics.com/vmagent/#scrape_config-enhancements	2024-02-18 19:42:09 +02:00
Aliaksandr Valialkin	f0db7d474f	lib/promscrape/discovery/kuma: add support for `client_id` option See https://github.com/prometheus/prometheus/pull/13278	2024-02-18 19:19:55 +02:00
Aliaksandr Valialkin	61608b6303	lib/promscrape: avoid copying labels when -promscrape.dropOriginalLabels command-line flag is set This should save some CPU This regression has been introduced in `487f6380d0` when working on https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5389	2024-02-14 03:26:32 +02:00
helen	99ea84f0fd	clean unused code (#5735) Signed-off-by: helen <haitao.zhang@daocloud.io>	2024-01-31 19:51:35 +02:00
Aliaksandr Valialkin	09c388a8e4	lib/promscrape: use the standard net/http.Client instead of fasthttp.Client for scraping targets in non-streaming mode While fasthttp.Client uses less CPU and RAM when scraping targets with small responses (up to 10K metrics), it doesn't work well when scraping targets with big responses such as kube-state-metrics. In this case it could use big amounts of additional memory comparing to net/http.Client, since fasthttp.Client reads the full response in memory and then tries re-using the large buffer for further scrapes. Additionally, fasthttp.Client-based scraping had various issues with proxying, redirects and scrape timeouts like the following ones: - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1945 - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5425 - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2794 - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1017 This should help reducing memory usage for the case when target returns big response and this response is scraped by fasthttp.Client at first before switching to stream parsing mode for subsequent scrapes. Now the switch to stream parsing mode is performed on the first scrape after reading the response body in memory and noticing that its size exceeds the value passed to -promscrape.minResponseSizeForStreamParse command-line flag. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5567 Overrides https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4931	2024-01-30 18:39:55 +02:00
Aliaksandr Valialkin	645365b2d1	lib/promscrape: fix BenchmarkScrapeWorkScrapeInternal, which has been broken by the commit `65bc460323`	2024-01-30 16:07:40 +02:00
Aliaksandr Valialkin	b8c7f0d3bc	lib/promscrape/discovery/kubernetes: typo fix in the comment for ContainerStateTerminated struct This is a follow-up for `ef12598ad4`	2024-01-24 15:10:47 +02:00
Aliaksandr Valialkin	1e364c992d	lib/promscrape/discovery/kubernetes: do not generate targets for already terminated pods and containers Already terminated pods and containers cannot be scraped and will never resurrect, so there is zero sense in creating scrape targets for them.	2024-01-24 14:58:51 +02:00
Roman Khavronenko	8461add541	lib/promscrape: respect `0` value for `series_limit` param (#5663 ) * lib/promscrape: respect `0` value for `series_limit` param Respect `0` value for `series_limit` param in `scrape_config` even if global limit was set via `-promscrape.seriesLimitPerTarget`. Previously, `0` value will be ignored in favor of `-promscrape.seriesLimitPerTarget`. This behavior aligns with possibility to override `series_limit` value via relabeling with `__series_limit__` label. Signed-off-by: hagen1778 <roman@victoriametrics.com> * Update docs/CHANGELOG.md --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-23 13:09:36 +02:00
Aliaksandr Valialkin	d52fd73f18	all: add up to 10% random jitter to the interval between periodic tasks performed by various components This should smooth CPU and RAM usage spikes related to these periodic tasks, by reducing the probability that multiple concurrent periodic tasks are performed at the same time.	2024-01-22 18:39:16 +02:00
Aliaksandr Valialkin	c6f6f094c5	Revert "lib/promscrape: do not store last scrape response when stale markers … (#5577)" This reverts commit `cfec258803`. Reason for revert: the original code already doesn't store the last scrape response when stale markers are disabled. The scrapeWork.areIdenticalSeries() function always returns true if stale markers are disabled. This prevents storing the last response at scrapeWork.processScrapedData(). It looks like the reverted commit could also bring back the issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3660 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5577	2024-01-22 01:46:12 +02:00
Hui Wang	49fa92c1d0	lib/promscrape/discovery/kubernetes: fix watcher start order for roles endpoints and endpointslice (#5557 ) * lib/promscrape/discovery/kubernetes: fix watcher start order for roles endpoints and endpointslice Previously the groupWatcher could be mistakenly stopped when requests for pod or services resources take too long. * remove mislead comment * docs/sd_configs.md: mention -promscrape.kubernetes.attachNodeMetadataAll flag in the description for attach_metadata section Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4640 * wip * lib/promscrape/kubernetes: prevent from stopping groupWatcher when there are in-flight apiWatcher.mustStart() calls groupWatcher is stopped if it has zero registered apiWatchers during 14 seconds. But such a groupWatcher can be still in use if apiWatcher for `role: endpoints` or `role: endpointslice` is being registered and the discovery of the associated `pod` and/or `service` objects takes longer than 14 seconds - see the beginning of groupWatcher.startWatchersForRole() function for details. Track the number of in-flight calls to apiWatcher.mustStart() and prevent from stopping the associated groupWatcher if the number of in-flight calls is non-zero. P.S. postponing the discovery of `pod` and/or `service` objects associated with `endpoints` or `endpointslice` roles isn't the best solution, since it slows down initial discovery of `endpoints` and `endpointslice` targets. * typo fix --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-22 01:33:17 +02:00
Aliaksandr Valialkin	885ee160c2	all: allow dynamically reading *AuthKey flag values from files and urls Examples: 1) -metricsAuthKey=file:///abs/path/to/file - reads flag value from the given absolute filepath 2) -metricsAuthKey=file://./relative/path/to/file - reads flag value from the given relative filepath 3) -metricsAuthKey=http://some-host/some/path?query_arg=abc - reads flag value from the given url The flag value is automatically updated when the file contents changes.	2024-01-22 01:23:23 +02:00
Aliaksandr Valialkin	be5faef552	lib/promscrape: code cleanup: send stale markers immediately after generating automatic metrics This cleanup has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5557/files#diff-6b205cf6637d7b65a5c45d9417d08822d4efad94227268cb196f61aa2a0fc0f7	2024-01-22 01:12:56 +02:00
Aliaksandr Valialkin	e15f07d989	all: consistently clear prompbmarshal.Label by assigning an empty struct instead of zeroing Name and Value individually	2024-01-22 01:11:59 +02:00
Aliaksandr Valialkin	2c7c812a9d	lib/promscrape/discovery/kubernetes: add -promscrape.kubernetes.attachNodeMetadataAll command-line flag This flag allows setting attach_metadata.node=true for all the kubernetes_sd_configs defined at -promscrape.config Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4640 Thanks to wasim-nihal for the initial implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5593	2024-01-22 01:08:52 +02:00
Aliaksandr Valialkin	c05982bfa7	lib/promscrape/discovery/hetzner: follow-up after `03a97dc678` - docs/sd_configs.md: moved hetzner_sd_configs docs to the correct place according to alphabetical order of SD names, document missing __meta_hetzner_role label. - lib/promscrape/config.go: added missing MustStop() call for Hetzner SD, and moved the code to the correct place according to alphabetical order of SD names. - lib/promscrape/discovery/hetzner: properly handle pagination for hcloud API responses, populate missing __meta_hetzner_role label like Prometheus does. - Properly populate __meta_hetzner_public_ipv6_network label like Prometheus does. - Remove unused SDConfig.Token. - Remove "omitempty" annotation from SDConfig.Role field, since this field is mandatory. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5550 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3154	2024-01-22 00:53:23 +02:00
Hui Wang	66eb013b54	lib/promscrape: do not store last scrape response when stale markers … (#5577) * lib/promscrape: do not store last scrape response when stale markers are disabled * update changelog	2024-01-22 00:52:25 +02:00
Aliaksandr Valialkin	09f23b0296	lib/promscrape: cosmetic changes after `3ac44baebe` - Rename mustLoadScrapeConfigFiles() to loadScrapeConfigFiles(), since now it may return error. - Split too long line with the error message into two lines in order to improve readability a bit. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5508 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5560	2024-01-17 01:07:16 +02:00
Hui Wang	2f40ed3aac	exit vmagent if there is config syntax error in `scrape_config_files` when `-promscrape.config.strictParse=true` (#5560)	2024-01-16 22:35:18 +02:00
Aliaksandr Valialkin	0196902b2e	lib/promscrape/discovery/hetzner: fix golangci-lint warnings after `03a97dc678` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5550	2024-01-16 21:51:48 +02:00
Aleksandr Stepanov	3a6e3adc7d	vmagent: added hetzner sd config (#5550) * added hetzner robot and hetzner cloud sd configs * remove gettoken fun and update docs * Updated CHANGELOG and vmagent docs * Updated CHANGELOG and vmagent docs --------- Co-authored-by: Nikolay <nik@victoriametrics.com>	2024-01-16 21:13:20 +02:00
Aliaksandr Valialkin	d566aa7d78	lib/prompbmarshal: switch to github.com/VictoriaMetrics/easyproto	2024-01-16 20:48:30 +02:00
Aliaksandr Valialkin	261c173f4b	all: use Gauge instead of Counter for `*_config_last_reload_successful` metrics This allows exposing the correct TYPE metadata for these labels when the app runs with -metrics.exposeMetadata command-line flag. See https://github.com/VictoriaMetrics/metrics/pull/61#issuecomment-1860085508 for more details. This is follow-up for `326a77c697`	2023-12-20 14:25:44 +02:00
hagen1778	14117f2f90	lib/promscrape: cosmetic changes after `e373bb84d5` * fix typos in docs * add `shard-` prefix to generated links when `-promscrape.cluster.memberURLTemplate` is enabled Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `e0fc5ef140`)	2023-12-12 13:45:34 +01:00
Aliaksandr Valialkin	475ae2a1be	lib/promscrape: add a warning when the /service-discovery page contains incomplete list of dropped targets	2023-12-08 19:04:29 +02:00
Aliaksandr Valialkin	9074ab68d4	lib/promscrape: add `-promscrape.cluster.memberURLTemplate` command-line flag for creating direct links to vmagent instances at /service-discovery page See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4018#issuecomment-1843811569	2023-12-07 16:05:03 +02:00
Aliaksandr Valialkin	896a0f32cd	lib/promscrape: show -promscrape.cluster.memberNum values for vmagent instances, which scrape the given dropped target at /service-discovery page The /service-discovery page contains the list of all the discovered targets after the commit `487f6380d0` on all the vmagent instances in cluster mode ( https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets ). This commit improves debuggability of targets in cluster mode by providing a list of -promscrape.cluster.memberNum values per each target at /service-discovery page, which has been dropped because of sharding, e.g. if this target is scraped by other vmagent instances in the cluster. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5389 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4018	2023-12-07 00:11:30 +02:00
Aliaksandr Valialkin	e8dfecb3f1	lib/promscrape: show `never scraped` message for never scraped targets at /targets page	2023-12-06 22:33:27 +02:00
Aliaksandr Valialkin	8b6bce61e4	lib/promscrape: follow-up for `97373b7786` Substitute O(N^2) algorithm for exposing the `vm_promscrape_scrape_pool_targets` metric with O(N) algorithm, where N is the number of scrape jobs. The previous algorithm could slow down /metrics exposition significantly when -promscrape.config contains thousands of scrape jobs. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5311 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5335	2023-12-06 17:36:48 +02:00
Hui Wang	065f5a7f9e	vmagent: add `vm_promscrape_scrape_pool_targets` for scrape jobs like… (#5335) * vmagent: export `vm_promscrape_scrape_pool_targets` metric to track the number of targets that each scrape_job discovers * add extra panel for new metric	2023-12-06 14:46:02 +02:00
Aliaksandr Valialkin	b6d6a3a530	lib/promscrape: show dropped targets because of sharding at /service-discovery page Previously the /service-discovery page didn't show targets dropped because of sharding ( https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets ). Show also the reason why every target is dropped at /service-discovery page. This should improve debugging why particular targets are dropped. While at it, do not remove dropped targets from the list at /service-discovery page until the total number of targets exceeds the limit passed to -promscrape.maxDroppedTargets . Previously the list was cleaned up every 10 minutes from the entries, which weren't updated for the last minute. This could complicate debugging of dropped targets. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5389	2023-12-04 17:42:46 +02:00
Aliaksandr Valialkin	2f14394335	app/vmagent: follow-up for `090cb2c9de` - Add Try* prefix to functions, which return bool result in order to improve readability and reduce the probability of missing check for the result returned from these functions. - Call the adjustSampleValues() only once on input samples. Previously it was called on every attempt to flush data to peristent queue. - Properly restore the initial state of WriteRequest passed to tryPushWriteRequest() before returning from this function after unsuccessful push to persistent queue. Previously a part of WriteRequest samples may be lost in such case. - Add -remoteWrite.dropSamplesOnOverload command-line flag, which can be used for dropping incoming samples instead of returning 429 Too Many Requests error to the client when -remoteWrite.disableOnDiskQueue is set and the remote storage cannot keep up with the data ingestion rate. - Add vmagent_remotewrite_samples_dropped_total metric, which counts the number of dropped samples. - Add vmagent_remotewrite_push_failures_total metric, which counts the number of unsuccessful attempts to push data to persistent queue when -remoteWrite.disableOnDiskQueue is set. - Remove vmagent_remotewrite_aggregation_metrics_dropped_total and vm_promscrape_push_samples_dropped_total metrics, because they are replaced with vmagent_remotewrite_samples_dropped_total metric. - Update 'Disabling on-disk persistence' docs at docs/vmagent.md - Update stale comments in the code Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5088 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110	2023-11-25 12:13:39 +02:00
Nikolay	25ac2aac31	app/vmagent: allow to disabled on-disk persistence (#5088 ) * app/vmagent: allow to disabled on-disk queue Previously, it wasn't possible to build data processing pipeline with a chain of vmagents. In case when remoteWrite for the last vmagent in the chain wasn't accessible, it persisted data only when it has enough disk capacity. If disk queue is full, it started to silently drop ingested metrics. New flags allows to disable on-disk persistent and immediatly return an error if remoteWrite is not accessible anymore. It blocks any writes and notify client, that data ingestion isn't possible. Main use case for this feature - use external queue such as kafka for data persistence. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110 * adds test, updates readme * apply review suggestions * update docs for vmagent * makes linter happy --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-11-25 12:12:29 +02:00
Aliaksandr Valialkin	15dda54e79	lib/promscrape/discovery/kubernetes: propagate possible errors at newAPIWatcher() to the caller This allows substituting FATAL panics with recoverable runtime errors such as missing or invalid TLS CA file and/or missing/invalid /var/run/secrets/kubernetes.io/serviceaccount/namespace file. Now these errors are logged instead of PANIC'ing, so they can be fixed by updating the corresponding files without the need to restart vmagent. This is a follow-up for `90427abc65` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5243	2023-10-27 20:27:58 +02:00
Hui Wang	a37125d043	lib/promscrape/discovery/kubernetes: avoid possible panic if given caFile under kubernetes.SDConfig.HTTPClientConfig does not exist (#5243) follow up `d5a599badc`	2023-10-27 20:27:58 +02:00
Aliaksandr Valialkin	20aeb8b65d	lib/promscrape/discovery/kubernetes: stop all the url watchers, which belong to a particular groupWatcher, at once Previously url watchers for pod, service and node objects could be mistakenly closed when service discovery was set up only for endpoints and endpointslice roles, since watchers for these roles may start pod, service and node url watchers with nil apiWatcher passed to groupWatcher.startWatchersForRole(). Now all the url watchers, which belong to a particular groupWatcher, are stopped at once when this groupWatcher has no apiWatcher subscribers. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5216 The issue has been introduced in v1.93.5 when addressing https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4850	2023-10-27 14:34:25 +02:00
Hui Wang	69f4a58f76	do not print redundant error logs when failed to scrape consul or no… (#5239) * do not print redundant error logs when failed to scrape consul or nomad target prometheus performs the same because it uses consul lib which just drops the error (`1806bcb38c/api/api.go (L1134)`)	2023-10-27 14:18:47 +02:00


678 commits