github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-11-21 14:44:00 +00:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	7531e9084a	all: use clear() built-in Go function for clearing []prompbmarshal.TimeSeries and []prompbmarshal.Label slices This makes the code a bit clearer.	2024-04-20 21:00:03 +02:00
Aliaksandr Valialkin	828e78ceb4	all: replace old https://docs.victoriametrics.com/sd_configs.html url with the new one - https://docs.victoriametrics.com/sd_configs/	2024-04-18 02:27:47 +02:00
Aliaksandr Valialkin	c81a633b02	all: replace the outdated url https://docs.victoriametrics.com/vmagent.html with the new one - https://docs.victoriametrics.com/vmagent/	2024-04-18 01:31:37 +02:00
Aliaksandr Valialkin	dc326f70b4	app/vmagent: support for DNS SRV urls at -remoteWrite.url, scrape target urls and service discovery urls Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6053	2024-04-17 20:54:39 +02:00
Aliaksandr Valialkin	e3a26c0db6	lib/promscrape/discovery/consul: typo fix in the comment: enteprise -> enterprise	2024-04-16 19:34:18 +02:00
wanshuangcheng	83216e956c	chore: fix function names in comment (#6076) Signed-off-by: wanshuangcheng <wanshuangcheng@outlook.com>	2024-04-08 01:11:12 -07:00
Aliaksandr Valialkin	967d5496cf	app/vmagent: follow-up for `b3b29ba6ac` - Automatically reload changed TLS root CA pointed by -remoteWrite.tlsCAFile command-line flag - Automatically reload changed TLS root CA configured via oauth2.tsl_config.ca_file option at -promscrape.config - Document the change as a feature instead of a bug at docs/CHANGELOG.md - Simplify the code at lib/promauth, which is responsible for reloading changed TLS root CA files. - Simplify the usage of lib/promauth.Config.NewRoundTripper() - now it accepts the base http.Transport instead of a callback, which can change the internal http.Transport. - Reuse the default tls config if lib/promauth.Config doesn't contain tls-specific configs. This should reduce memory usage a bit when tls isn't used for scraping big number of targets. - Do not re-read TLS root CA files on every processed request. Re-read them once per second. This should reduce CPU usage when scraping big number of targets over https. - Do not store cert.pem and key.pem files in TestTLSConfigWithCertificatesFilesUpdate, since they can be loaded from byte slices via crypto/tls.X509KeyPair(). - Remove obsolete comparisons of string representations for authConfig and proxyAuthConfig at areEqualScrapeConfigs(). Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5725 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5526 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2171	2024-04-04 01:27:35 +03:00
Zakhar Bessarab	f80ac120f3	lib/promscrape/config: fix missing timeout for http client (#6063) Follow-up for `b3b29ba6` Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2024-04-03 18:18:48 +02:00
Zakhar Bessarab	b3b29ba6ac	lib/{promauth,promscrape}: automatically refresh root CA certificates after changes on disk (#5725 ) * lib/{promauth,promscrape}: automatically refresh root CA certificates after changes on disk Added a custom `http.RoundTripper` implementation which checks for root CA content changes and updates `tls.Config` used by `http.RoundTripper` after detecting CA change. Client certificate changes are not tracked by this implementation since `tls.Config` already supports passing certificate dynamically by overriding `tls.Config.GetClientCertificate`. This change implements dynamic reload of root CA only for streaming client used for scraping. Blocking client (`fasthttp.HostClient`) does not support using custom transport so can't use this implementation. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5526 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/promauth/config: update NewRoundTripper API Update API to allow user to update only parameters required for transport. Add warning log when reloading Root CA failed. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/promauth/config: fix mutex acquire logic Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/promauth/config: replace RWMutex with regular mutex to simplify the code - remove additional mutex used for getRootCABytes - require callee to use mutex - replace RWMutex with regular mutex Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/promauth/config: refactor - hold the mutex lock to avoid round tripper being re-created twice - move recreation logic into separate func to simplify the code Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2024-04-03 10:01:43 +02:00
Aliaksandr Valialkin	918cccaddf	all: fix golangci-lint(revive) warnings after `0c0ed61ce7` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6001	2024-04-02 23:16:29 +03:00
Aliaksandr Valialkin	ed523b5bbc	app/{vminsert,vmagent}: allow using -streamAggr.dedupInterval without -streamAggr.config This allows performing online de-duplication of incoming samples	2024-03-05 00:45:30 +02:00
Aliaksandr Valialkin	7e1dd8ab9d	lib: consistently use atomic.* types instead of atomic.* functions See `ea9e2b19a5`	2024-02-24 02:07:53 +02:00
Aliaksandr Valialkin	c42ddce159	lib/promscrape: add support for `enable_compression` option in the same way as Prometheus does Updates https://github.com/prometheus/prometheus/pull/13166 Updates https://github.com/prometheus/prometheus/issues/12319 Do not document enable_compression option at docs/sd_configs.md, since vmagent already supports more clear disable_compression option - see https://docs.victoriametrics.com/vmagent/#scrape_config-enhancements	2024-02-18 19:40:39 +02:00
Aliaksandr Valialkin	5a092e161c	lib/promscrape/discovery/kuma: add support for `client_id` option See https://github.com/prometheus/prometheus/pull/13278	2024-02-18 19:19:40 +02:00
Aliaksandr Valialkin	b564729d75	lib/promscrape: avoid copying labels when -promscrape.dropOriginalLabels command-line flag is set This should save some CPU This regression has been introduced in `487f6380d0` when working on https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5389	2024-02-14 03:25:36 +02:00
helen	c8a96ac241	clean unused code (#5735) Signed-off-by: helen <haitao.zhang@daocloud.io>	2024-01-31 17:50:36 +00:00
Aliaksandr Valialkin	bc7cf4950b	lib/promscrape: use the standard net/http.Client instead of fasthttp.Client for scraping targets in non-streaming mode While fasthttp.Client uses less CPU and RAM when scraping targets with small responses (up to 10K metrics), it doesn't work well when scraping targets with big responses such as kube-state-metrics. In this case it could use big amounts of additional memory comparing to net/http.Client, since fasthttp.Client reads the full response in memory and then tries re-using the large buffer for further scrapes. Additionally, fasthttp.Client-based scraping had various issues with proxying, redirects and scrape timeouts like the following ones: - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1945 - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5425 - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2794 - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1017 This should help reducing memory usage for the case when target returns big response and this response is scraped by fasthttp.Client at first before switching to stream parsing mode for subsequent scrapes. Now the switch to stream parsing mode is performed on the first scrape after reading the response body in memory and noticing that its size exceeds the value passed to -promscrape.minResponseSizeForStreamParse command-line flag. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5567 Overrides https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4931	2024-01-30 18:39:10 +02:00
Aliaksandr Valialkin	c2373a8109	lib/promscrape: fix BenchmarkScrapeWorkScrapeInternal, which has been broken by the commit `65bc460323`	2024-01-30 16:06:06 +02:00
Aliaksandr Valialkin	ac5b740750	lib/promscrape/discovery/kubernetes: typo fix in the comment for ContainerStateTerminated struct This is a follow-up for `ef12598ad4`	2024-01-24 15:06:46 +02:00
Aliaksandr Valialkin	ef12598ad4	lib/promscrape/discovery/kubernetes: do not generate targets for already terminated pods and containers Already terminated pods and containers cannot be scraped and will never resurrect, so there is zero sense in creating scrape targets for them.	2024-01-24 14:57:53 +02:00
Roman Khavronenko	89e3c70ccd	lib/promscrape: respect `0` value for `series_limit` param (#5663 ) * lib/promscrape: respect `0` value for `series_limit` param Respect `0` value for `series_limit` param in `scrape_config` even if global limit was set via `-promscrape.seriesLimitPerTarget`. Previously, `0` value will be ignored in favor of `-promscrape.seriesLimitPerTarget`. This behavior aligns with possibility to override `series_limit` value via relabeling with `__series_limit__` label. Signed-off-by: hagen1778 <roman@victoriametrics.com> * Update docs/CHANGELOG.md --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-23 13:09:14 +02:00
Aliaksandr Valialkin	3449d563bd	all: add up to 10% random jitter to the interval between periodic tasks performed by various components This should smooth CPU and RAM usage spikes related to these periodic tasks, by reducing the probability that multiple concurrent periodic tasks are performed at the same time.	2024-01-22 18:40:32 +02:00
Aliaksandr Valialkin	d3ee3e0ef5	Revert "lib/promscrape: do not store last scrape response when stale markers … (#5577)" This reverts commit `cfec258803`. Reason for revert: the original code already doesn't store the last scrape response when stale markers are disabled. The scrapeWork.areIdenticalSeries() function always returns true if stale markers are disabled. This prevents from storing the last response at scrapeWork.processScrapedData(). It looks like the reverted commit could also return back the issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3660 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5577	2024-01-22 00:43:48 +02:00
Hui Wang	4e3242b02d	lib/promscrape/discovery/kubernetes: fix watcher start order for roles endpoints and endpointslice (#5557 ) * lib/promscrape/discovery/kubernetes: fix watcher start order for roles endpoints and endpointslice Previously the groupWatcher could be mistakenly stopped when requests for pod or services resources take too long. * remove mislead comment * docs/sd_configs.md: mention -promscrape.kubernetes.attachNodeMetadataAll flag in the description for attach_metadata section Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4640 * wip * lib/promscrape/kubernetes: prevent from stopping groupWatcher when there are in-flight apiWatcher.mustStart() calls groupWatcher is stopped if it has zero registered apiWatchers during 14 seconds. But such a groupWatcher can be still in use if apiWatcher for `role: endpoints` or `role: endpointslice` is being registered and the discovery of the associated `pod` and/or `service` objects takes longer than 14 seconds - see the beginning of groupWatcher.startWatchersForRole() function for details. Track the number of in-flight calls to apiWatcher.mustStart() and prevent from stopping the associated groupWatcher if the number of in-flight calls is non-zero. P.S. postponing the discovery of `pod` and/or `service` objects associated with `endpoints` or `endpointslice` roles isn't the best solution, since it slows down initial discovery of `endpoints` and `endpointslice` targets. * typo fix --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-21 23:13:15 +02:00
Aliaksandr Valialkin	1f105dde98	all: allow dynamically reading *AuthKey flag values from files and urls Examples: 1) -metricsAuthKey=file:///abs/path/to/file - reads flag value from the given absolute filepath 2) -metricsAuthKey=file://./relative/path/to/file - reads flag value from the given relative filepath 3) -metricsAuthKey=http://some-host/some/path?query_arg=abc - reads flag value from the given url The flag value is automatically updated when the file contents changes.	2024-01-21 22:03:38 +02:00
Aliaksandr Valialkin	4eb9926125	lib/promscrape: code cleanup: send stale markers immediately after generating automatic metrics This cleanup has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5557/files#diff-6b205cf6637d7b65a5c45d9417d08822d4efad94227268cb196f61aa2a0fc0f7	2024-01-21 05:18:22 +02:00
Aliaksandr Valialkin	12f2c5679b	all: consistently clear prompbmarshal.Label by assigning an empty struct instead of zeroing Name and Value individually	2024-01-21 05:11:05 +02:00
Aliaksandr Valialkin	7fba73ce11	lib/promscrape/discovery/kubernetes: add -promscrape.kubernetes.attachNodeMetadataAll command-line flag This flag allows setting attach_metadata.node=true for all the kubernetes_sd_configs defined at -promscrape.config Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4640 Thanks to wasim-nihal for the initial implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5593	2024-01-21 03:13:56 +02:00
Aliaksandr Valialkin	74448a7e57	lib/promscrape/discovery/hetzner: follow-up after `03a97dc678` - docs/sd_configs.md: moved hetzner_sd_configs docs to the correct place according to alphabetical order of SD names, document missing __meta_hetzner_role label. - lib/promscrape/config.go: added missing MustStop() call for Hetzner SD, and moved the code to the correct place according to alphabetical order of SD names. - lib/promscrape/discovery/hetzner: properly handle pagination for hloud API responses, populate missing __meta_hetzner_role label like Prometheus does. - Properly populate __meta_hetzner_public_ipv6_network label like Prometheus does. - Remove unused SDConfig.Token. - Remove "omitempty" annotation from SDConfig.Role field, since this field is mandatory. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5550 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3154	2024-01-20 17:01:53 +02:00
Hui Wang	cfec258803	lib/promscrape: do not store last scrape response when stale markers … (#5577) * lib/promscrape: do not store last scrape response when stale markers are disabled * update changelog	2024-01-20 00:53:41 +08:00
Aliaksandr Valialkin	41932db848	lib/promscrape: cosmetic changes after `3ac44baebe` - Rename mustLoadScrapeConfigFiles() to loadScrapeConfigFiles(), since now it may return error. - Split too long line with the error message into two lines in order to improve readability a bit. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5508 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5560	2024-01-16 22:29:09 +02:00
Hui Wang	3ac44baebe	exit vmagent if there is config syntax error in `scrape_config_files` when `-promscrape.config.strictParse=true` (#5560)	2024-01-16 17:30:02 +08:00
Aliaksandr Valialkin	4b42c8abbb	lib/promscrape/discovery/hetzner: fix golangci-lint warnings after `03a97dc678` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5550	2024-01-15 17:12:40 +02:00
Aleksandr Stepanov	03a97dc678	vmagent: added hetzner sd config (#5550 ) * added hetzner robot and hetzner cloud sd configs * remove gettoken fun and update docs * Updated CHANGELOG and vmagent docs * Updated CHANGELOG and vmagent docs --------- Co-authored-by: Nikolay <nik@victoriametrics.com>	2024-01-15 10:13:22 +01:00
Aliaksandr Valialkin	d2c94a0663	lib/prompbmarshal: switch to github.com/VictoriaMetrics/easyproto	2024-01-14 23:04:45 +02:00
Aliaksandr Valialkin	5a88bc973f	all: use Gauge instead of Counter for `*_config_last_reload_successful` metrics This allows exposing the correct TYPE metadata for these labels when the app runs with -metrics.exposeMetadata command-line flag. See https://github.com/VictoriaMetrics/metrics/pull/61#issuecomment-1860085508 for more details. This is follow-up for `326a77c697`	2023-12-20 14:23:42 +02:00
hagen1778	e0fc5ef140	lib/promscrape: cosmetic changes after `e373bb84d5` * fix typos in docs * add `shard-` prefix to generated links when `-promscrape.cluster.memberURLTemplate` is enabled Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-12-12 11:28:18 +01:00
Aliaksandr Valialkin	b05e1512d4	lib/promscrape: add a warning when the /service-discovery page contains incomplete list of dropped targets	2023-12-08 19:03:51 +02:00
Aliaksandr Valialkin	e373bb84d5	lib/promscrape: add `-promscrape.cluster.memberURLTemplate` command-line flag for creating direct links to vmagent instances at /service-discovery page See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4018#issuecomment-1843811569	2023-12-07 16:04:21 +02:00
Aliaksandr Valialkin	7cb8ed8271	lib/promscrape: show -promscrape.cluster.memberNum values for vmagent instances, which scrape the given dropped target at /service-discovery page The /service-discovery page contains the list of all the discovered targets after the commit `487f6380d0` on all the vmagent instances in cluster mode ( https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets ). This commit improves debuggability of targets in cluster mode by providing a list of -promscrape.cluster.memberNum values per each target at /service-discovery page, which has been dropped because of sharding, e.g. if this target is scraped by other vmagent instances in the cluster. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5389 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4018	2023-12-07 00:05:32 +02:00
Aliaksandr Valialkin	67468a0c46	lib/promscrape: show `never scraped` message for never scraped targets at /targets page	2023-12-06 22:33:39 +02:00
Aliaksandr Valialkin	65bc460323	lib/promscrape: follow-up for `97373b7786` Substitute O(N^2) algorithm for exposing the `vm_promscrape_scrape_pool_targets` metric with O(N) algorithm, where N is the number of scrape jobs. The previous algorithm could slow down /metrics exposition significantly when -promscrape.config contains thousands of scrape jobs. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5311 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5335	2023-12-06 17:35:50 +02:00
Hui Wang	97373b7786	vmagent: add `vm_promscrape_scrape_pool_targets` for scrape jobs like… (#5335 ) * vmagent: export `vm_promscrape_scrape_pool_targets` metric to track the number of targets that each scrape_job discovers * add extra panel for new metric	2023-12-06 15:44:39 +08:00
Aliaksandr Valialkin	487f6380d0	lib/promscrape: show dropped targets because of sharding at /service-discovery page Previously the /service-discovery page didn't show targets dropped because of sharding ( https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets ). Show also the reason why every target is dropped at /service-discovery page. This should improve debugging why particular targets are dropped. While at it, do not remove dropped targets from the list at /service-discovery page until the total number of targets exceeds the limit passed to -promscrape.maxDroppedTargets . Previously the list was cleaned up every 10 minutes from the entries, which weren't updated for the last minute. This could complicate debugging of dropped targets. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5389	2023-12-01 16:48:48 +02:00
Aliaksandr Valialkin	5034aa0773	app/vmagent: follow-up for `090cb2c9de` - Add Try* prefix to functions, which return bool result in order to improve readability and reduce the probability of missing check for the result returned from these functions. - Call the adjustSampleValues() only once on input samples. Previously it was called on every attempt to flush data to persistent queue. - Properly restore the initial state of WriteRequest passed to tryPushWriteRequest() before returning from this function after unsuccessful push to persistent queue. Previously a part of WriteRequest samples may be lost in such case. - Add -remoteWrite.dropSamplesOnOverload command-line flag, which can be used for dropping incoming samples instead of returning 429 Too Many Requests error to the client when -remoteWrite.disableOnDiskQueue is set and the remote storage cannot keep up with the data ingestion rate. - Add vmagent_remotewrite_samples_dropped_total metric, which counts the number of dropped samples. - Add vmagent_remotewrite_push_failures_total metric, which counts the number of unsuccessful attempts to push data to persistent queue when -remoteWrite.disableOnDiskQueue is set. - Remove vmagent_remotewrite_aggregation_metrics_dropped_total and vm_promscrape_push_samples_dropped_total metrics, because they are replaced with vmagent_remotewrite_samples_dropped_total metric. - Update 'Disabling on-disk persistence' docs at docs/vmagent.md - Update stale comments in the code Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5088 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110	2023-11-25 12:09:44 +02:00
Nikolay	090cb2c9de	app/vmagent: allow to disabled on-disk persistence (#5088) * app/vmagent: allow to disabled on-disk queue Previously, it wasn't possible to build data processing pipeline with a chain of vmagents. In case when remoteWrite for the last vmagent in the chain wasn't accessible, it persisted data only when it has enough disk capacity. If disk queue is full, it started to silently drop ingested metrics. New flags allows to disable on-disk persistent and immediately return an error if remoteWrite is not accessible anymore. It blocks any writes and notify client, that data ingestion isn't possible. Main use case for this feature - use external queue such as kafka for data persistence. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110 * adds test, updates readme * apply review suggestions * update docs for vmagent * makes linter happy --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-11-24 13:42:11 +01:00
Aliaksandr Valialkin	613b545dfd	lib/promscrape/discovery/kubernetes: propagate possible errors at newAPIWatcher() to the caller This allows substituting FATAL panics with recoverable runtime errors such as missing or invalid TLS CA file and/or missing/invalid /var/run/secrets/kubernetes.io/serviceaccount/namespace file. Now these errors are logged instead of PANIC'ing, so they can be fixed by updating the corresponding files without the need to restart vmagent. This is a follow-up for `90427abc65` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5243	2023-10-27 20:24:46 +02:00
Hui Wang	90427abc65	lib/promscrape/discovery/kubernetes: avoid possible panic if given caFile under kubernetes.SDConfig.HTTPClientConfig is not exist (#5243) follow up `d5a599badc`	2023-10-27 20:20:22 +02:00
Aliaksandr Valialkin	632d788b63	lib/promscrape/discovery/kubernetes: stop all the url watchers, which belong to a particular groupWatcher, at once Previously url watchers for pod, service and node objects could be mistakenly closed when service discovery was set up only for endpoints and endpointslice roles, since watchers for these roles may start pod, service and node url watchers with nil apiWatcher passed to groupWatcher.startWatchersForRole(). Now all the url watchers, which belong to a particular groupWatcher, are stopped at once when this groupWatcher has no apiWatcher subscribers. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5216 The issue has been introduced in v1.93.5 when addressing https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4850	2023-10-27 13:51:35 +02:00
Hui Wang	7c90ce39cb	do not print redundant error logs when failed to scrape consul or no… (#5239) * do not print redundant error logs when failed to scrape consul or nomad target prometheus performs the same because it uses consul lib which just drops the error(`1806bcb38c/api/api.go (L1134)`)	2023-10-27 13:31:55 +08:00


677 commits