github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-11-21 14:44:00 +00:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	7e1dd8ab9d	lib: consistently use atomic.* types instead of atomic.* functions See `ea9e2b19a5`	2024-02-24 02:07:53 +02:00
helen	c8a96ac241	clean unused code (#5735 ) Signed-off-by: helen <haitao.zhang@daocloud.io>	2024-01-31 17:50:36 +00:00
Aliaksandr Valialkin	bc7cf4950b	lib/promscrape: use the standard net/http.Client instead of fasthttp.Client for scraping targets in non-streaming mode While fasthttp.Client uses less CPU and RAM when scraping targets with small responses (up to 10K metrics), it doesn't work well when scraping targets with big responses such as kube-state-metrics. In this case it could use large amounts of additional memory compared to net/http.Client, since fasthttp.Client reads the full response in memory and then tries re-using the large buffer for further scrapes. Additionally, fasthttp.Client-based scraping had various issues with proxying, redirects and scrape timeouts like the following ones: - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1945 - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5425 - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2794 - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1017 This should help reduce memory usage for the case when target returns big response and this response is scraped by fasthttp.Client at first before switching to stream parsing mode for subsequent scrapes. Now the switch to stream parsing mode is performed on the first scrape after reading the response body in memory and noticing that its size exceeds the value passed to -promscrape.minResponseSizeForStreamParse command-line flag. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5567 Overrides https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4931	2024-01-30 18:39:10 +02:00
Aliaksandr Valialkin	74448a7e57	lib/promscrape/discovery/hetzner: follow-up after `03a97dc678` - docs/sd_configs.md: moved hetzner_sd_configs docs to the correct place according to alphabetical order of SD names, document missing __meta_hetzner_role label. - lib/promscrape/config.go: added missing MustStop() call for Hetzner SD, and moved the code to the correct place according to alphabetical order of SD names. - lib/promscrape/discovery/hetzner: properly handle pagination for hcloud API responses, populate missing __meta_hetzner_role label like Prometheus does. - Properly populate __meta_hetzner_public_ipv6_network label like Prometheus does. - Remove unused SDConfig.Token. - Remove "omitempty" annotation from SDConfig.Role field, since this field is mandatory. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5550 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3154	2024-01-20 17:01:53 +02:00
Aleksandr Stepanov	03a97dc678	vmagent: added hetzner sd config (#5550 ) * added hetzner robot and hetzner cloud sd configs * remove gettoken fun and update docs * Updated CHANGELOG and vmagent docs * Updated CHANGELOG and vmagent docs --------- Co-authored-by: Nikolay <nik@victoriametrics.com>	2024-01-15 10:13:22 +01:00
Aliaksandr Valialkin	5a88bc973f	all: use Gauge instead of Counter for `*_config_last_reload_successful` metrics This allows exposing the correct TYPE metadata for these labels when the app runs with -metrics.exposeMetadata command-line flag. See https://github.com/VictoriaMetrics/metrics/pull/61#issuecomment-1860085508 for more details. This is follow-up for `326a77c697`	2023-12-20 14:23:42 +02:00
Aliaksandr Valialkin	7cb8ed8271	lib/promscrape: show -promscrape.cluster.memberNum values for vmagent instances, which scrape the given dropped target at /service-discovery page The /service-discovery page contains the list of all the discovered targets after the commit `487f6380d0` on all the vmagent instances in cluster mode ( https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets ). This commit improves debuggability of targets in cluster mode by providing a list of -promscrape.cluster.memberNum values per each target at /service-discovery page, which has been dropped because of sharding, e.g. if this target is scraped by other vmagent instances in the cluster. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5389 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4018	2023-12-07 00:05:32 +02:00
Aliaksandr Valialkin	487f6380d0	lib/promscrape: show dropped targets because of sharding at /service-discovery page Previously the /service-discovery page didn't show targets dropped because of sharding ( https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets ). Show also the reason why every target is dropped at /service-discovery page. This should improve debugging why particular targets are dropped. While at it, do not remove dropped targets from the list at /service-discovery page until the total number of targets exceeds the limit passed to -promscrape.maxDroppedTargets . Previously the list was cleaned up every 10 minutes from the entries, which weren't updated for the last minute. This could complicate debugging of dropped targets. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5389	2023-12-01 16:48:48 +02:00
Aliaksandr Valialkin	5034aa0773	app/vmagent: follow-up for `090cb2c9de` - Add Try* prefix to functions, which return bool result in order to improve readability and reduce the probability of missing check for the result returned from these functions. - Call the adjustSampleValues() only once on input samples. Previously it was called on every attempt to flush data to persistent queue. - Properly restore the initial state of WriteRequest passed to tryPushWriteRequest() before returning from this function after unsuccessful push to persistent queue. Previously a part of WriteRequest samples may be lost in such case. - Add -remoteWrite.dropSamplesOnOverload command-line flag, which can be used for dropping incoming samples instead of returning 429 Too Many Requests error to the client when -remoteWrite.disableOnDiskQueue is set and the remote storage cannot keep up with the data ingestion rate. - Add vmagent_remotewrite_samples_dropped_total metric, which counts the number of dropped samples. - Add vmagent_remotewrite_push_failures_total metric, which counts the number of unsuccessful attempts to push data to persistent queue when -remoteWrite.disableOnDiskQueue is set. - Remove vmagent_remotewrite_aggregation_metrics_dropped_total and vm_promscrape_push_samples_dropped_total metrics, because they are replaced with vmagent_remotewrite_samples_dropped_total metric. - Update 'Disabling on-disk persistence' docs at docs/vmagent.md - Update stale comments in the code Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5088 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110	2023-11-25 12:09:44 +02:00
Nikolay	090cb2c9de	app/vmagent: allow disabling on-disk persistence (#5088) * app/vmagent: allow disabling on-disk queue Previously, it wasn't possible to build a data processing pipeline with a chain of vmagents. When remoteWrite for the last vmagent in the chain wasn't accessible, it persisted data only while it had enough disk capacity. If the disk queue was full, it started to silently drop ingested metrics. New flags allow disabling on-disk persistence and immediately returning an error if remoteWrite is not accessible anymore. It blocks any writes and notifies the client that data ingestion isn't possible. Main use case for this feature - use external queue such as kafka for data persistence. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110 * adds test, updates readme * apply review suggestions * update docs for vmagent * makes linter happy --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-11-24 13:42:11 +01:00
Aliaksandr Valialkin	d5a599badc	lib/promauth: follow-up for `e16d3f5639` - Make sure that invalid/missing TLS CA file or TLS client certificate files at vmagent startup don't prevent from processing the corresponding scrape targets after the file becomes correct, without the need to restart vmagent. Previously scrape targets with invalid TLS CA file or TLS client certificate files were permanently dropped after the first attempt to initialize them, and they didn't appear until the next vmagent reload or the next change in other places of the loaded scrape configs. - Make sure that TLS CA is properly re-loaded from file after it changes without the need to restart vmagent. Previously the old TLS CA was used until vmagent restart. - Properly handle errors during http request creation for the second attempt to send data to remote system at vmagent and vmalert. Previously failed request creation could result in nil pointer dereferencing, since the returned request is nil on error. - Add more context to the logged error during AWS sigv4 request signing before sending the data to -remoteWrite.url at vmagent. Previously it could miss details on the source of the request. - Do not create a new HTTP client per second when generating OAuth2 token needed to put in Authorization header of every http request issued by vmagent during service discovery or target scraping. Re-use the HTTP client instead until the corresponding scrape config changes. - Cache error at lib/promauth.Config.GetAuthHeader() in the same way as the auth header is cached, e.g. the error is cached for a second now. This should reduce load on CPU and OAuth2 server when auth header cannot be obtained because of temporary error. - Share tls.Config.GetClientCertificate function among multiple scrape targets with the same tls_config. Cache the loaded certificate and the error for one second. This should significantly reduce CPU load when scraping big number of targets with the same tls_config. - Allow loading TLS certificates from HTTP and HTTPS urls by specifying these urls at `tls_config->cert_file` and `tls_config->key_file`. - Improve test coverage at lib/promauth - Skip unreachable or invalid files specified at `scrape_config_files` during vmagent startup, since these files may become valid later. Previously vmagent was exiting in this case. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4959	2023-10-25 23:19:37 +02:00
Hui Wang	e16d3f5639	fix inconsistent behaviors with prometheus when scraping (#5153 ) * fix inconsistent behaviors with prometheus when scraping 1. address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4959. skip job with wrong syntax in `scrape_configs` with error logs instead of exiting; 2. show error messages on vmagent /targets ui if there are wrong auth configs in `scrape_configs`, previously will print error logs and do scrape without auth header; 3. don't send requests if there are wrong auth configs in: 1. vmagent remoteWrite; 2. vmalert datasource/remoteRead/remoteWrite/notifier. * add changelogs * address review comments * fix ut	2023-10-17 17:58:19 +08:00
Aliaksandr Valialkin	140e7b6b74	all: replace atomic.Value with atomic.Pointer[T] This eliminates the need in .(*T) casting for results obtained from Load() Leave atomic.Value for map, since atomic.Pointer[map[...]...] makes double pointer to map, because map is already a pointer type.	2023-07-19 17:42:06 -07:00
Alexander Marshalov	2e494e2375	fixed typos in documentation and commandline flags descriptions (#4275 )	2023-05-10 09:50:41 +02:00
Alexander Marshalov	8225a48b56	fixed `vm_promscrape_config_last_reload_successful` metric value recovery after successful reloading with unchanged content (#4260 ) (#4268 ) Signed-off-by: Alexander Marshalov <_@marshalov.org>	2023-05-08 13:32:51 +02:00
Alexander Marshalov	56b84140a9	added new consulagent service discovery (#3953 ) (#4217 )	2023-05-04 11:36:21 +02:00
Aliaksandr Valialkin	d577657fb7	lib/streamaggr: follow-up for `ff72ca14b9` - Make sure that the last successfully loaded config is used on hot-reload failure - Properly cleanup resources occupied by already initialized aggregators when the current aggregator fails to be initialized - Expose distinct vmagent_streamaggr_config_reload* metrics per each -remoteWrite.streamAggr.config This should simplify monitoring and debugging failed reloads - Remove race condition at app/vminsert/common.MustStopStreamAggr when calling sa.MustStop() while sa could be in use at realoadSaConfig() - Remove lib/streamaggr.aggregator.hasState global variable, since it may negatively impact scalability on system with big number of CPU cores at hasState.Store(true) call inside aggregator.Push(). - Remove fine-grained aggregator reload - reload all the aggregators on config change instead. This simplifies the code a bit. The fine-grained aggregator reload may be returned back if there will be demand from real users for it. - Check -relabelConfig and -streamAggr.config files when single-node VictoriaMetrics runs with -dryRun flag - Return back accidentally removed changelog for v1.87.4 at docs/CHANGELOG.md Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3639	2023-03-31 22:30:38 -07:00
Zakhar Bessarab	5fadd58cf6	lib/promscrape: correctly register `vm_promscrape_config_` metrics (#3876 ) lib/promscrape: set `vm_promscrape_config_last_reload_successful` to 1 if there was no promscrape config provided Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/promscrape: register `vm_promscrape_config_*` metrics only in case promscrape config is used Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-02-27 11:53:53 -08:00
Aliaksandr Valialkin	f7ef80aaad	.golangci.yml: properly enable `revive` linter and fix all the warnings it detects	2023-02-26 12:18:59 -08:00
Alexander Marshalov	317fef95f9	add kuma_sd_config for Kuma Control Plane targets discovery (#3389 ) (#3840 )	2023-02-22 13:59:56 +01:00
Zakhar Bessarab	f13a255918	lib/promscrape: fix cancelling in-flight scrape requests during configuration reload (#3791 ) * lib/promscrape: fix cancelling in-flight scrape requests during configuration reload when using `streamParse` mode (see #3747) Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * Update docs/CHANGELOG.md --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-02-09 11:13:06 -08:00
Karan Sharma	48f371a46c	lib/promscrape: add Prometheus-compatible service discovery for Nomad (#3549 ) Add nomad_sd_config support for service discovery	2023-01-05 23:03:58 +01:00
Aliaksandr Valialkin	a8b8e23d68	lib/promscrape: implement target-level and metric-level relabel debugging Target-level debugging is performed by clicking the 'debug' link at the corresponding target on either http://vmagent:8429/targets page or on http://vmagent:8428/service-discovery page. Metric-level debugging is performed at http://vmagent:8429/metric-relabel-debug page. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3407 See https://docs.victoriametrics.com/vmagent.html#relabel-debug	2022-12-10 02:09:44 -08:00
Aliaksandr Valialkin	f325410c26	lib/promscrape: optimize service discovery speed - Return meta-labels for the discovered targets via promutils.Labels instead of map[string]string. This improves the speed of generating meta-labels for discovered targets by up to 5x. - Remove memory allocations in hot paths during ScrapeWork generation. The ScrapeWork contains scrape settings for a single discovered target. This improves the service discovery speed by up to 2x.	2022-11-29 21:26:00 -08:00
Roman Khavronenko	03d88bc066	vmagent: expose metrics for tracking config state (#3375 ) Expose `vm_relabel_config_` and `vm_promscrape_config_` metrics for tracking relabel and scrape configuration hot-reloads. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3345 Signed-off-by: hagen1778 <roman@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-11-22 00:38:43 +02:00
Aliaksandr Valialkin	c3f8481011	lib/promscrape: update links to sd_configs from Prometheus site to https://docs.victoriametrics.com/sd_configs.html	2022-08-15 01:40:20 +03:00
Fury	2c553d5a2f	add support to scrape multi tenant metrics (#2950 ) * add support to scrape multi tenant metrics * add support to scrape multi tenant metrics Co-authored-by: 赵福玉 <zhaofuyu@zhaofuyudeMac-mini.local>	2022-08-08 14:10:18 +03:00
Igor Tiunov	6e5ac32fba	YC service discovery (#2923 ) * YC service discovery https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1386 * Fixed linter suggestions * fixed golint errors	2022-08-04 20:44:16 +03:00
Nikolay	7301aa678c	lib/promscrape: adds azure service discovery (#2743 ) * lib/promscrape: adds azure service discovery Adds azure service discovery mechanism implements authorization with oauth and msi lists virtual machines and virtual machines managed by scaleSet https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1364 * makes linter happy * Apply suggestions from code review Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> * wip Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-07-13 23:43:18 +03:00
ttyv	bdf9f4669a	lib/promscrape: fix vmagent tickerCh reload behaviour (#2786 ) Co-authored-by: Dmitriy <dab@ttyv.ru>	2022-06-30 12:33:01 +02:00
Roman Khavronenko	63b538ecd1	vmagent: update SD duration histogram metric if SD is active (#2677) The change updates the histogram for registering SD update duration only if SD is considered `active`. SD is active if at least one scraper for this SD has started. This change is supposed to reduce the metrics cardinality produced by the duration histogram, which gets updated even if SD isn't configured. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2671 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-06-07 15:46:44 +03:00
Nikolay	26b78ad707	lib/promscrape: adds job restart method (#2455) * lib/promscrape: adds job restart method it must restart only ScrapeConfig with changed content this change greatly reduces the time needed for job restart and it should decrease possible data loss when config is frequently changed at kubernetes based deployments Apply suggestions from code review Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> * wip Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-04-16 20:28:46 +03:00
Aliaksandr Valialkin	f3d4671bb6	lib/promscrape: follow-up after `7e79adfb55`	2022-04-12 12:36:17 +03:00
Nikolay	7e79adfb55	lib/promscrape: allows to use k8s pod name as clusterMemberNum (#2436) * lib/promscrape: allows to use k8s pod name as clusterMemberNum it must improve user experience and simplify clustering scrapers. it must allow to use vmagent cluster with distroless images https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2359 * Apply suggestions from code review Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-04-12 12:24:11 +03:00
Corporte Gadfly	ad6bdd78d0	match fileSDCheckInterval with prometheus file_sd_config default (#2188 )	2022-02-15 12:04:26 +02:00
Aliaksandr Valialkin	2968779f16	lib/promscrape: provide the ability to fetch target responses on behalf of vmagent or single-node VictoriaMetrics This feature may be useful when debugging metrics for the given target located in isolated environment	2022-02-03 19:00:55 +02:00
Aliaksandr Valialkin	e4e36383e2	lib/promscrape: do not send staleness markers on graceful shutdown This follows Prometheus behavior. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2013#issuecomment-1006994079	2022-01-07 01:17:57 +02:00
Aliaksandr Valialkin	06642d97f5	app: allow specifying http and https urls in the following command-line flags * -promscrape.config * -relabelConfig * -remoteWrite.relabelConfig * -remoteWrite.urlRelabelConfig	2021-12-03 00:10:02 +02:00
guidao	f05cddd2fc	fix #1830 (#1861 ) Co-authored-by: wangfeng <wangfeng@zhihu.com>	2021-11-30 01:12:24 +02:00
Aliaksandr Valialkin	cbfc7b7c92	app/{vminsert,vmagent}: hide passwords and auth tokens by default at `/config` page Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1764	2021-11-05 14:41:16 +02:00
Roman Khavronenko	c0a932a55f	lib/promscrape: make errcheck happy (#1703 )	2021-10-13 14:57:30 +03:00
Aliaksandr Valialkin	5a58c041c2	app/vmagent: expose -promscrape.config contents at /config page as Prometheus does See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1695	2021-10-12 16:25:37 +03:00
Aliaksandr Valialkin	10f960fa0c	lib/promscrape: add ability to load scrape configs from multiple files See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1559	2021-08-26 08:51:16 +03:00
Aliaksandr Valialkin	03c959f1df	lib/promscrape: stop scrapers for the removed targets before starting scrapers for the added targets This should prevent from possible time series overlap when old target is substituted by new target (for example, during Kubernetes deployments). Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1526 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1530 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1509	2021-08-17 00:55:51 +03:00
Aliaksandr Valialkin	d826352688	app/vmagent: follow-up after `fe445f753b` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1491	2021-08-05 09:52:32 +03:00
Omar Ghader	46e27d60a6	feature: Add multitenant for vmagent (#1505 ) * feature: Add multitenant for vmagent * Minor fix * Fix rcs index out of range * Minor fix * Fix multi Init * Fix multi Init * Fix multi Init * Add default multi * Adjust naming * Add TenantInserted metrics * Add TenantInserted metrics * fix: remove unused metrics for vmagent * fix: remove unused metrics for vmagent Co-authored-by: mghader <marc.ghader@ubisoft.com> Co-authored-by: Sebastian YEPES <syepes@gmail.com>	2021-08-05 09:52:31 +03:00
Aliaksandr Valialkin	cb5453953f	lib/promscrape: split docker and dockerswarm service discovery code bases, since they have very little in common This is a follow up after `c85a5b7fcb`	2021-06-25 13:20:20 +03:00
Aliaksandr Valialkin	a69045e440	lib/promscrape: consistently sort service discovery routines This should simplify further maintenance of the code	2021-06-25 12:10:46 +03:00
Lu Jiajing	c85a5b7fcb	Support Docker ServiceDiscovery (#1402 ) * add docker discovery * add test * add labels test and add scrape work * remove TODO * refactor to merge apiConfig and sdConfig * apply suggestion	2021-06-25 11:42:47 +03:00
Nikolay	e307bbb29a	adds http_sd (#1399 ) * adds http_sd * adds X-Prometheus-Refresh-Interval-Seconds header * Update lib/promscrape/discovery/http/api.go Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>	2021-06-22 13:33:37 +03:00


97 commits