github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-11-21 14:44:00 +00:00

Author	SHA1	Message	Date
Aleksandr Stepanov	3a6e3adc7d	vmagent: added hetzner sd config (#5550 ) * added hetzner robot and hetzner cloud sd configs * remove gettoken fun and update docs * Updated CHANGELOG and vmagent docs * Updated CHANGELOG and vmagent docs --------- Co-authored-by: Nikolay <nik@victoriametrics.com>	2024-01-16 21:13:20 +02:00
Aliaksandr Valialkin	261c173f4b	all: use Gauge instead of Counter for `*_config_last_reload_successful` metrics This allows exposing the correct TYPE metadata for these labels when the app runs with -metrics.exposeMetadata command-line flag. See https://github.com/VictoriaMetrics/metrics/pull/61#issuecomment-1860085508 for more details. This is follow-up for `326a77c697`	2023-12-20 14:25:44 +02:00
Aliaksandr Valialkin	896a0f32cd	lib/promscrape: show -promscrape.cluster.memberNum values for vmagent instances, which scrape the given dropped target at /service-discovery page The /service-discovery page contains the list of all the discovered targets after the commit `487f6380d0` on all the vmagent instances in cluster mode ( https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets ). This commit improves debuggability of targets in cluster mode by providing a list of -promscrape.cluster.memberNum values per each target at /service-discovery page, which has been dropped becasue of sharding, e.g. if this target is scraped by other vmagent instances in the cluster. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5389 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4018	2023-12-07 00:11:30 +02:00
Aliaksandr Valialkin	b6d6a3a530	lib/promscrape: show dropped targets because of sharding at /service-discovery page Previously the /service-discovery page didn't show targets dropped because of sharding ( https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets ). Show also the reason why every target is dropped at /service-discovery page. This should improve debuging why particular targets are dropped. While at it, do not remove dropped targets from the list at /service-discovery page until the total number of targets exceeds the limit passed to -promscrape.maxDroppedTargets . Previously the list was cleaned up every 10 minutes from the entries, which weren't updated for the last minute. This could complicate debugging of dropped targets. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5389	2023-12-04 17:42:46 +02:00
Aliaksandr Valialkin	2f14394335	app/vmagent: follow-up for `090cb2c9de` - Add Try* prefix to functions, which return bool result in order to improve readability and reduce the probability of missing check for the result returned from these functions. - Call the adjustSampleValues() only once on input samples. Previously it was called on every attempt to flush data to peristent queue. - Properly restore the initial state of WriteRequest passed to tryPushWriteRequest() before returning from this function after unsuccessful push to persistent queue. Previously a part of WriteRequest samples may be lost in such case. - Add -remoteWrite.dropSamplesOnOverload command-line flag, which can be used for dropping incoming samples instead of returning 429 Too Many Requests error to the client when -remoteWrite.disableOnDiskQueue is set and the remote storage cannot keep up with the data ingestion rate. - Add vmagent_remotewrite_samples_dropped_total metric, which counts the number of dropped samples. - Add vmagent_remotewrite_push_failures_total metric, which counts the number of unsuccessful attempts to push data to persistent queue when -remoteWrite.disableOnDiskQueue is set. - Remove vmagent_remotewrite_aggregation_metrics_dropped_total and vm_promscrape_push_samples_dropped_total metrics, because they are replaced with vmagent_remotewrite_samples_dropped_total metric. - Update 'Disabling on-disk persistence' docs at docs/vmagent.md - Update stale comments in the code Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5088 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110	2023-11-25 12:13:39 +02:00
Nikolay	25ac2aac31	app/vmagent: allow to disabled on-disk persistence (#5088 ) * app/vmagent: allow to disabled on-disk queue Previously, it wasn't possible to build data processing pipeline with a chain of vmagents. In case when remoteWrite for the last vmagent in the chain wasn't accessible, it persisted data only when it has enough disk capacity. If disk queue is full, it started to silently drop ingested metrics. New flags allows to disable on-disk persistent and immediatly return an error if remoteWrite is not accessible anymore. It blocks any writes and notify client, that data ingestion isn't possible. Main use case for this feature - use external queue such as kafka for data persistence. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110 * adds test, updates readme * apply review suggestions * update docs for vmagent * makes linter happy --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-11-25 12:12:29 +02:00
Aliaksandr Valialkin	f03e81c693	lib/promauth: follow-up for `e16d3f5639` - Make sure that invalid/missing TLS CA file or TLS client certificate files at vmagent startup don't prevent from processing the corresponding scrape targets after the file becomes correct, without the need to restart vmagent. Previously scrape targets with invalid TLS CA file or TLS client certificate files were permanently dropped after the first attempt to initialize them, and they didn't appear until the next vmagent reload or the next change in other places of the loaded scrape configs. - Make sure that TLS CA is properly re-loaded from file after it changes without the need to restart vmagent. Previously the old TLS CA was used until vmagent restart. - Properly handle errors during http request creation for the second attempt to send data to remote system at vmagent and vmalert. Previously failed request creation could result in nil pointer dereferencing, since the returned request is nil on error. - Add more context to the logged error during AWS sigv4 request signing before sending the data to -remoteWrite.url at vmagent. Previously it could miss details on the source of the request. - Do not create a new HTTP client per second when generating OAuth2 token needed to put in Authorization header of every http request issued by vmagent during service discovery or target scraping. Re-use the HTTP client instead until the corresponding scrape config changes. - Cache error at lib/promauth.Config.GetAuthHeader() in the same way as the auth header is cached, e.g. the error is cached for a second now. This should reduce load on CPU and OAuth2 server when auth header cannot be obtained because of temporary error. - Share tls.Config.GetClientCertificate function among multiple scrape targets with the same tls_config. Cache the loaded certificate and the error for one second. This should significantly reduce CPU load when scraping big number of targets with the same tls_config. - Allow loading TLS certificates from HTTP and HTTPs urls by specifying these urls at `tls_config->cert_file` and `tls_config->key_file`. - Improve test coverage at lib/promauth - Skip unreachable or invalid files specified at `scrape_config_files` during vmagent startup, since these files may become valid later. Previously vmagent was exitting in this case. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4959	2023-10-26 09:55:47 +02:00
Hui Wang	d7dd7614eb	fix inconsistent behaviors with prometheus when scraping (#5153 ) * fix inconsistent behaviors with prometheus when scraping 1. address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4959. skip job with wrong syntax in `scrape_configs` with error logs instead of exiting; 2. show error messages on vmagent /targets ui if there are wrong auth configs in `scrape_configs`, previously will print error logs and do scrape without auth header; 3. don't send requests if there are wrong auth configs in: 1. vmagent remoteWrite; 2. vmalert datasource/remoteRead/remoteWrite/notifier. * add changelogs * address review comments * fix ut	2023-10-26 08:56:54 +02:00
Aliaksandr Valialkin	992c300ce9	all: replace atomic.Value with atomic.Pointer[T] This eliminates the need in .(*T) casting for results obtained from Load() Leave atomic.Value for map, since atomic.Pointer[map[...]...] makes double pointer to map, because map is already a pointer type.	2023-07-19 17:48:26 -07:00
Alexander Marshalov	d321ea91f2	fixed typos in documentation and commandline flags descriptions (#4275 )	2023-05-10 02:22:06 -07:00
Alexander Marshalov	de68e94c91	fixed `vm_promscrape_config_last_reload_successful` metric value recovery after successful reloading with unchanged content (#4260 ) (#4268 ) Signed-off-by: Alexander Marshalov <_@marshalov.org>	2023-05-09 22:17:27 -07:00
Alexander Marshalov	26fc4afff8	added new consulagent service discovery (#3953 ) (#4217 )	2023-05-08 23:43:59 -07:00
Aliaksandr Valialkin	dad13c0a91	lib/streamaggr: follow-up for `ff72ca14b9` - Make sure that the last successfully loaded config is used on hot-reload failure - Properly cleanup resources occupied by already initialized aggregators when the current aggregator fails to be initialized - Expose distinct vmagent_streamaggr_config_reload* metrics per each -remoteWrite.streamAggr.config This should simplify monitoring and debugging failed reloads - Remove race condition at app/vminsert/common.MustStopStreamAggr when calling sa.MustStop() while sa could be in use at realoadSaConfig() - Remove lib/streamaggr.aggregator.hasState global variable, since it may negatively impact scalability on system with big number of CPU cores at hasState.Store(true) call inside aggregator.Push(). - Remove fine-grained aggregator reload - reload all the aggregators on config change instead. This simplifies the code a bit. The fine-grained aggregator reload may be returned back if there will be demand from real users for it. - Check -relabelConfig and -streamAggr.config files when single-node VictoriaMetrics runs with -dryRun flag - Return back accidentally removed changelog for v1.87.4 at docs/CHANGELOG.md Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3639	2023-03-31 22:54:10 -07:00
Zakhar Bessarab	1db010797e	lib/promscrape: correctly register `vm_promscrape_config_` metrics (#3876 ) lib/promscrape: set `vm_promscrape_config_last_reload_successful` to 1 if there was no promscrape config provided Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/promscrape: register `vm_promscrape_config_*` metrics only in case promscrape config is used Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-02-27 12:06:49 -08:00
Aliaksandr Valialkin	18dd0d1dbf	.golangci.yml: properly enable `revive` linter and fix all the warnings it detects	2023-02-26 12:19:58 -08:00
Alexander Marshalov	173643a771	add kuma_sd_config for Kuma Control Plane targets discovery (#3389 ) (#3840 )	2023-02-22 17:41:43 -08:00
Zakhar Bessarab	bbf663bd04	lib/promscrape: fix cancelling in-flight scrape requests during configuration reload (#3791 ) * lib/promscrape: fix cancelling in-flight scrape requests during configuration reload when using `streamParse` mode (see #3747) Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * Update docs/CHANGELOG.md --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-02-09 11:18:36 -08:00
Karan Sharma	8f42c5a024	lib/promscrape: add Prometheus-compatible service discovery for Nomad (#3549 ) Add nomad_sd_config support for service discovery	2023-01-05 18:07:02 -08:00
Aliaksandr Valialkin	97b41e727c	lib/promscrape: implement target-level and metric-level relabel debugging Target-level debugging is performed by clicking the 'debug' link at the corresponding target on either http://vmagent:8429/targets page or on http://vmagent:8428/service-discovery page. Metric-level debugging is perfromed at http://vmagent:8429/metric-relabel-debug page. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3407 See https://docs.victoriametrics.com/vmagent.html#relabel-debug	2022-12-10 02:25:56 -08:00
Aliaksandr Valialkin	be6da5053f	lib/promscrape: optimize service discovery speed - Return meta-labels for the discovered targets via promutils.Labels instead of map[string]string. This improves the speed of generating meta-labels for discovered targets by up to 5x. - Remove memory allocations in hot paths during ScrapeWork generation. The ScrapeWork contains scrape settings for a single discovered target. This improves the service discovery speed by up to 2x.	2022-11-29 21:26:23 -08:00
Roman Khavronenko	d1169c1559	vmagent: expose metrics for tracking config state (#3375 ) Expose `vm_relabel_config_` and `vm_promscrape_config_` metrics for tracking relabel and scrape configuration hot-reloads. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3345 Signed-off-by: hagen1778 <roman@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-11-22 00:48:12 +02:00
Aliaksandr Valialkin	c27bd63f6c	lib/promscrape: update links to sd_configs from Prometheus site to https://docs.victoriametrics.com/sd_configs.html	2022-08-15 01:40:48 +03:00
Fury	59fdb4cb72	add support to scrape multi tenant metrics (#2950 ) * add support to scrape multi tenant metrics * add support to scrape multi tenant metrics Co-authored-by: 赵福玉 <zhaofuyu@zhaofuyudeMac-mini.local>	2022-08-08 14:49:15 +03:00
Igor Tiunov	0ba86fe87e	YC service discovery (#2923 ) * YC service discovery https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1386 * Fixed linter suggestions * fixed golint errors	2022-08-04 22:28:20 +03:00
Nikolay	c007b129cb	lib/promscrape: adds azure service discovery (#2743 ) * lib/promscrape: adds azure service discovery Adds azure service discovery mechanism implements authorization with oauth and msi lists virtual machines and virtual machines managed by scaleSet https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1364 * makes linter happy * Apply suggestions from code review Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> * wip Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-07-13 23:45:43 +03:00
ttyv	00956e585d	lib/promscrape: fix vmagent tickerCh reload behaviour (#2786 ) Co-authored-by: Dmitriy <dab@ttyv.ru>	2022-06-30 13:52:44 +03:00
Roman Khavronenko	2b5e1dee91	vmagent: update SD duration histogram metric if SD is active (#2677 ) The change updates histogram for registering SD update duration only SD is considered as `active`. SD is active if at least one scraper for this SD has started. This change supposed to reduce metrics cardinality produced by duration histogram which gets updated even if SD isn't configured. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2671 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-06-07 15:53:06 +03:00
Nikolay	628905f080	lib/promscrape: adds job restart method (#2455 ) * lib/promscrape: adds job restart method it must restart only ScrapeConfig with changed content this change greatly reduce time, that needed for job restart and it should decrease possible data loss when config frequently changed at kubernetes based deployments Apply suggestions from code review Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> * wip Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-04-16 20:29:33 +03:00
Aliaksandr Valialkin	70ad171070	lib/promscrape: follow-up after `7e79adfb55`	2022-04-12 12:37:03 +03:00
Nikolay	e26bcb8bbb	lib/promscrape: allows to use k8s pod name as clusterMemberNum (#2436 ) * lib/promscrape: allows to use k8s pod name as clusterMemberNum it must improve user expirience and simplify clustering scrapers. it must allow to use vmagent cluster with distroless images https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2359 * Apply suggestions from code review Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-04-12 12:37:02 +03:00
Corporte Gadfly	cf8df24227	match fileSDCheckInterval with prometheus file_sd_config default (#2188 )	2022-02-15 12:05:57 +02:00
Aliaksandr Valialkin	678b3e71db	lib/promscrape: provide the ability to fetch target responses on behalf of vmagent or single-node VictoriaMetrics This feature may be useful when debugging metrics for the given target located in isolated environment	2022-02-03 19:02:12 +02:00
Aliaksandr Valialkin	fa89f3e5a5	lib/promscrape: do not send staleness markers on graceful shutdown This follows Prometheus behavior. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2013#issuecomment-1006994079	2022-01-07 01:19:06 +02:00
Aliaksandr Valialkin	d40441947a	app: allow specifying http and https urls in the following command-line flags * -promscrape.config * -relabelConfig * -remoteWrite.relabelConfig * -remoteWrite.urlRelabelConfig	2021-12-03 00:11:47 +02:00
guidao	6fa7ad69fc	fix #1830 (#1861 ) Co-authored-by: wangfeng <wangfeng@zhihu.com>	2021-11-30 01:16:15 +02:00
Aliaksandr Valialkin	847004fa77	app/{vminsert,vmagent}: hide passwords and auth tokens by default at `/config` page Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1764	2021-11-05 14:42:13 +02:00
Roman Khavronenko	5dab25e8ad	lib/promscrape: make errcheck happy (#1703 )	2021-10-13 15:11:45 +03:00
Aliaksandr Valialkin	aeedfe2fe2	app/vmagent: expose -promscrape.config contents at /config page as Prometheus does See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1695	2021-10-12 16:27:37 +03:00
Aliaksandr Valialkin	7fdb4db73d	lib/promscrape: add ability to load scrape configs from multiple files See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1559	2021-08-26 08:51:53 +03:00
Aliaksandr Valialkin	db34c40aec	lib/promscrape: stop scrapers for the removed targets before starting scrapers for the added targets This should prevent from possible time series overlap when old target is substituted by new target (for example, during Kubernetes deployments). Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1526 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1530 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1509	2021-08-17 01:00:40 +03:00
Aliaksandr Valialkin	b877538622	app/vmagent: follow-up after `fe445f753b` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1491	2021-08-05 09:51:00 +03:00
Omar Ghader	fe445f753b	feature: Add multitenant for vmagent (#1505 ) * feature: Add multitenant for vmagent * Minor fix * Fix rcs index out of range * Minor fix * Fix multi Init * Fix multi Init * Fix multi Init * Add default multi * Adjust naming * Add TenantInserted metrics * Add TenantInserted metrics * fix: remove unused metrics for vmagent * fix: remove unused metrics for vmagent Co-authored-by: mghader <marc.ghader@ubisoft.com> Co-authored-by: Sebastian YEPES <syepes@gmail.com>	2021-08-05 09:44:29 +03:00
Aliaksandr Valialkin	97d1ccfc8e	lib/promscrape: split docker and dockerswarm service discovery code bases, since they have very little in common This is a follow up after `c85a5b7fcb`	2021-06-25 13:22:16 +03:00
Aliaksandr Valialkin	4461e20e7d	lib/promscrape: consistently sort service discovery routines This should simplify further maintenance of the code	2021-06-25 13:22:16 +03:00
Lu Jiajing	12b4cbb68f	Support Docker ServiceDiscovery (#1402 ) * add docker discovery * add test * add labels test and add scrape work * remove TODO * refactor to merge apiConfig and sdConfig * apply suggestion	2021-06-25 13:22:16 +03:00
Nikolay	e03a3d3a36	adds http_sd (#1399 ) * adds http_sd * adds X-Prometheus-Refresh-Interval-Seconds header * Update lib/promscrape/discovery/http/api.go Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>	2021-06-22 13:42:09 +03:00
Nikolay	e42da47608	adds digital ocean sd (#1376 ) * adds digital ocean sd config * adds digital ocean sd https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1367 * typo fix	2021-06-14 13:19:29 +03:00
Aliaksandr Valialkin	d77db9d813	all: do not skip SIGHUP signal during service initialization This can lead to stale or incomplete configs like in the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240	2021-05-21 16:38:20 +03:00
Aliaksandr Valialkin	6dc5d3b357	all: rename https://victoriametrics.github.io to https://docs.victoriametrics.com	2021-04-20 20:20:01 +03:00
Aliaksandr Valialkin	7eca60694e	lib/promscrape/discovery/kubernetes: reduce CPU time spent on registering big number of Kubernetes objects shared among big number of scrape jobs Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182	2021-04-05 22:05:02 +03:00

1 2

93 commits