VictoriaMetrics/lib/promscrape/discovery
Hui Wang 52d0e776ed
lib/promscrape/discovery/kubernetes: fix watcher start order for roles endpoints and endpointslice (#5557)
* lib/promscrape/discovery/kubernetes: fix watcher start order for roles endpoints and endpointslice

Previously, the groupWatcher could be mistakenly stopped when requests for pod or service resources took too long.

* remove misleading comment

* docs/sd_configs.md: mention the -promscrape.kubernetes.attachNodeMetadataAll flag in the description of the attach_metadata section

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4640

* wip

* lib/promscrape/kubernetes: prevent stopping groupWatcher when there are in-flight apiWatcher.mustStart() calls

A groupWatcher is stopped if it has zero registered apiWatchers for 14 seconds.
But such a groupWatcher can still be in use if an apiWatcher for `role: endpoints` or `role: endpointslice`
is being registered and the discovery of the associated `pod` and/or `service` objects takes longer
than 14 seconds - see the beginning of the groupWatcher.startWatchersForRole() function for details.

Track the number of in-flight calls to apiWatcher.mustStart() and prevent the associated groupWatcher
from being stopped while the number of in-flight calls is non-zero.

P.S. Postponing the discovery of `pod` and/or `service` objects associated with the `endpoints` or `endpointslice` roles
isn't the best solution, since it slows down the initial discovery of `endpoints` and `endpointslice` targets.
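
The in-flight tracking described above can be illustrated with a minimal Go sketch. It is not the actual VictoriaMetrics code: the type `groupWatcherSketch`, its fields, and the methods `startWatcher` and `canStop` are hypothetical names chosen for illustration, and the slow discovery of `pod`/`service` objects is modeled as a plain callback.

```go
package main

import (
	"fmt"
	"sync"
)

// groupWatcherSketch is a hypothetical, simplified stand-in for groupWatcher.
// It only models the bookkeeping needed to decide whether stopping is safe.
type groupWatcherSketch struct {
	mu sync.Mutex
	// registered is the number of currently registered watchers.
	registered int
	// inFlightStarts is the number of start calls that have begun but not yet finished.
	inFlightStarts int
}

// startWatcher mimics a mustStart-like call: it marks the call as in-flight,
// runs the (possibly slow) discovery callback, and only then registers the watcher.
func (gw *groupWatcherSketch) startWatcher(discover func()) {
	gw.mu.Lock()
	gw.inFlightStarts++
	gw.mu.Unlock()

	// The lock is not held here, so a periodic stop check may run concurrently.
	discover()

	gw.mu.Lock()
	gw.registered++
	gw.inFlightStarts--
	gw.mu.Unlock()
}

// canStop reports whether the watcher group may be stopped: it must have
// no registered watchers and no start calls still in flight.
func (gw *groupWatcherSketch) canStop() bool {
	gw.mu.Lock()
	defer gw.mu.Unlock()
	return gw.registered == 0 && gw.inFlightStarts == 0
}

func main() {
	gw := &groupWatcherSketch{}
	fmt.Println("idle:", gw.canStop())
	gw.startWatcher(func() {
		// Nothing is registered yet, but the in-flight counter blocks stopping.
		fmt.Println("during start:", gw.canStop())
	})
	fmt.Println("after start:", gw.canStop())
}
```

Without the `inFlightStarts` check, the stop condition would hold for the whole duration of the discovery callback, which is the situation the commit message describes for slow `pod` and `service` requests.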

* typo fix

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-01-22 01:35:51 +02:00
azure lib/promscrape/discovery: close unused HTTP connections to service discovery servers 2023-07-27 14:47:55 -07:00
consul do not print redundant error logs when failed to scrape consul or no… (#5239) 2023-10-27 14:20:21 +02:00
consulagent lib/promscrape/discovery: close unused HTTP connections to service discovery servers 2023-07-27 14:47:55 -07:00
digitalocean lib/promscrape/discovery: close unused HTTP connections to service discovery servers 2023-07-27 14:47:55 -07:00
dns Makefile: update golangci-lint from v1.51.2 to v1.54.2 2023-09-01 11:15:51 +02:00
docker lib/promscrape/discovery: close unused HTTP connections to service discovery servers 2023-07-27 14:47:55 -07:00
dockerswarm lib/promscrape/discovery: close unused HTTP connections to service discovery servers 2023-07-27 14:47:55 -07:00
ec2 Makefile: update golangci-lint from v1.51.2 to v1.54.2 2023-09-01 11:15:51 +02:00
eureka lib/promscrape/discovery: close unused HTTP connections to service discovery servers 2023-07-27 14:47:55 -07:00
gce Makefile: update golangci-lint from v1.51.2 to v1.54.2 2023-09-01 11:15:51 +02:00
http lib/promscrape/discovery: close unused HTTP connections to service discovery servers 2023-07-27 14:47:55 -07:00
kubernetes lib/promscrape/discovery/kubernetes: fix watcher start order for roles endpoints and endpointslice (#5557) 2024-01-22 01:35:51 +02:00
kuma lib/promscrape/discovery: close unused HTTP connections to service discovery servers 2023-07-27 14:47:55 -07:00
nomad do not print redundant error logs when failed to scrape consul or no… (#5239) 2023-10-27 14:20:21 +02:00
openstack lib/promscrape/discovery: close unused HTTP connections to service discovery servers 2023-07-27 14:47:55 -07:00
yandexcloud chore: Use http constants to replace numbers (#3846) 2023-02-22 18:59:32 -08:00