VictoriaMetrics/lib/promscrape
Hui Wang 4e3242b02d
lib/promscrape/discovery/kubernetes: fix watcher start order for roles endpoints and endpointslice (#5557)
* lib/promscrape/discovery/kubernetes: fix watcher start order for roles endpoints and endpointslice

Previously, the groupWatcher could be mistakenly stopped when requests for pod or service resources took too long.

* remove misleading comment

* docs/sd_configs.md: mention -promscrape.kubernetes.attachNodeMetadataAll flag in the description for attach_metadata section

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4640

* wip

* lib/promscrape/kubernetes: prevent groupWatcher from being stopped while there are in-flight apiWatcher.mustStart() calls

A groupWatcher is stopped if it has had zero registered apiWatchers for 14 seconds.
But such a groupWatcher can still be in use if an apiWatcher for `role: endpoints` or `role: endpointslice`
is being registered and the discovery of the associated `pod` and/or `service` objects takes longer
than 14 seconds - see the beginning of the groupWatcher.startWatchersForRole() function for details.

Track the number of in-flight calls to apiWatcher.mustStart() and prevent the associated groupWatcher
from being stopped while that number is non-zero.

P.S. Postponing the discovery of `pod` and/or `service` objects associated with `endpoints` or `endpointslice` roles
isn't the best solution, since it slows down the initial discovery of `endpoints` and `endpointslice` targets.

* typo fix

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-01-21 23:13:15 +02:00
discovery lib/promscrape/discovery/kubernetes: fix watcher start order for roles endpoints and endpointslice (#5557) 2024-01-21 23:13:15 +02:00
discoveryutils lib/promauth: follow-up for e16d3f5639 2023-10-25 23:19:37 +02:00
testdata lib/promscrape: disable support for service discovery and metrics scrape via http2 2023-07-06 16:03:37 -07:00
client.go lib/promauth: follow-up for e16d3f5639 2023-10-25 23:19:37 +02:00
config.go all: allow dynamically reading *AuthKey flag values from files and urls 2024-01-21 22:03:38 +02:00
config_test.go lib/promscrape: show -promscrape.cluster.memberNum values for vmagent instances, which scrape the given dropped target at /service-discovery page 2023-12-07 00:05:32 +02:00
config_timing_test.go lib/promscrape: optimize service discovery speed 2022-11-29 21:26:00 -08:00
relabel_debug.go app/vmselect: small cleanup after 4f3f9950d0 2023-05-08 14:57:11 -07:00
scraper.go lib/promscrape/discovery/hetzner: follow-up after 03a97dc678 2024-01-20 17:01:53 +02:00
scrapework.go lib/promscrape: code cleanup: send stale markers immediately after generating automatic metrics 2024-01-21 05:18:22 +02:00
scrapework_test.go lib/promscrape: follow-up for 97373b7786 2023-12-06 17:35:50 +02:00
scrapework_timing_test.go lib/promscrape: add exported_ prefix to metric names exported by scrape targets if they clash with automatically generated metrics 2022-11-28 18:37:09 -08:00
statconn.go lib/promscrape: do not add a suggestion for enabling TCP6 in error message when the dial address is TCPv4 2023-10-25 17:57:56 -07:00
statconn_test.go lib/promscrape: do not add a suggestion for enabling TCP6 in error message when the dial address is TCPv4 2023-10-25 17:57:56 -07:00
targetstatus.go lib/promscrape: add a warning when the /service-discovery page contains an incomplete list of dropped targets 2023-12-08 19:03:51 +02:00
targetstatus.qtpl lib/promscrape: cosmetic changes after e373bb84d5 2023-12-12 11:28:18 +01:00
targetstatus.qtpl.go lib/promscrape: cosmetic changes after e373bb84d5 2023-12-12 11:28:18 +01:00