VictoriaMetrics/lib/promscrape/discovery/kubernetes
Hui Wang 4e3242b02d
lib/promscrape/discovery/kubernetes: fix watcher start order for roles endpoints and endpointslice (#5557)
* lib/promscrape/discovery/kubernetes: fix watcher start order for roles endpoints and endpointslice

Previously the groupWatcher could be mistakenly stopped when requests for pod or service resources took too long.

* remove misleading comment

* docs/sd_configs.md: mention -promscrape.kubernetes.attachNodeMetadataAll flag in the description for attach_metadata section

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4640

* wip

* lib/promscrape/kubernetes: prevent stopping groupWatcher when there are in-flight apiWatcher.mustStart() calls

A groupWatcher is stopped if it has had zero registered apiWatchers for 14 seconds.
But such a groupWatcher can still be in use if an apiWatcher for `role: endpoints` or `role: endpointslice`
is being registered and the discovery of the associated `pod` and/or `service` objects takes longer
than 14 seconds - see the beginning of the groupWatcher.startWatchersForRole() function for details.

Track the number of in-flight calls to apiWatcher.mustStart() and do not stop the associated groupWatcher
while the number of in-flight calls is non-zero.

P.S. postponing the discovery of `pod` and/or `service` objects associated with `endpoints` or `endpointslice` roles
isn't the best solution, since it slows down initial discovery of `endpoints` and `endpointslice` targets.

* typo fix

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-01-21 23:13:15 +02:00
testdata lib/promscrape/discovery/kubernetes: follow-up after 0b5c874911 (#2672) 2022-06-01 20:44:45 +02:00
api.go lib/promscrape/discovery/kubernetes: propagate possible errors at newAPIWatcher() to the caller 2023-10-27 20:24:46 +02:00
api_watcher.go lib/promscrape/discovery/kubernetes: fix watcher start order for roles endpoints and endpointslice (#5557) 2024-01-21 23:13:15 +02:00
api_watcher_test.go lib/promscrape: optimize service discovery speed 2022-11-29 21:26:00 -08:00
common_types.go lib/promscrape: optimize service discovery speed 2022-11-29 21:26:00 -08:00
endpoints.go Add endpoint labels for pod targets discovered from endpoint but has different ports (#4253) 2023-05-05 15:46:07 +04:00
endpoints_test.go lib/promscrape/discovery/kubernetes: follow-up for d5e94721db (#4255) 2023-05-05 14:41:17 +02:00
endpointslice.go Add endpoint labels for pod targets discovered from endpoint but has different ports (#4253) 2023-05-05 15:46:07 +04:00
endpointslice_test.go lib/promscrape/discovery/kubernetes: follow-up for d5e94721db (#4255) 2023-05-05 14:41:17 +02:00
ingress.go Makefile: update golangci-lint from v1.51.2 to v1.54.2 2023-09-01 10:16:42 +02:00
ingress_test.go lib/promscrape: optimize service discovery speed 2022-11-29 21:26:00 -08:00
kubeconfig.go all: allow dynamically reading *AuthKey flag values from files and urls 2024-01-21 22:03:38 +02:00
kubeconfig_test.go lib/promscrape/discovery/kubernetes/kubeconfig_test.go: make TestParseKubeConfigSuccess test code easier to follow 2023-10-25 23:17:18 +02:00
kubernetes.go lib/promscrape/discovery/kubernetes: add -promscrape.kubernetes.attachNodeMetadataAll command-line flag 2024-01-21 03:13:56 +02:00
node.go Makefile: update golangci-lint from v1.51.2 to v1.54.2 2023-09-01 10:16:42 +02:00
node_test.go lib/promscrape: optimize service discovery speed 2022-11-29 21:26:00 -08:00
pod.go lib/promscrape/discovery/kubernetes: follow-up for d5e94721db (#4255) 2023-05-05 14:41:17 +02:00
pod_test.go lib/promscrape/discovery/kubernetes: add support for __meta_kubernetes_pod_container_id 2023-01-27 16:34:06 -08:00
pod_timing_test.go all: consistently use %w instead of %s when error is passed to fmt.Errorf() 2023-10-25 21:24:03 +02:00
service.go Makefile: update golangci-lint from v1.51.2 to v1.54.2 2023-09-01 10:16:42 +02:00
service_test.go lib/promscrape: optimize service discovery speed 2022-11-29 21:26:00 -08:00