Aliaksandr Valialkin
b186b63e07
lib/promscrape/discovery/kubernetes: allow attaching node-level labels to role: endpoints
and role: endpointlice
targets in the same way as Prometheus does
...
See https://github.com/prometheus/prometheus/pull/10759
2022-07-06 23:18:59 +03:00
Aliaksandr Valialkin
1c4f67c5d2
lib/promauth: add ability to send additional http headers in requests to scrape targets
...
This solves https://stackoverflow.com/questions/66032498/prometheus-scrape-metric-with-custom-header
2022-06-22 20:39:43 +03:00
Roman Khavronenko
d5eb6afe26
lib/promscrape/discovery/kubernetes: fixes kubernetes service discovery ( #2615 )
...
* lib/promscrape/discovery/kubernetes: properly updates discovered scrape works
previously, added or updated scrapeworks may override previuosly
discovered.
it happens because swosByKey may contain small subset of kubernetes
objects with it's labels.
It happens for objectsUpdated and objectsAdded maps, which include only changed elements
* Properly calculate vm_promscrape_discovery_kubernetes_scrape_works
Co-authored-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-05-21 01:01:37 +03:00
Aliaksandr Valialkin
123aa4c79e
lib/promscrape: properly implement ScrapeConfig.clone()
...
Previously ScrapeConfig.clone() was improperly copying promauth.Secret fields -
their contents was replaced with `<secret>` value.
This led to inability to use passwords and secrets in `-promscrape.config` file.
The bug has been introduced in v1.77.0 in the commit 67b10896d2
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2551
2022-05-07 00:05:40 +03:00
Aliaksandr Valialkin
6f79b2b68b
lib/promscrape/discovery/kubernetes: limit the minimum sleep time between updating dependent ScrapeWork objects
...
Previously the sleep time could be dropped to nanoseconds, which could result in CPU time waste
2022-04-22 23:14:17 +03:00
Aliaksandr Valialkin
15190fcdae
lib/promscrape/discovery/kubernetes: allow attaching node-level labels and annotations to discovered pod targets in the same way as Prometheus 2.35 does
...
See https://github.com/prometheus/prometheus/issues/9510
and https://github.com/prometheus/prometheus/pull/10080
2022-04-22 20:15:41 +03:00
Aliaksandr Valialkin
57a0aa204d
lib/promscrape/discovery/kubernetes: improve the performance of urlWatcher.reloadObjects() on multi-CPU systems
...
Parallelize the generation of ScrapeWork objects there. Previously they were generated in a single goroutine.
2022-04-22 13:22:01 +03:00
Aliaksandr Valialkin
167d1bea8f
lib/promscrape/discovery/kubernetes: properly update endpoints and endpointslice objects when the related pod or service objects are updated
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
This is a follow-up for 2341bd48d7
2022-04-21 13:06:22 +03:00
Aliaksandr Valialkin
a2de31f8d3
lib/promscrape/discovery/kubernetes: do not pre-allocate memory for ScrapeWork objects
...
There is high chance that ScrapeWork objects won't be generated because of relabeling
2022-04-20 16:40:25 +03:00
Aliaksandr Valialkin
2341bd48d7
lib/promscrape: follow-up after 91e290a8ff
2022-04-20 16:11:37 +03:00
Nikolay
91e290a8ff
lib/promscrape: reduce latency for k8s GetLabels ( #2454 )
...
replaces internStringMap with sync.Map - it greatly reduces lock contention
concurently reload scrape work for api watcher - each object labels added by dedicated CPU
changes can be tested with following script https://gist.github.com/f41gh7/6f8f8d8719786aff1f18a85c23aebf70
2022-04-20 16:09:40 +03:00
Aliaksandr Valialkin
f6d0e5e74a
all: typo fix: Kuberntes -> Kubernetes
2022-04-20 10:50:49 +03:00
Aliaksandr Valialkin
355a63733d
lib/promscrape/discovery/kubernetes: add the ability to limit service discovery to the current namespace
...
See https://github.com/prometheus/prometheus/issues/9782 and https://github.com/prometheus/prometheus/pull/9881
2022-01-13 22:44:35 +02:00
Aliaksandr Valialkin
5c63d69454
lib/promscrape/discovery/kubernetes: return back support role: endpointslices
, since it is used by VictoriaMetrics operator
...
This is a follow up commit after 31b42b30b6
2021-08-29 12:37:03 +03:00
Aliaksandr Valialkin
31b42b30b6
lib/promscrape/discovery/kubernetes: rename role: endpointslices
to role: endpointslice
to be consistent with Prometheus
...
See 2ec6c7dbb8/discovery/kubernetes/kubernetes.go (L99)
2021-08-29 11:23:08 +03:00
Aliaksandr Valialkin
2e001db4de
lib/promscrape/discovery/kubernetes: use v1 API instead of v1beta1 API for role: ingress
and role: endpointslices
...
This should fix service discovery for these roles in Kubernetes v1.22 and newer versions.
See https://kubernetes.io/docs/reference/using-api/deprecation-guide/#ingress-v122
The corresponding change in Prometheus - https://github.com/prometheus/prometheus/pull/9205
2021-08-29 11:16:59 +03:00
Aliaksandr Valialkin
22585531ad
lib/promscrape/discovery/kubernetes: make golangci-lint
happy by removing empty branches
2021-05-20 12:00:29 +03:00
Aliaksandr Valialkin
eb8093ca6b
lib/promscrape/discovery/kubernetes: reload objects on object parse error
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-18 23:25:48 +03:00
Aliaksandr Valialkin
0cf1fe19e6
lib/promscrape/discovery/kubernetes: simplify the reload logic for urlWatcher.objectsByKey
2021-05-18 15:40:46 +03:00
Aliaksandr Valialkin
bfb61de606
lib/promscrape/discovery/kubernetes: properly update vm_promscrape_discovery_kubernetes_scrape_works metric
...
Previously it wasn't descreased during config update.
2021-05-18 15:33:45 +03:00
Aliaksandr Valialkin
3166994244
lib/promscrape/discovery/kubernetes: log errors and stop service discovery when unexpected updates are received from Kubernetes API server
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-18 15:10:48 +03:00
Aliaksandr Valialkin
2eed410466
lib/promscrape/discovery/kubernetes: key ScrapeWork objects by urlWatcher instead of namespace
...
This makes the code less fragile if urlWatcher would depend on additional to namepsace properties.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-17 23:47:08 +03:00
Aliaksandr Valialkin
733706e6c6
lib/promscrape: reload auth tokens from files every second
...
Previously auth tokens were loaded at startup and couldn't be updated without vmagent restart.
Now there is no need in vmagent restart.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1297
2021-05-14 20:00:08 +03:00
Aliaksandr Valialkin
10a47af631
app/{vmalert,vmauth}: explicitly set MaxIdleConnsPerHost in net/http.Client.Transport
...
By default MaxIdleConnsPerHost is set to 2. This limits the possibility to re-use http keep-alive connections.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1300
2021-05-14 18:12:24 +03:00
Aliaksandr Valialkin
cfd6aa28e1
lib/promscrape/discovery/kubernetes: refresh endpoints and endpointslices scrape targets every 5 seconds, since they may depend on changed service and pod objects
...
This should make endpoints and endpointslices scrape targets eventually consistent with the maximum delay of 5 seconds after the related service or pod object changes.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-12 14:10:34 +03:00
Aliaksandr Valialkin
904bbffc7f
lib/promscrape/discovery/kubernetes: start watchers for pods and services before starting watchers for endpoints
...
This should eliminate possible race when an update on endpoints depends on pods and/or services, which are missing in the cache yet.
This could result in missing targets based on endpoints or endpointslices.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-05 12:23:50 +03:00
Aliaksandr Valialkin
2ab1266593
lib/promscrape/discovery/kubernetes: remove a mutex at urlWatcher - use groupWatcher mutex for accessing all the urlWatcher children
...
This simplifies the code a bit and reduces the probability of improper mutex handling and deadlocks.
2021-04-29 10:14:26 +03:00
Aliaksandr Valialkin
2ec7d8b384
lib/promscrape/discovery/kubernetes: fix a deadlock introduced in eddba29664
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
Thanks to @f41gh7 for providing the initial idea for deadlock fix at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1248
2021-04-27 14:57:51 +03:00
Aliaksandr Valialkin
eddba29664
lib/promscrape/discovery/kubernetes: refresh role: endpoints
targets on service object removal as Prometheus does
...
This is a follow-up for ae37cfd528
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-04-23 20:26:57 +03:00
Aliaksandr Valialkin
ae37cfd528
lib/promscrape/discovery/kubernetes: refresh endpoints and endpointslices targets on service object update like Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-04-23 20:11:40 +03:00
Aliaksandr Valialkin
59f9960992
lib/promscrape/discovery: remove superflouos check in registerPendingAPIWatchers
...
The check `_, ok := uw.aws[aw]; !ok` isn't needed, since aw cannot exist in uw.aws
because of the check inside subscribeAPIWatcher
2021-04-07 13:07:39 +03:00
Aliaksandr Valialkin
3ec6639bbb
lib/promscrape/discovery/kubernetes: register pending apiWatchers in uw.aws
2021-04-06 11:12:13 +03:00
Lu Jiajing
b59164cf33
fix access to nil *url.URL ( #1180 )
...
* fix access to nil *url.URL
Signed-off-by: Megrez Lu <lujiajing1126@gmail.com>
* Update lib/promscrape/discovery/kubernetes/api_watcher.go
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-04-05 22:25:31 +03:00
Aliaksandr Valialkin
b46194472f
lib/promscrape/discovery/kubernetes: reduce CPU time spent on registering big number of Kubernetes objects shared among big number of scrape jobs
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182
2021-04-05 22:04:30 +03:00
Aliaksandr Valialkin
a51d0ec6ec
lib/promscrape/discovery/kubernetes: load objects missing in local cache from api seriver in getObjectByRole()
...
This should fix possible race for `role: endpoints` and `role: endpointslices` service discovery,
when the referred `pod` and `service` objects aren't propagated to urlWatcher cache yet.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182#issuecomment-813353359 for details.
2021-04-05 20:31:17 +03:00
Aliaksandr Valialkin
f010d773d6
lib/promscrape/discovery/kubernetes: synchronously load Kubernetes objects on first access
...
Remove async registration of apiWatchers, since it breaks discovering `role: endpoints` and `role: endpointslices` targets,
which depend on pod and service objects.
There is no need in reloading `endpoints` and `endpointslices` targets if the referenced `pod` or `service` objects change,
since in this case the corresponding `endpoints` and `endpointslices` objects should also change because they contain
ResourceVersion of the referenced `pod` or `service` objects, which is modified on object update.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182
2021-04-05 14:20:12 +03:00
Aliaksandr Valialkin
5b08e6fb16
lib/promscrape/discovery/kubernetes: properly track objects with the same names in multiple namespaces
...
This is a follow-up for 12e4785fe8
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
2021-04-02 14:45:32 +03:00
Aliaksandr Valialkin
12e4785fe8
lib/promscrape/discovery/kubernetes: properly discover targets in multiple namespaces
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
2021-04-02 14:28:30 +03:00
Aliaksandr Valialkin
f39c84b21f
lib/promscrape/discovery/kubernetes: typo fix in error message
2021-03-26 12:46:14 +02:00
Aliaksandr Valialkin
9761ffd161
lib/promscrape/discovery/kubernetes: properly handle too old resource version
error message from Kubernetes watch API
2021-03-26 12:28:10 +02:00
Aliaksandr Valialkin
b88806ecbf
lib/promscrape/discovery/kubernetes: do not start object watcher until initial objects are loaded
2021-03-14 21:55:00 +02:00
Aliaksandr Valialkin
d409898515
lib/promscrape/discovery/kubernetes: further optimize kubernetes service discovery for the case with many scrape jobs
...
Do not re-calculate labels per each scrape job - reuse them instead for scrape jobs with identical Kubernetes role
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
2021-03-14 21:14:53 +02:00
Aliaksandr Valialkin
7a16e8e3a2
lib/promscrape/discovery: fixes after 133b288681
...
- Removed a deadlock in addAPIWatcher
- Do not create unused ScrapeWork objects
- Do not spend CPU resources on creating objectByKey map in addAPIWatcher
This work is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1125
2021-03-13 15:18:51 +02:00
Aliaksandr Valialkin
def014eb75
lib/promscrape/discovery/kubernetes: remove debug lines left after the commit 133b288681
2021-03-12 11:22:33 +02:00
Aliaksandr Valialkin
133b288681
lib/promscrape/discovery/kubernetes: use a single watcher per apiURL
...
Previously multiple scrape jobs could create multiple watchers for the same apiURL. Now only a single watcher is used.
This should reduce load on Kubernetes API server when many scrape job configs use Kubernetes service discovery.
2021-03-11 16:43:04 +02:00
Aliaksandr Valialkin
bebcb8130c
lib/promscrape/discovery/kubernetes: localize Bookmark parsing code
...
This is a follow-up for e772d1c920
2021-03-11 13:08:08 +02:00
Aliaksandr Valialkin
e772d1c920
lib/promscrape/discovery/kubernetes: reduce load on Kubernetes API server by using watch bookmarks
...
This allows continuing object watch from the last bookbark instead of reloading all the objects
on watch errors or timeouts.
See https://kubernetes.io/docs/reference/using-api/api-concepts/#watch-bookmarks
2021-03-10 15:06:35 +02:00
Aliaksandr Valialkin
4b18a4f026
lib/promscrape/discovery/kubernetes: remove too verbose logs about starting and stopping the watchers
...
Log the number of objects loaded per each watch url
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
2021-03-09 15:05:14 +02:00
Aliaksandr Valialkin
ab4f090c63
lib/promscrape/discovery/kubernetes: reduce memory usage further when big number of scrape jobs are configured for the same kubernetes_sd_config
role
...
Serialize reloading per-role objects, so they don't occupy too much memory when objects for many scrape jobs are simultaneously refreshed.
Do not reload per-role objects if they were already refreshed by concurrent goroutines. This should reduce load on Kubernetes API server
when big number of scrape jobs are configured for the same Kubernetes role.
This is a follow-up for 17b87725ed
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
2021-03-07 19:51:03 +02:00
Aliaksandr Valialkin
17b87725ed
lib/promscrape/discovery/kubernetes: reduce memory usage when Kubernetes service discovery is configured on a big number of scrape jobs
...
Previously vmagent was creating a separate Kubernetes object cache per each scrape job.
This could result in increased memory usage when monitoring a Kubernetes cluster with big number of objects (pods / nodes / services, etc.)
as seen at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
Now it uses a shared map of scrape objects across multiple scrape jobs.
2021-03-05 17:29:55 +02:00