Commit graph

126 commits

Author SHA1 Message Date
Aliaksandr Valialkin
e08287f017 lib/promscrape: reload auth tokens from files every second
Previously auth tokens were loaded at startup and couldn't be updated without vmagent restart.
Now there is no need for a vmagent restart.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1297
2021-05-14 20:03:35 +03:00
Aliaksandr Valialkin
a6cb4f10a7 app/{vmalert,vmauth}: explicitly set MaxIdleConnsPerHost in net/http.Client.Transport
By default MaxIdleConnsPerHost is set to 2. This limits the possibility to re-use http keep-alive connections.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1300
2021-05-14 18:13:34 +03:00
Aliaksandr Valialkin
027607db3e lib/promscrape/discovery/kubernetes: refresh endpoints and endpointslices scrape targets every 5 seconds, since they may depend on changed service and pod objects
This should make endpoints and endpointslices scrape targets eventually consistent with the maximum delay of 5 seconds after the related service or pod object changes.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-12 14:12:43 +03:00
Aliaksandr Valialkin
e6c19cb09d lib/promscrape/discovery/kubernetes: start watchers for pods and services before starting watchers for endpoints
This should eliminate a possible race when an update on endpoints depends on pods and/or services that are not in the cache yet.
Such a race could result in missing targets based on endpoints or endpointslices.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-05 12:23:16 +03:00
Aliaksandr Valialkin
421a92983a lib/promscrape/discovery/kubernetes: remove a mutex at urlWatcher - use groupWatcher mutex for accessing all the urlWatcher children
This simplifies the code a bit and reduces the probability of improper mutex handling and deadlocks.
2021-04-29 10:17:45 +03:00
Nikolay
535b3ff618 vmagent kubernetes_sd tests (#1253)
* first part of tests for kubernetes sd

* makes linter happy

* added more test cases

* adds pub/sub for tests
2021-04-29 10:17:45 +03:00
Aliaksandr Valialkin
b3da457629 lib/promscrape/discovery/kubernetes: fix a deadlock introduced in eddba29664
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240

Thanks to @f41gh7 for providing the initial idea for deadlock fix at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1248
2021-04-27 14:59:56 +03:00
Aliaksandr Valialkin
34321e5f8d lib/promscrape/discovery/kubernetes: refresh role: endpoints targets on service object removal as Prometheus does
This is a follow-up for ae37cfd528

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-04-23 20:27:29 +03:00
Aliaksandr Valialkin
db27dbab5e lib/promscrape/discovery/kubernetes: refresh endpoints and endpointslices targets on service object update like Prometheus does
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-04-23 20:12:22 +03:00
Aliaksandr Valialkin
02b83e0957 lib/promscrape/discovery: remove superfluous check in registerPendingAPIWatchers
The check `_, ok := uw.aws[aw]; !ok` isn't needed, since aw cannot exist in uw.aws
because of the check inside subscribeAPIWatcher
2021-04-07 13:10:04 +03:00
Aliaksandr Valialkin
db56ee0e28 lib/promscrape/discovery/kubernetes: register pending apiWatchers in uw.aws 2021-04-06 11:11:53 +03:00
Aliaksandr Valialkin
edd66b7e82 lib/promscrape/discovery/kubernetes: remove superfluous mustStart and mustStop functions 2021-04-05 22:43:49 +03:00
Lu Jiajing
4ee6def68b fix access to nil *url.URL (#1180)
* fix access to nil *url.URL

Signed-off-by: Megrez Lu <lujiajing1126@gmail.com>

* Update lib/promscrape/discovery/kubernetes/api_watcher.go

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-04-05 22:26:43 +03:00
Aliaksandr Valialkin
7eca60694e lib/promscrape/discovery/kubernetes: reduce CPU time spent on registering big number of Kubernetes objects shared among big number of scrape jobs
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182
2021-04-05 22:05:02 +03:00
Aliaksandr Valialkin
9da2ef3d8f lib/promscrape/discovery/kubernetes: load objects missing in local cache from API server in getObjectByRole()
This should fix possible race for `role: endpoints` and `role: endpointslices` service discovery,
when the referred `pod` and `service` objects aren't propagated to urlWatcher cache yet.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182#issuecomment-813353359 for details.
2021-04-05 20:31:22 +03:00
Aliaksandr Valialkin
fe084fdd33 lib/promscrape/discovery/kubernetes: synchronously load Kubernetes objects on first access
Remove async registration of apiWatchers, since it breaks discovering `role: endpoints` and `role: endpointslices` targets,
which depend on pod and service objects.

There is no need to reload `endpoints` and `endpointslices` targets if the referenced `pod` or `service` objects change,
since in this case the corresponding `endpoints` and `endpointslices` objects should also change because they contain
ResourceVersion of the referenced `pod` or `service` objects, which is modified on object update.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182
2021-04-05 14:37:07 +03:00
Aliaksandr Valialkin
87700f1259 lib/promscrape: add support for authorization config in -promscrape.config as Prometheus 2.26 does
See https://github.com/prometheus/prometheus/pull/8512
2021-04-02 21:20:37 +03:00
Aliaksandr Valialkin
245eba8896 lib/promscrape/discovery/kubernetes: properly track objects with the same names in multiple namespaces
This is a follow-up for 12e4785fe8

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
2021-04-02 14:46:34 +03:00
Aliaksandr Valialkin
eee860f83d lib/promscrape/discovery/kubernetes: properly discover targets in multiple namespaces
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
2021-04-02 14:29:24 +03:00
Aliaksandr Valialkin
7d87d42a91 lib/promscrape/discovery/kubernetes: typo fix in error message 2021-03-26 12:46:33 +02:00
Aliaksandr Valialkin
a920e71809 lib/promscrape/discovery/kubernetes: properly handle too old resource version error message from Kubernetes watch API 2021-03-26 12:28:35 +02:00
Aliaksandr Valialkin
894246176f lib/promscrape/discovery/kubernetes: do not start object watcher until initial objects are loaded 2021-03-14 21:56:16 +02:00
Aliaksandr Valialkin
b0b28eeb93 lib/promscrape/discovery/kubernetes: further optimize kubernetes service discovery for the case with many scrape jobs
Do not re-calculate labels per each scrape job - reuse them instead for scrape jobs with identical Kubernetes role
2021-03-14 21:16:41 +02:00
Aliaksandr Valialkin
620f05cd2c lib/promscrape/discovery: fixes after 133b288681
- Removed a deadlock in addAPIWatcher
- Do not create unused ScrapeWork objects
- Do not spend CPU resources on creating objectByKey map in addAPIWatcher

This work is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1125
2021-03-13 15:22:38 +02:00
Aliaksandr Valialkin
8fc29ffc67 lib/promscrape/discovery/kubernetes: use a single watcher per apiURL
Previously multiple scrape jobs could create multiple watchers for the same apiURL. Now only a single watcher is used.
This should reduce load on Kubernetes API server when many scrape job configs use Kubernetes service discovery.
2021-03-11 17:04:14 +02:00
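
The idea of sharing one watcher per apiURL can be sketched as a mutex-protected registry keyed by URL. This is only an illustration under stated assumptions: the `watcherRegistry` type, its `get` method, the `refs` field and the example URL are hypothetical and do not mirror the actual vmagent code.

```go
package main

import (
	"fmt"
	"sync"
)

// urlWatcher is a hypothetical stand-in for a watcher bound to one API URL.
type urlWatcher struct {
	apiURL string
	refs   int // number of scrape jobs sharing this watcher
}

// watcherRegistry hands out at most one urlWatcher per apiURL.
type watcherRegistry struct {
	mu sync.Mutex
	m  map[string]*urlWatcher
}

func newWatcherRegistry() *watcherRegistry {
	return &watcherRegistry{m: make(map[string]*urlWatcher)}
}

// get returns the existing watcher for apiURL or creates it on first use.
func (r *watcherRegistry) get(apiURL string) *urlWatcher {
	r.mu.Lock()
	defer r.mu.Unlock()
	uw := r.m[apiURL]
	if uw == nil {
		uw = &urlWatcher{apiURL: apiURL}
		r.m[apiURL] = uw
	}
	uw.refs++
	return uw
}

func main() {
	r := newWatcherRegistry()
	a := r.get("https://k8s.example/api/v1/pods")
	b := r.get("https://k8s.example/api/v1/pods")
	fmt.Println(a == b, a.refs, len(r.m))
}
```

Two scrape jobs asking for the same URL receive the same watcher, so only one watch connection to the API server would be needed for both.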
Aliaksandr Valialkin
41f641b132 lib/promscrape/discovery/kubernetes: localize Bookmark parsing code
This is a follow-up for e772d1c920
2021-03-11 13:08:56 +02:00
Aliaksandr Valialkin
6c9cd3f7c1 lib/promscrape/discovery/kubernetes: reduce load on Kubernetes API server by using watch bookmarks
This allows continuing object watch from the last bookmark instead of reloading all the objects
on watch errors or timeouts.

See https://kubernetes.io/docs/reference/using-api/api-concepts/#watch-bookmarks
2021-03-10 15:08:40 +02:00
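
A rough sketch of how a watch client might track bookmarks, based on the event shape described in the Kubernetes documentation linked above (an event with `"type": "BOOKMARK"` whose object carries only `metadata.resourceVersion`). The type and function names are hypothetical and are not taken from the vmagent code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// watchEvent is the envelope of a single line in a Kubernetes watch stream.
type watchEvent struct {
	Type   string          `json:"type"`
	Object json.RawMessage `json:"object"`
}

// bookmarkObject holds the only field a bookmark event is expected to carry.
type bookmarkObject struct {
	Metadata struct {
		ResourceVersion string `json:"resourceVersion"`
	} `json:"metadata"`
}

// nextResourceVersion returns the resourceVersion to resume the watch from.
// It is updated only by BOOKMARK events; other events leave it unchanged.
func nextResourceVersion(current string, line []byte) (string, error) {
	var ev watchEvent
	if err := json.Unmarshal(line, &ev); err != nil {
		return current, fmt.Errorf("cannot parse watch event: %w", err)
	}
	if ev.Type != "BOOKMARK" {
		return current, nil
	}
	var bm bookmarkObject
	if err := json.Unmarshal(ev.Object, &bm); err != nil {
		return current, fmt.Errorf("cannot parse bookmark object: %w", err)
	}
	return bm.Metadata.ResourceVersion, nil
}

func main() {
	line := []byte(`{"type":"BOOKMARK","object":{"kind":"Pod","apiVersion":"v1","metadata":{"resourceVersion":"12746"}}}`)
	rv, err := nextResourceVersion("10000", line)
	fmt.Println(rv, err)
}
```

After a watch error or timeout, the client could re-issue the watch request with the stored resourceVersion rather than listing all objects again.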
Aliaksandr Valialkin
7b66c8cbf8 lib/promscrape/discovery/kubernetes: remove too verbose logs about starting and stopping the watchers
Log the number of objects loaded per each watch url

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
2021-03-09 15:07:12 +02:00
Aliaksandr Valialkin
c04505e585 lib/promscrape/discovery/kubernetes: reduce memory usage further when big number of scrape jobs are configured for the same kubernetes_sd_config role
Serialize reloading per-role objects, so they don't occupy too much memory when objects for many scrape jobs are simultaneously refreshed.
Do not reload per-role objects if they were already refreshed by concurrent goroutines. This should reduce load on Kubernetes API server
when big number of scrape jobs are configured for the same Kubernetes role.

This is a follow-up for 17b87725ed

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
2021-03-07 20:03:22 +02:00
Aliaksandr Valialkin
5807ff57f3 lib/promscrape/discovery/kubernetes: reduce memory usage when Kubernetes service discovery is configured on a big number of scrape jobs
Previously vmagent was creating a separate Kubernetes object cache per each scrape job.
This could result in increased memory usage when monitoring a Kubernetes cluster with big number of objects (pods / nodes / services, etc.)
as seen at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113

Now it uses a shared map of scrape objects across multiple scrape jobs.
2021-03-05 17:32:33 +02:00
Aliaksandr Valialkin
92ddb8f197 lib/promscrape/discovery/kubernetes: move apiWatcher code to a separate file 2021-03-05 17:32:32 +02:00
Aliaksandr Valialkin
bae7a1b47a lib/promscrape/discovery/kubernetes: fix tests after e154f4a644 2021-03-03 22:42:04 +02:00
Nikolay
7d92ef3acd Fix ingress discovery api (#1110) 2021-03-03 10:45:50 +02:00
Aliaksandr Valialkin
25f453ce1a lib/promscrape/discovery/kubernetes: properly check for nil pointer inside interface
See https://mangatmodi.medium.com/go-check-nil-interface-the-right-way-d142776edef1

This fixes a panic when the ScrapeWork is filtered out in swcFunc.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1108
2021-03-03 10:42:54 +02:00
Aliaksandr Valialkin
ac5c47a9f5 lib/promscrape/discovery: properly track vm_promscrape_discovery_kubernetes_objects_removed_total metric 2021-03-02 18:33:29 +02:00
Aliaksandr Valialkin
f9c1fe3852 lib/promscrape/discovery/kubernetes: cache ScrapeWork objects as soon as the corresponding k8s objects are changed
This should reduce CPU usage and memory usage when Kubernetes contains tens of thousands of objects
2021-03-02 16:44:19 +02:00
Aliaksandr Valialkin
b89a4fac2f lib/promscrape/discovery/kubernetes: deflake tests; a follow-up for 05fb08713c 2021-03-01 14:31:44 +02:00
Aliaksandr Valialkin
c3bf72992f lib/promscrape: explicitly stop and cleanup service discovery routines when new config is read from -promscrape.config
This should reduce memory usage when `-promscrape.config` file frequently changes
2021-03-01 14:15:16 +02:00
Aliaksandr Valialkin
3e44d9947e lib/promscrape/discovery/kubernetes: properly account the number of objects when watcher is stopped
A follow-up for b21b110b7a
2021-02-28 17:06:49 +02:00
Aliaksandr Valialkin
0ef7a94056 lib/promscrape/discovery/kubernetes: add vm_promscrape_discovery_kubernetes_* metrics for monitoring internal state of k8s service discovery 2021-02-28 16:58:45 +02:00
Aliaksandr Valialkin
f52bdbe2a3 lib/promscrape/discovery/kubernetes: remove resourceVersionMatch=NotOlderThan query arg when watching for k8s object changes, since it cannot be used when watch=1 query arg is passed 2021-02-28 16:08:43 +02:00
Aliaksandr Valialkin
5c9e657808 lib/promscrape/discovery/kubernetes: fix deadlock in startWatcherForURL
reloadObjects must be called without holding aw.mu lock
2021-02-28 15:25:33 +02:00
Aliaksandr Valialkin
e77f2f8630 lib/promscrape/discovery/kubernetes: typo fix after 241ffd1f3b 2021-02-28 15:15:27 +02:00
Aliaksandr Valialkin
82441537ff lib/promscrape/discovery/kubernetes: pre-populate labelsByKey in reloadObject() 2021-02-28 15:09:43 +02:00
Aliaksandr Valialkin
e003453941 lib/promscrape/discovery/kubernetes: compare sorted sets of labels in tests
This should deflake tests where the order of labels isn't stable
2021-02-28 14:12:32 +02:00
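
One way to make such a comparison order-independent is to sort copies of both label sets before comparing them. The sketch below assumes a simple name/value label type; `label`, `sortLabels` and `equalLabelSets` are hypothetical names, not the helpers used in the actual tests.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// label is a hypothetical name/value pair.
type label struct{ Name, Value string }

// sortLabels orders labels by name, then by value, in place.
func sortLabels(ls []label) {
	sort.Slice(ls, func(i, j int) bool {
		if ls[i].Name != ls[j].Name {
			return ls[i].Name < ls[j].Name
		}
		return ls[i].Value < ls[j].Value
	})
}

// equalLabelSets compares two label sets regardless of element order.
// It sorts copies, so the callers' slices are left untouched.
func equalLabelSets(a, b []label) bool {
	a = append([]label(nil), a...)
	b = append([]label(nil), b...)
	sortLabels(a)
	sortLabels(b)
	return reflect.DeepEqual(a, b)
}

func main() {
	got := []label{{"job", "x"}, {"__address__", "10.0.0.1:80"}}
	want := []label{{"__address__", "10.0.0.1:80"}, {"job", "x"}}
	fmt.Println(equalLabelSets(got, want))
}
```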
Aliaksandr Valialkin
6a21ef87b7 lib/promscrape: add missing startWatchersForRole() call at the beginning of apiWatcher.getLabelsForRole 2021-02-28 14:00:00 +02:00
Aliaksandr Valialkin
6d0e7fb8b0 lib/promscrape/discovery/kubernetes: reload k8s resources on every error
This is needed for obtaining fresh resourceVersion
2021-02-27 01:46:59 +02:00
Aliaksandr Valialkin
fa3ce450fb lib/promscrape: cache ScrapeWork
This should reduce the time needed for updating big number of scrape targets.
2021-02-26 21:43:41 +02:00
Aliaksandr Valialkin
efcdf613c2 lib/promscrape/discovery/kubernetes: cache target labels
This should reduce CPU usage on repeated SDConfig.GetLabels() calls.
2021-02-26 20:24:29 +02:00
Aliaksandr Valialkin
22822feea3 lib/promscrape/discovery/kubernetes: errcheck fix 2021-02-26 19:09:12 +02:00
Aliaksandr Valialkin
dc8c045378 lib/promscrape: cleanup after 9b2246c29b
Main points:

* Revert changes outside lib/promscrape/discovery/kubernetes. These changes can be applied later in a separate commit
* Minimize changes in lib/promscrape/discovery/kubernetes compared to a93e644001
* Corner case fixes.
2021-02-26 19:09:12 +02:00
Nikolay
cf9262b01f vmagent kubernetes watch stream discovery. (#1082)
* started work on sd for k8s

* continue work on watch sd

* fixes

* continue work

* continue work on sd k8s

* disable gzip

* fixes typos

* log error

* minor fix

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-02-26 19:09:12 +02:00
Aliaksandr Valialkin
4cfac70cde lib/promscrape: remove duplicate code a bit 2021-02-26 15:54:07 +02:00
Aliaksandr Valialkin
fccb481de2 lib/promscrape/discovery/kubernetes: add __meta_kubernetes_endpoints_label_* and __meta_kubernetes_endpoints_annotation_* labels to role: endpoints
This syncs kubernetes SD with Prometheus 2.25
See 617c56f55a
2021-02-15 02:51:36 +02:00
Nikolay
dd2f815204 fixes kubernetes_sd (#983)
* fixes kubernetes_sd,
adds missing service metadata for pod ports without endpoint
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/982

* fix test
2020-12-24 11:34:12 +02:00
Aliaksandr Valialkin
367fc17933 lib/promscrape: code prettifying for 8dd03ecf19 2020-12-24 10:57:20 +02:00
Nikolay
b00f7816e2 adds proxy_url support, (#980)
* adds proxy_url support,
adds proxy_url to the dockerswarm, eureka, kubernetes and consul service discovery,
adds proxy_url to the scrape_config for target scraping,
HTTP-based proxy is supported at the moment,
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/503

* fixes imports
2020-12-24 10:57:19 +02:00
Vasily
8ba168f3be Add omitempty for DisableCompression and DisableKeepAlive fields in ScrapeConfig (#796)
* Add omitempty for DisableCompression and DisableKeepAlive fields in ScrapeConfig

* Add omitempty annotation to all the default/optional values

* Fix annotations after review
2020-11-13 16:17:03 +02:00
Aliaksandr Valialkin
b4efe626d7 lib/promscrape/discovery/kubernetes: go fmt 2020-11-07 13:04:09 +02:00
Aliaksandr Valialkin
92bc1afcee lib/promscrape/discovery/kubernetes: reduce memory usage for labels when discovering big number of scrape targets by using string concatenation instead of fmt.Sprintf
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825
2020-11-07 13:03:01 +02:00
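
The difference between the two approaches can be sketched as follows. Both helpers are hypothetical and build the same label name; the `fmt.Sprintf` variant passes its arguments as interface values through the formatting machinery, while plain concatenation typically allocates only the resulting string. The label prefix is used purely as an example.

```go
package main

import "fmt"

// labelNameSprintf builds a label name via fmt.Sprintf (the costlier form).
func labelNameSprintf(prefix, name string) string {
	return fmt.Sprintf("%s_%s", prefix, name)
}

// labelNameConcat builds the same label name via plain concatenation.
func labelNameConcat(prefix, name string) string {
	return prefix + "_" + name
}

func main() {
	a := labelNameSprintf("__meta_kubernetes_pod_label", "app")
	b := labelNameConcat("__meta_kubernetes_pod_label", "app")
	fmt.Println(a == b, b)
}
```

The actual savings depend on the workload; `testing.AllocsPerRun` or a benchmark with `-benchmem` can be used to measure them.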
Aliaksandr Valialkin
114cf24b43 lib/promscrape/discovery/dns: add __meta_dns_srv_record_target and __meta_dns_srv_record_port labels
This syncs dns service discovery with Prometheus 2.21 - see https://github.com/prometheus/prometheus/releases
and https://github.com/prometheus/prometheus/pull/7678 .
2020-09-11 21:35:39 +03:00
Nikolay Khramchikhin
af994562c8 Added endpointslices discovery to k8s api (#760)
This is similar to https://github.com/prometheus/prometheus/pull/6838 , which will be added in Prometheus v2.21.
See https://github.com/prometheus/prometheus/releases/tag/v2.21.0-rc.1

* Added endpointslices discovery to k8s api

Starting from Kubernetes 1.17, endpointslices are in beta;
they allow querying the Kubernetes API for endpoints more efficiently.
They are exposed in scrape_config.yaml as a separate role for kubernetes_sd_config:
kubernetes_sd_config:
- role: endpointslices

* fixed typos, changed EndpointConditions signature - with values instead of pointers
2020-09-11 12:24:50 +03:00
Aliaksandr Valialkin
d962568e93 all: use %w instead of %s for wrapping errors in fmt.Errorf
This will simplify examining the returned errors such as httpserver.ErrorWithStatusCode .
See https://blog.golang.org/go1.13-errors for details.
2020-06-30 23:33:46 +03:00
Aliaksandr Valialkin
d54a93fc81 app/vmagent: fix scraping mTLS targets, which has been broken in v1.35.1
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/470
2020-05-12 17:23:43 +03:00
Aliaksandr Valialkin
89aa6dbf56 lib/promscrape: add Prometheus-compatible service discovery for Consul aka consul_sd_configs
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/330
2020-05-04 20:53:06 +03:00
Aliaksandr Valialkin
ed91fe1d9b lib/promscrape: move common code for discovery api config map handling into discoveryutils 2020-05-04 20:52:58 +03:00
Aliaksandr Valialkin
c50fd219dc lib/promscrape/discovery/kubernetes/: unify apiConfig creation 2020-05-04 20:52:53 +03:00
Aliaksandr Valialkin
b6d88bac04 vendor: use github.com/VictoriaMetrics/fasthttp instead of github.com/fasthttp/fasthttp
The upstream fasthttp may contain issues like 996610f021 ,
plus code that isn't used by VictoriaMetrics. So let's use a private copy under our control instead.
2020-04-29 16:43:09 +03:00
Aliaksandr Valialkin
069690e3bd lib/promscrape: initial implementation for gce_sd_configs aka Prometheus-compatible service discovery for Google Compute Engine 2020-04-24 17:53:43 +03:00
Aliaksandr Valialkin
de991551f5 lib/promscrape: query /api/v1/namespaces/* for the configured namespaces in kubernetes_sd_config
This should fix authorization issues described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/432
2020-04-24 14:42:02 +03:00
Aliaksandr Valialkin
e220f3eeb6 lib/promscrape: move KubernetesSDConfig to lib/promscrape/discovery/kubernetes 2020-04-23 11:34:30 +03:00
Aliaksandr Valialkin
1187494c8f lib/promscrape/discovery/kubernetes: hide role switch logic behind GetLabels function 2020-04-22 22:16:18 +03:00
Aliaksandr Valialkin
81481abaa9 lib/promscrape/discovery/kubernetes: reuse a client for empty api_server inside different jobs 2020-04-20 17:07:37 +03:00
Aliaksandr Valialkin
6764efde39 lib/promscrape/discovery/kubernetes: update stale comments 2020-04-17 14:06:26 +03:00
Aliaksandr Valialkin
391fb0903e lib/promscrape/discovery/kubernetes: remove only unused client for API server during cleaning 2020-04-14 14:19:26 +03:00
Aliaksandr Valialkin
7c4fb038e3 lib/promscrape: add initial support for kubernetes_sd_config
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/334
2020-04-13 21:03:53 +03:00