Commit graph

297 commits

Author SHA1 Message Date
Aliaksandr Valialkin
06f6de6d47
all: use os.{Read|Write}File instead of ioutil.{Read|Write}File
The ioutil.{Read|Write}File is deprecated since Go1.16 -
see https://tip.golang.org/doc/go1.16#ioutil

VictoriaMetrics needs at least Go1.18, so it is safe to remove ioutil usage
from source code.

This is a follow-up for 02ca2342ab
2022-08-21 23:55:20 +03:00
Aliaksandr Valialkin
c27bd63f6c
lib/promscrape: update links to sd_configs from Prometheus site to https://docs.victoriametrics.com/sd_configs.html 2022-08-15 01:40:48 +03:00
Aliaksandr Valialkin
1a00c9ef03
lib/promscrape/discovery/kubernetes: add __meta_kubernetes_pod_container_image label in the same way as Prometheus 2.38 does
See https://github.com/prometheus/prometheus/pull/11034
2022-08-15 01:18:57 +03:00
Aliaksandr Valialkin
2fb63dda83
lib/promscrape/discovery/kubernetes: add __meta_kubernetes_service_port_number label to role: service in the same way as Prometheus 2.38 does
See https://github.com/prometheus/prometheus/pull/11002
2022-08-15 01:07:19 +03:00
Aliaksandr Valialkin
2b58bd9876
lib/promscrape/discovery/dns: add support for resolving MX records
See https://github.com/prometheus/prometheus/pull/10099
2022-08-15 00:33:06 +03:00
Aliaksandr Valialkin
77bd4e37cc
lib/promscrape/discovery/kubernetes: add missing __meta_kubernetes_ingress_class_name label for role: ingress
See 7e65ad3e43
and 7e1111ff14
2022-08-06 22:39:14 +03:00
Aliaksandr Valialkin
ecbe1ddf1b
lib/promscrape/discovery/ec2: properly handle custom endpoint option in ec2_sd_configs
This option was ignored since d289ecded1

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1287
2022-08-05 18:52:37 +03:00
Aliaksandr Valialkin
80ecfcf759
lib/promscrape/discovery/dockerswarm: properly set __meta_dockerswarm_container_label_* labels instead of __meta_dockerswarm_task_label_* labels
See https://github.com/prometheus/prometheus/issues/9187
2022-08-05 16:20:29 +03:00
Aliaksandr Valialkin
85b04732ed
lib/promscrape/discovery/consul: allow stale responses from Consul service discovery by default
This aligns with Prometheus behaviour.

See `allow_stale` option description at https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config
2022-08-05 15:04:05 +03:00
Aliaksandr Valialkin
17290a4598
lib/promscrape/discovery/yandexcloud: further code cleanup after 83a4abda3f 2022-08-05 10:31:19 +03:00
Aliaksandr Valialkin
8ddad31eef
lib/promscrape/discovery/yandexcloud: follow-up after 6e5ac32fba
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1386
2022-08-04 22:28:21 +03:00
Igor Tiunov
0ba86fe87e
YC service discovery (#2923)
* YC service discovery

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1386

* Fixed linter suggestions

* fixed golint errors
2022-08-04 22:28:20 +03:00
Nikolay
c007b129cb
lib/promscrape: adds azure service discovery (#2743)
* lib/promscrape: adds azure service discovery
Adds azure service discovery mechanism
implements authorization with oauth and msi
lists virtual machines and virtual machines managed by scaleSet

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1364

* makes linter happy

* Apply suggestions from code review

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

* wip

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-07-13 23:45:43 +03:00
Aliaksandr Valialkin
4d03ac90fc
lib/promscrape/discovery/kubernetes: properly populate service-level labels for role: endpointslice targets
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2823
2022-07-07 00:36:25 +03:00
Aliaksandr Valialkin
c4cc45d7f8
lib/promscrape/discovery/kubernetes: allow attaching node-level labels to role: endpoints and role: endpointlice targets in the same way as Prometheus does
See https://github.com/prometheus/prometheus/pull/10759
2022-07-07 00:36:24 +03:00
Aliaksandr Valialkin
ecc11dc32d
lib/promauth: refactor NewConfig in order to improve maintainability
1. Split NewConfig into smaller functions
2. Introduce Options struct for simplifying construction of the Config with various options

This commit is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2684
2022-07-04 14:31:43 +03:00
Aliaksandr Valialkin
3ae6300497
lib/promauth: add ability to send additional http headers in requests to scrape targets
This solves https://stackoverflow.com/questions/66032498/prometheus-scrape-metric-with-custom-header
2022-06-22 20:40:50 +03:00
Aliaksandr Valialkin
b6e3c12811
lib/promscrape/discovery/kubernetes: use unsupportedFieldError() function instead of errContext string
This improves code readability and maintainability a bit, since the format string
is passed as string literal into fmt.Errorf.
2022-06-07 01:24:14 +03:00
Aliaksandr Valialkin
3dbb19d624
lib/promscrape/discovery/kubernetes: follow-up after 006b8c7534
- make more clear error logs
- simplify testing for newKubeConfig by passing only the path to kube_config file instead of SDConfig struct
2022-06-06 14:41:28 +03:00
Nikolay
72e43ef2fe
lib/promscrape/discovery/kubernetes: follow-up after 0b5c874911 (#2672) 2022-06-04 01:11:23 +03:00
hadesy
28d4624f60
promscrape/discovery: support kubeconfig (#2533) 2022-06-04 01:11:23 +03:00
Roman Khavronenko
7406665fc3
lib/promscrape/discovery/kubernetes: fixes kubernetes service discovery (#2615)
* lib/promscrape/discovery/kubernetes: properly updates discovered scrape works
previously, added or updated scrapeworks may override previuosly
discovered.
it happens because swosByKey may contain small subset of kubernetes
objects with it's labels.
It happens for objectsUpdated and objectsAdded maps, which include only changed elements

* Properly calculate vm_promscrape_discovery_kubernetes_scrape_works

Co-authored-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-05-21 01:17:21 +03:00
Boris Petersen
3a8b4fab97
Add ability to sign requests for all AWS services (#2604)
This adds the ability to utilize sigv4 signing for all AWS services not
just "aps". When the newly introduced property "service" is not set it
will default to "aps".

Signed-off-by: Boris Petersen <boris.petersen@idealo.de>
2022-05-20 14:20:00 +03:00
Aliaksandr Valialkin
0d0561ca8c
lib/awsapi: remove whitelist arg from GetFiltersQueryString(), since it may break new filters in the future
Let users decide which filters to use. If users start using disallowed filters, then AWS will return an error.
2022-05-09 15:34:56 +03:00
Aliaksandr Valialkin
810dd74fb9
lib/promscrape: properly implement ScrapeConfig.clone()
Previously ScrapeConfig.clone() was improperly copying promauth.Secret fields -
their contents was replaced with `<secret>` value.

This led to inability to use passwords and secrets in `-promscrape.config` file.
The bug has been introduced in v1.77.0 in the commit 67b10896d2

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2551
2022-05-07 00:06:19 +03:00
Aliaksandr Valialkin
9d40bb7137
lib/promscrape/discovery/ec2: add ability to filter Availability Zones in ec2_sd_config via az_filters section 2022-05-06 12:44:01 +03:00
Aliaksandr Valialkin
2ce1d09135
lib/promscrape/discovery/ec2: properly pass filters to DescribeAvailabilityZones API call
Previously filters wheren't passed to this call after the commit 0e09fdb8b0

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1626
2022-05-05 11:01:17 +03:00
Aliaksandr Valialkin
873f55bac5
lib/awsapi: pass filtersQueryString arg to GetEC2APIResponse() function, so the caller could decide whether to use the filters during the AWS API query
The filters shouldn't be passed to DescribeAvailabilityZones API call.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1626
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1287

Related commits:
0e09fdb8b0
d289ecded1
2022-05-05 10:29:47 +03:00
Nikolay
7e58cba6cf
{lib/promscrape,app/vmagent}: adds sigv4 support for vmagent remoteWrite (#2458)
* {lib/promscrape,app/vmagent}: adds sigv4 support for vmagent remoteWrite
moves aws related code into separate lib from lib/promscrape
it allows to write data from vmagent to the AWS managed prometheus (cortex)

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1287

* Apply suggestions from code review

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-05-04 20:28:37 +03:00
Aliaksandr Valialkin
aa82987d70
lib/promscrape/discovery/kubernetes: do not drop pod meta-labels even if the corresponding node objects are missing
This reflects the logic used in Prometheus.

See https://github.com/prometheus/prometheus/pull/10080
2022-04-26 15:27:42 +03:00
Aliaksandr Valialkin
c2b13e6a04
lib/promscrape/discovery/kubernetes: limit the minimum sleep time between updating dependent ScrapeWork objects
Previously the sleep time could be dropped to nanoseconds, which could result in CPU time waste
2022-04-22 23:15:34 +03:00
Aliaksandr Valialkin
a89e31b304
lib/promscrape/discovery/kubernetes: allow attaching node-level labels and annotations to discovered pod targets in the same way as Prometheus 2.35 does
See https://github.com/prometheus/prometheus/issues/9510
and https://github.com/prometheus/prometheus/pull/10080
2022-04-22 20:15:34 +03:00
Aliaksandr Valialkin
cc6eae6992
lib/promscrape/discovery/kubernetes: improve the performance of urlWatcher.reloadObjects() on multi-CPU systems
Parallelize the generation of ScrapeWork objects there. Previously they were generated in a single goroutine.
2022-04-22 13:23:39 +03:00
Aliaksandr Valialkin
fea9d1e6ee
lib/promscrape/discovery/kubernetes: properly update endpoints and endpointslice objects when the related pod or service objects are updated
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240

This is a follow-up for 2341bd48d7
2022-04-21 13:06:49 +03:00
Aliaksandr Valialkin
e9f08b1e6a
lib/promscrape/discovery/kubernetes: do not pre-allocate memory for ScrapeWork objects
There is high chance that ScrapeWork objects won't be generated because of relabeling
2022-04-20 16:42:41 +03:00
Aliaksandr Valialkin
909a3ee0e4
lib/promscrape: follow-up after 91e290a8ff 2022-04-20 16:12:26 +03:00
Nikolay
429848a67d
lib/promscrape: reduce latency for k8s GetLabels (#2454)
replaces internStringMap with sync.Map - it greatly reduces lock contention
concurently reload scrape work for api watcher - each object labels added by dedicated CPU

changes can be tested with following script https://gist.github.com/f41gh7/6f8f8d8719786aff1f18a85c23aebf70
2022-04-20 16:12:25 +03:00
Aliaksandr Valialkin
d0bac8e224
all: typo fix: Kuberntes -> Kubernetes 2022-04-20 10:51:41 +03:00
Roman Khavronenko
5a4b16794d
Consul SD - update services on the watcher's start (#2202)
* lib/discovery/consul: update services on the watcher's start

Previously, watcher's start was only initing goroutines for discovery
but not waiting for the first iteration to end. It means first Consul
discovery wasn't returning discovered targets until the next iteration.

The change makes the watcher's start blocking until we get first discovery
iteration done and all registries updated.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: remove workarounds for consul SD

Now when consul SD lib properly updates services
on the first start, we don't need workarounds in vmalert.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/discovery/consul: update after review

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-21 15:33:33 +02:00
Nikolay
c11d0949c8
fixes all_tenants query option usage for openstack service discovery (#2184)
explicit use configuration parametr instead of conditional add
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2182
2022-02-14 13:13:03 +02:00
artifactori
943fc056ca
Show gce sdconfig zone on vmagent:8429/config (#2178)
* vmagent: add test for marshalling gce sdconfig with ZoneYAML

* vmagent: implement MarshalYAML for ZoneYAML on gce sdconfig
2022-02-12 00:48:38 +02:00
Aliaksandr Valialkin
727b29d4a3
lib/promscrape/discovery/kubernetes: add __meta_kubernetes_endpointslice_{label,annotation}* labels to be consistent with other role values for Kubernetes service discovery 2022-02-11 14:56:10 +02:00
Nikolay
265938a385
fixes service discovery for kubernetes (#2173)
* fixes service discovery for kubernetes
now it must take in account all pods that belong to the discovered endpoint and endpointslice
adds simple test for endpoints
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2134

* wip

* docs/CHANGELOG.md: document the change

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-11 13:35:34 +02:00
Aliaksandr Valialkin
dd91759f1f
lib/promscrape/discovery/kubernetes: add __meta_kubernetes_node_provider_id label for discovered Kubernetes nodes in the same way as Prometheus does
See https://github.com/prometheus/prometheus/pull/9603
2022-01-13 23:17:24 +02:00
Aliaksandr Valialkin
bc18368c15
lib/promscrape/discovery/kubernetes: add the ability to limit service discovery to the current namespace
See https://github.com/prometheus/prometheus/issues/9782 and https://github.com/prometheus/prometheus/pull/9881
2022-01-13 22:44:59 +02:00
Aliaksandr Valialkin
de8299f465
lib/promscrape/discovery/dockerswarm: follow up after 68a117a25a
- Document the bugfix at docs/CHANGELOG.md
- Set __address__ field after copying commonLabels to the resulting map of discovered labels.
  This makes sure that the correct __address__ label is used.
2022-01-11 09:22:03 +02:00
Alexander Shtuchkin
45a92e6ce1
Fix for #2038: Make correct __address__ value for dockerswarm promscrape (#2041) 2022-01-11 09:22:02 +02:00
Aliaksandr Valialkin
847004fa77
app/{vminsert,vmagent}: hide passwords and auth tokens by default at /config page
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1764
2021-11-05 14:42:13 +02:00
Aliaksandr Valialkin
ad445a06cd
lib/promscrape: properly show proxy_url option value at /config page
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1755
2021-10-26 21:24:22 +03:00
Aliaksandr Valialkin
ea69eef375
lib/promscrape/discovery/kubernetes: log a warning if role: endpoints discovers more than 1000 targets per a single endpoint
In this case `role: endpointslice` must be used instead.

See the following references:

* https://kubernetes.io/docs/reference/labels-annotations-taints/#endpoints-kubernetes-io-over-capacity
* https://github.com/kubernetes/kubernetes/pull/99975
* https://github.com/prometheus/prometheus/issues/7572#issuecomment-934779398
2021-10-19 13:22:28 +03:00
Aliaksandr Valialkin
3e9beb0f8d
lib/promscrape/discovery/kubernetes: rename endpointslices.go -> endpointslice.go in order to be consistent with EndpointSlice struct name
This is a follow-up for 31b42b30b6
2021-10-15 12:27:31 +03:00
Nikolay
9be5689b3f
changes auth validation for openstack (#1663)
* changes auth validation for openstack
must fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1655

* Apply suggestions from code review

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-09-29 00:33:38 +03:00
Nikolay
1ab2f844a2 makes filters optional for ec2 api requests (#1627)
filters can be applied only for DescribeInstances requests, like prometheus does.
related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1626
2021-09-17 18:12:25 +03:00
Aliaksandr Valialkin
b684624f67 lib/promscrape/discovery/docker: support host networking mode
See https://github.com/prometheus/prometheus/issues/9116
2021-09-13 13:30:55 +03:00
Aliaksandr Valialkin
6ed9f10da5 lib/promscrape/discovery/kubernetes: properly use https scheme for wildcard TLS certificates in ingress target discovery
See https://github.com/prometheus/prometheus/issues/8902
2021-09-13 13:04:43 +03:00
Aliaksandr Valialkin
146c14d879 lib/promscrape/discovery/kubernetes: return back support role: endpointslices, since it is used by VictoriaMetrics operator
This is a follow up commit after 31b42b30b6
2021-08-29 12:37:36 +03:00
Aliaksandr Valialkin
ca61d7c82b lib/promscrape/discovery/kubernetes: rename role: endpointslices to role: endpointslice to be consistent with Prometheus
See 2ec6c7dbb8/discovery/kubernetes/kubernetes.go (L99)
2021-08-29 11:23:59 +03:00
Aliaksandr Valialkin
327034b54f lib/promscrape/discovery/kubernetes: use v1 API instead of v1beta1 API for role: ingress and role: endpointslices
This should fix service discovery for these roles in Kubernetes v1.22 and newer versions.
See https://kubernetes.io/docs/reference/using-api/deprecation-guide/#ingress-v122

The corresponding change in Prometheus - https://github.com/prometheus/prometheus/pull/9205
2021-08-29 11:23:58 +03:00
Aliaksandr Valialkin
4a2d7aec7f lib/promscrape: expose promscrape_discovery_http_errors_total metric for tracking errors per each http_sd config 2021-08-25 13:05:29 +03:00
Aliaksandr Valialkin
77bb9e1656 lib/promscrape/discovery/gce: add __meta_gce_interface_ipv4_<name> labels as in Prometheus 2.29
See https://github.com/prometheus/prometheus/pull/8978
2021-08-03 15:51:45 +03:00
Aliaksandr Valialkin
336a2aa2e0 lib/promscrape/discovery/ec2: add __meta_ec2_availability_zone_id label as Prometheus 2.29 does 2021-08-03 13:28:13 +03:00
Aliaksandr Valialkin
6fc3696260 lib/promscrape/discovery/consul: use case-insensitive comparison for service names
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1424
2021-07-02 14:49:22 +03:00
Aliaksandr Valialkin
2edfea8c36 lib/promscrape/discovery/docker: fix golint warning: struct field Id should be ID 2021-06-29 13:11:33 +03:00
Aliaksandr Valialkin
97d1ccfc8e lib/promscrape: split docker and dockerswarm service discovery code bases, since they have very little in common
This is a follow up after c85a5b7fcb
2021-06-25 13:22:16 +03:00
Aliaksandr Valialkin
4461e20e7d lib/promscrape: consistently sort service discovery routines
This should simplify further maintenance of the code
2021-06-25 13:22:16 +03:00
Lu Jiajing
12b4cbb68f Support Docker ServiceDiscovery (#1402)
* add docker discovery

* add test

* add labels test and add scrape work

* remove TODO

* refactor to merge apiConfig and sdConfig

* apply suggestion
2021-06-25 13:22:16 +03:00
Nikolay
501429c3ff adds missing MustStop call to do and http sd (#1404) 2021-06-25 11:43:32 +03:00
Aliaksandr Valialkin
f10fa0d1d7 lib/promscrape/discovery/consul: properly pass namespace to Consul watcher
Follow-up for 58a2989fe7
2021-06-22 17:43:20 +03:00
Aliaksandr Valialkin
4adf6c9766 lib/promscrape/discovery/http: follow up after e307bbb29a 2021-06-22 13:42:10 +03:00
Nikolay
e03a3d3a36 adds http_sd (#1399)
* adds http_sd

* adds X-Prometheus-Refresh-Interval-Seconds header

* Update lib/promscrape/discovery/http/api.go

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-06-22 13:42:09 +03:00
Aliaksandr Valialkin
3ab3902f17 lib/promscrape/discovery: support generic auth configs in Consul service discovery in the same way as Prometheus 2.28 does 2021-06-22 13:18:51 +03:00
Nikolay
827a2396d2 adds consul enterprise namespace support (#1400)
* adds consul enterprise namespace support

* Update lib/promscrape/discovery/consul/consul.go

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-06-22 12:56:11 +03:00
Nikolay
9ea1dca3dd fixes DO service discovery labels (#1389)
adds test for digitalocean sd
2021-06-17 17:21:10 +03:00
Nikolay
e42da47608 adds digital ocean sd (#1376)
* adds digital ocean sd config

* adds digital ocean sd
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1367

* typo fix
2021-06-14 13:19:29 +03:00
Hason Chan
439c2ed510 fix eureka_sd_configs HTTPClientConfig incorrect parsing (#1350) 2021-06-04 11:56:06 +03:00
Nikolay
2780d6dbcd basic OAuth2 support for remoteWrite and scrape targets (#1316)
* adds OAuth2 support for remoteWrite and scrapping

* adds tests
changes init
2021-05-22 18:02:01 +03:00
Aliaksandr Valialkin
110a888e39 lib/promscrape/discovery/kubernetes: make golangci-lint happy by removing empty branches 2021-05-20 12:00:17 +03:00
Aliaksandr Valialkin
9d97f44772 lib/promscrape/discovery/kubernetes: reload objects on object parse error
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-18 23:27:24 +03:00
Aliaksandr Valialkin
c507faec0b lib/promscrape/discovery/kubernetes: simplify the reload logic for urlWatcher.objectsByKey 2021-05-18 15:41:51 +03:00
Aliaksandr Valialkin
0f54c0121b lib/promscrape/discovery/kubernetes: properly update vm_promscrape_discovery_kubernetes_scrape_works metric
Previously it wasn't descreased during config update.
2021-05-18 15:41:51 +03:00
Aliaksandr Valialkin
9f62d348db lib/promscrape/discovery/kubernetes: log errors and stop service discovery when unexpected updates are received from Kubernetes API server
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-18 15:41:51 +03:00
Aliaksandr Valialkin
8764b0ae21 lib/promscrape/discovery/kubernetes: key ScrapeWork objects by urlWatcher instead of namespace
This makes the code less fragile if urlWatcher would depend on additional to namepsace properties.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-17 23:49:48 +03:00
Aliaksandr Valialkin
e08287f017 lib/promscrape: reload auth tokens from files every second
Previously auth tokens were loaded at startup and couldn't be updated without vmagent restart.
Now there is no need in vmagent restart.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1297
2021-05-14 20:03:35 +03:00
Aliaksandr Valialkin
a6cb4f10a7 app/{vmalert,vmauth}: explicitly set MaxIdleConnsPerHost in net/http.Client.Transport
By default MaxIdleConnsPerHost is set to 2. This limits the possibility to re-use http keep-alive connections.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1300
2021-05-14 18:13:34 +03:00
Aliaksandr Valialkin
027607db3e lib/promscrape/discovery/kubernetes: refresh endpoints and endpointslices scrape targets every 5 seconds, since they may depend on changed service and pod objects
This should make endpoints and endpointslices scrape targets eventually consistent with the maximum delay of 5 seconds after the related service or pod object changes.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-12 14:12:43 +03:00
Aliaksandr Valialkin
e6c19cb09d lib/promscrape/discovery/kubernetes: start watchers for pods and services before starting watchers for endpoints
This should eliminate possible race when an update on endpoints depends on pods and/or services, which are missing in the cache yet.
This could result in missing targets based on endpoints or endpointslices.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-05 12:23:16 +03:00
Aliaksandr Valialkin
421a92983a lib/promscrape/discovery/kubernetes: remove a mutex at urlWatcher - use groupWatcher mutex for accessing all the urlWatcher children
This simplifies the code a bit and reduces the probability of improper mutex handling and deadlocks.
2021-04-29 10:17:45 +03:00
Nikolay
535b3ff618 vmagent kubernetes_sd tests (#1253)
* first part of tests for kubernetes sd

* makes linter happy

* added more test cases

* adds pub/sub for tests
2021-04-29 10:17:45 +03:00
Aliaksandr Valialkin
b3da457629 lib/promscrape/discovery/kubernetes: fix a deadlock introduced in eddba29664
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240

Thanks to @f41gh7 for providing the initial idea for deadlock fix at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1248
2021-04-27 14:59:56 +03:00
Aliaksandr Valialkin
34321e5f8d lib/promscrape/discovery/kubernetes: refresh role: endpoints targets on service object removal as Prometheus does
This is a follow-up for ae37cfd528

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-04-23 20:27:29 +03:00
Aliaksandr Valialkin
db27dbab5e lib/promscrape/discovery/kubernetes: refresh endpoints and endpointslices targets on service object update like Prometheus does
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-04-23 20:12:22 +03:00
Aliaksandr Valialkin
02b83e0957 lib/promscrape/discovery: remove superflouos check in registerPendingAPIWatchers
The check `_, ok := uw.aws[aw]; !ok` isn't needed, since aw cannot exist in uw.aws
because of the check inside subscribeAPIWatcher
2021-04-07 13:10:04 +03:00
Aliaksandr Valialkin
db56ee0e28 lib/promscrape/discovery/kubernetes: register pending apiWatchers in uw.aws 2021-04-06 11:11:53 +03:00
Aliaksandr Valialkin
edd66b7e82 lib/promscrape/discovery/kubernetes: remove superflouos mustStart and mustStop functions 2021-04-05 22:43:49 +03:00
Lu Jiajing
4ee6def68b fix access to nil *url.URL (#1180)
* fix access to nil *url.URL

Signed-off-by: Megrez Lu <lujiajing1126@gmail.com>

* Update lib/promscrape/discovery/kubernetes/api_watcher.go

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-04-05 22:26:43 +03:00
Aliaksandr Valialkin
7eca60694e lib/promscrape/discovery/kubernetes: reduce CPU time spent on registering big number of Kubernetes objects shared among big number of scrape jobs
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182
2021-04-05 22:05:02 +03:00
Aliaksandr Valialkin
9da2ef3d8f lib/promscrape/discovery/kubernetes: load objects missing in local cache from api seriver in getObjectByRole()
This should fix possible race for `role: endpoints` and `role: endpointslices` service discovery,
when the referred `pod` and `service` objects aren't propagated to urlWatcher cache yet.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182#issuecomment-813353359 for details.
2021-04-05 20:31:22 +03:00
Aliaksandr Valialkin
fe084fdd33 lib/promscrape/discovery/kubernetes: synchronously load Kubernetes objects on first access
Remove async registration of apiWatchers, since it breaks discovering `role: endpoints` and `role: endpointslices` targets,
which depend on pod and service objects.

There is no need in reloading `endpoints` and `endpointslices` targets if the referenced `pod` or `service` objects change,
since in this case the corresponding `endpoints` and `endpointslices` objects should also change because they contain
ResourceVersion of the referenced `pod` or `service` objects, which is modified on object update.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182
2021-04-05 14:37:07 +03:00
Aliaksandr Valialkin
ab9e1eb41f lib/promscrape: support for simple HTTP proxies without CONNECT method support such as https://github.com/prometheus-community/PushProx
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1179
2021-04-04 00:40:58 +03:00
Aliaksandr Valialkin
87700f1259 lib/promscrape: add support for authorization config in -promscrape.config as Prometheus 2.26 does
See https://github.com/prometheus/prometheus/pull/8512
2021-04-02 21:20:37 +03:00
Aliaksandr Valialkin
245eba8896 lib/promscrape/discovery/kubernetes: properly track objects with the same names in multiple namespaces
This is a follow-up for 12e4785fe8

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
2021-04-02 14:46:34 +03:00
Aliaksandr Valialkin
eee860f83d lib/promscrape/discovery/kubernetes: properly discover targets in multiple namespaces
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
2021-04-02 14:29:24 +03:00
Nikolay
00157f3ec6 Adds aws ECS credentials support (#1175) 2021-04-02 13:11:03 +03:00
Aliaksandr Valialkin
7d87d42a91 lib/promscrape/discovery/kubernetes: typo fix in error message 2021-03-26 12:46:33 +02:00
Aliaksandr Valialkin
a920e71809 lib/promscrape/discovery/kubernetes: properly handle too old resource version error message from Kubernetes watch API 2021-03-26 12:28:35 +02:00
Aliaksandr Valialkin
6b1f807418 app/vmagent: add -promscrape.consul.waitTime command-line flag for configuring Consul service discovery wait time
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1144
2021-03-23 19:34:12 +02:00
Nikolay
19a40faf8e changes consul_service label value (#1143)
according to prometheus discovery.
 It should mitigate issue with case sensetive services
https://github.com/hashicorp/consul/issues/5707
2021-03-23 15:37:06 +02:00
Aliaksandr Valialkin
e2717d84c0 all: various fixes in command-line flag descriptions 2021-03-15 22:03:49 +02:00
Aliaksandr Valialkin
894246176f lib/promscrape/discovery/kubernetes: do not start object watcher until initial objects are loaded 2021-03-14 21:56:16 +02:00
Aliaksandr Valialkin
b0b28eeb93 lib/promscrape/discovery/kubernetes: further optimize kubernetes service discovery for the case with many scrape jobs
Do not re-calculate labels per each scrape job - reuse them instead for scrape jobs with identical Kubernetes role
2021-03-14 21:16:41 +02:00
Aliaksandr Valialkin
620f05cd2c lib/promscrape/discovery: fixes after 133b288681
- Removed a deadlock in addAPIWatcher
- Do not create unused ScrapeWork objects
- Do not spend CPU resources on creating objectByKey map in addAPIWatcher

This work is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1125
2021-03-13 15:22:38 +02:00
Aliaksandr Valialkin
8fc29ffc67 lib/promscrape/discovery/kubernetes: use a single watcher per apiURL
Previously multiple scrape jobs could create multiple watchers for the same apiURL. Now only a single watcher is used.
This should reduce load on Kubernetes API server when many scrape job configs use Kubernetes service discovery.
2021-03-11 17:04:14 +02:00
Aliaksandr Valialkin
41f641b132 lib/promscrape/discovery/kubernetes: localize Bookmark parsing code
This is a follow-up for e772d1c920
2021-03-11 13:08:56 +02:00
Aliaksandr Valialkin
6c9cd3f7c1 lib/promscrape/discovery/kubernetes: reduce load on Kubernetes API server by using watch bookmarks
This allows continuing object watch from the last bookbark instead of reloading all the objects
on watch errors or timeouts.

See https://kubernetes.io/docs/reference/using-api/api-concepts/#watch-bookmarks
2021-03-10 15:08:40 +02:00
Aliaksandr Valialkin
7b66c8cbf8 lib/promscrape/discovery/kubernetes: remove too verbose logs about starting and stopping the watchers
Log the number of objects loaded per each watch url

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
2021-03-09 15:07:12 +02:00
Aliaksandr Valialkin
c04505e585 lib/promscrape/discovery/kubernetes: reduce memory usage further when big number of scrape jobs are configured for the same kubernetes_sd_config role
Serialize reloading per-role objects, so they don't occupy too much memory when objects for many scrape jobs are simultaneously refreshed.
Do not reload per-role objects if they were already refreshed by concurrent goroutines. This should reduce load on Kubernetes API server
when big number of scrape jobs are configured for the same Kubernetes role.

This is a follow-up for 17b87725ed

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
2021-03-07 20:03:22 +02:00
Aliaksandr Valialkin
5807ff57f3 lib/promscrape/discovery/kubernetes: reduce memory usage when Kubernetes service discovery is configured on a big number of scrape jobs
Previously vmagent was creating a separate Kubernetes object cache per each scrape job.
This could result in increased memory usage when monitoring a Kubernetes cluster with big number of objects (pods / nodes / services, etc.)
as seen at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113

Now it uses a shared map of scrape objects across multiple scrape jobs.
2021-03-05 17:32:33 +02:00
Aliaksandr Valialkin
92ddb8f197 lib/promscrape/discovery/kubernetes: move apiWatcher code to a separate file 2021-03-05 17:32:32 +02:00
Aliaksandr Valialkin
bae7a1b47a lib/promscrape/discovery/kubernetes: fix tests after e154f4a644 2021-03-03 22:42:04 +02:00
Nikolay
7d92ef3acd Fix ingress discovery api (#1110) 2021-03-03 10:45:50 +02:00
Aliaksandr Valialkin
25f453ce1a lib/promscrape/discovery/kubernetes: properly check for nil pointer inside interface
See https://mangatmodi.medium.com/go-check-nil-interface-the-right-way-d142776edef1

This fixes a panic when the ScrapeWork is filtered out in swcFunc.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1108
2021-03-03 10:42:54 +02:00
Aliaksandr Valialkin
ac5c47a9f5 lib/promscrape/discovery: properly track vm_promscrape_discovery_kubernetes_objects_removed_total metric 2021-03-02 18:33:29 +02:00
Aliaksandr Valialkin
f9c1fe3852 lib/promscrape/discovery/kubernetes: cache ScrapeWork objects as soon as the corresponding k8s objects are changed
This should reduce CPU usage and memory usage when Kubernetes contains tens of thousands of objects
2021-03-02 16:44:19 +02:00
Aliaksandr Valialkin
f686174329 lib/promscrape/discovery/ec2: follow-up after f6114345de 2021-03-02 13:47:35 +02:00
Nikolay
fd8ca7df50 Adds webIndentity token for aws (#1099)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1080
2021-03-02 13:47:34 +02:00
Aliaksandr Valialkin
b89a4fac2f lib/promscrape/discovery/kubernetes: deflake tests; a follow-up for 05fb08713c 2021-03-01 14:31:44 +02:00
Aliaksandr Valialkin
c3bf72992f lib/promscrape: explicitly stop and cleanup service discovery routines when new config is read from -promscrape.config
This should reduce memory usage when `-promscrape.config` file frequently changes
2021-03-01 14:15:16 +02:00
Aliaksandr Valialkin
3e44d9947e lib/promscrape/discovery/kubernetes: properly account the number of objects when watcher is stopped
A follow-up for b21b110b7a
2021-02-28 17:06:49 +02:00
Aliaksandr Valialkin
0ef7a94056 lib/promscrape/discovery/kubernetes: add vm_promscrape_discovery_kubernetes_* metrics for monitoring internal state of k8s service discovery 2021-02-28 16:58:45 +02:00
Aliaksandr Valialkin
f52bdbe2a3 lib/promscrape/discovery/kubernetes: remove resourceVersionMatch=NotOlderThan query arg when watching for k8s object changes, since it cannot be used when watch=1 query arg is passed 2021-02-28 16:08:43 +02:00
Aliaksandr Valialkin
5c9e657808 lib/promscrape/discovery/kubernetes: fix deadlock in startWatcherForURL
reloadObjects must be called without holding aw.mu lock
2021-02-28 15:25:33 +02:00
Aliaksandr Valialkin
e77f2f8630 lib/promscrape/discovery/kubernetes: typo fix after 241ffd1f3b 2021-02-28 15:15:27 +02:00
Aliaksandr Valialkin
82441537ff lib/promscrape/discovery/kubernetes: pre-populate labelsByKey in reloadObject() 2021-02-28 15:09:43 +02:00
Aliaksandr Valialkin
e003453941 lib/promscrape/discovery/kubernetes: compare sorted sets of labels in tests
This should deflake tests where the order of labels isn't stable
2021-02-28 14:12:32 +02:00
Aliaksandr Valialkin
6a21ef87b7 lib/promscrape: add missing startWatchersForRole() call at the beginning of apiWatcher.getLabelsForRole 2021-02-28 14:00:00 +02:00
Aliaksandr Valialkin
6d0e7fb8b0 lib/promscrape/discovery/kubernetes: reload k8s resources on every error
This is needed for obtaining fresh resourceVersion
2021-02-27 01:46:59 +02:00
Aliaksandr Valialkin
fa3ce450fb lib/promscrape: cache ScrapeWork
This should reduce the time needed for updating big number of scrape targets.
2021-02-26 21:43:41 +02:00
Aliaksandr Valialkin
efcdf613c2 lib/promscrape/discovery/kubernetes: cache target labels
This should reduce CPU usage on repeated SDConfig.GetLabels() calls.
2021-02-26 20:24:29 +02:00
Aliaksandr Valialkin
22822feea3 lib/promscrape/discovery/kubernetes: errcheck fix 2021-02-26 19:09:12 +02:00
Aliaksandr Valialkin
dc8c045378 lib/promscrape: cleanup after 9b2246c29b
Main points:

* Revert changes outside lib/promscrape/discovery/kuberntes . These changes can be applied later in a separate commit
* Minimize changes in lib/promscrape/discovery/kubernetes compared to a93e644001
* Corner case fixes.
2021-02-26 19:09:12 +02:00
Nikolay
cf9262b01f vmagent kubernetes watch stream discovery. (#1082)
* started work on sd for k8s

* continue work on watch sd

* fixes

* continue work

* continue work on sd k8s

* disable gzip

* fixes typos

* log errror

* minor fix

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-02-26 19:09:12 +02:00
Aliaksandr Valialkin
4cfac70cde lib/promscrape: remove duplicate code a bit 2021-02-26 15:54:07 +02:00
Aliaksandr Valialkin
fccb481de2 lib/promscrape/discovery/kubernetes: add __meta_kubernetes_endpoints_label_* and __meta_kuberntes_endpoints_annotation_* labels to role: endpoints
This syncs kubernetes SD with Prometheus 2.25
See 617c56f55a
2021-02-15 02:51:36 +02:00
Nikolay
6fc2a8e544 fixes dockerswarm (#1053)
fixes improper usage of host network services
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1028
2021-02-04 15:57:41 +02:00
Nikolay
dd2f815204 fixes kubernetes_sd (#983)
* fixes kubernetes_sd,
adds missing service metadata for pod ports without endpoint
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/982

* fix test
2020-12-24 11:34:12 +02:00
Aliaksandr Valialkin
367fc17933 lib/promscrape: code prettifying for 8dd03ecf19 2020-12-24 10:57:20 +02:00
Nikolay
b00f7816e2 adds proxy_url support, (#980)
* adds proxy_url support,
adds proxy_url to the dockerswarm, eureka, kubernetes and consul service discovery,
adds proxy_url to the scrape_config for targets scrapping,
http based proxy is supported atm,
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/503

* fixes imports
2020-12-24 10:57:19 +02:00
Aliaksandr Valialkin
c80d38f00c lib/promscrape/discovery/consul: reduce load on Consul API server by increasing timeout for blocking requests from 50 seconds to 9 minutes
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574
2020-12-11 17:26:34 +02:00
Aliaksandr Valialkin
a84467958a lib/promscrape/discovery/consul: properly pass Datacenter filter to Consul API server
Previously it has been passed as `sdc` query arg, while it should be passed as `dc` query arg.
See https://www.consul.io/api-docs/health#list-nodes-for-service for details.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574#issuecomment-740454170
2020-12-08 21:53:23 +02:00
Aliaksandr Valialkin
96190f9d45 lib/promscrape/discovery/consul: log the time needed for stoppig Consul service watcher 2020-12-03 20:14:48 +02:00