Aliaksandr Valialkin
cfd6aa28e1
lib/promscrape/discovery/kubernetes: refresh endpoints and endpointslices scrape targets every 5 seconds, since they may depend on changed service and pod objects
...
This should make endpoints and endpointslices scrape targets eventually consistent with the maximum delay of 5 seconds after the related service or pod object changes.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-12 14:10:34 +03:00
Aliaksandr Valialkin
904bbffc7f
lib/promscrape/discovery/kubernetes: start watchers for pods and services before starting watchers for endpoints
...
This should eliminate possible race when an update on endpoints depends on pods and/or services, which are missing in the cache yet.
This could result in missing targets based on endpoints or endpointslices.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-05 12:23:50 +03:00
Aliaksandr Valialkin
2ab1266593
lib/promscrape/discovery/kubernetes: remove a mutex at urlWatcher - use groupWatcher mutex for accessing all the urlWatcher children
...
This simplifies the code a bit and reduces the probability of improper mutex handling and deadlocks.
2021-04-29 10:14:26 +03:00
Nikolay
4e5a88114a
vmagent kubernetes_sd tests ( #1253 )
...
* first part of tests for kubernetes sd
* makes linter happy
* added more test cases
* adds pub/sub for tests
2021-04-29 10:10:24 +03:00
Aliaksandr Valialkin
2ec7d8b384
lib/promscrape/discovery/kubernetes: fix a deadlock introduced in eddba29664
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
Thanks to @f41gh7 for providing the initial idea for deadlock fix at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1248
2021-04-27 14:57:51 +03:00
Aliaksandr Valialkin
908e35affd
lib/promscrape: apply scrape_timeout
on receiving the first response byte for stream_parse: true
scrape targets
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1017#issuecomment-767235047
2021-04-23 21:53:35 +03:00
Aliaksandr Valialkin
eddba29664
lib/promscrape/discovery/kubernetes: refresh role: endpoints
targets on service object removal as Prometheus does
...
This is a follow-up for ae37cfd528
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-04-23 20:26:57 +03:00
Aliaksandr Valialkin
ae37cfd528
lib/promscrape/discovery/kubernetes: refresh endpoints and endpointslices targets on service object update like Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-04-23 20:11:40 +03:00
Aliaksandr Valialkin
6bc52fe41a
all: rename https://victoriametrics.github.io to https://docs.victoriametrics.com
2021-04-20 20:16:17 +03:00
Aliaksandr Valialkin
3f0bcbe067
lib/promscrape: create a single swosFunc
per scrape_config
2021-04-08 09:31:48 +03:00
Aliaksandr Valialkin
5a0938d807
lib/promscrape: do not spend CPU time on constructing scrapeWork key if clustering is disabled
2021-04-07 21:54:22 +03:00
Aliaksandr Valialkin
59f9960992
lib/promscrape/discovery: remove superflouos check in registerPendingAPIWatchers
...
The check `_, ok := uw.aws[aw]; !ok` isn't needed, since aw cannot exist in uw.aws
because of the check inside subscribeAPIWatcher
2021-04-07 13:07:39 +03:00
Aliaksandr Valialkin
3ec6639bbb
lib/promscrape/discovery/kubernetes: register pending apiWatchers in uw.aws
2021-04-06 11:12:13 +03:00
Aliaksandr Valialkin
5f593b0ed3
lib/promscrape/discovery/kubernetes: remove superflouos mustStart and mustStop functions
2021-04-05 22:44:12 +03:00
Lu Jiajing
b59164cf33
fix access to nil *url.URL ( #1180 )
...
* fix access to nil *url.URL
Signed-off-by: Megrez Lu <lujiajing1126@gmail.com>
* Update lib/promscrape/discovery/kubernetes/api_watcher.go
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-04-05 22:25:31 +03:00
Aliaksandr Valialkin
b46194472f
lib/promscrape/discovery/kubernetes: reduce CPU time spent on registering big number of Kubernetes objects shared among big number of scrape jobs
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182
2021-04-05 22:04:30 +03:00
Aliaksandr Valialkin
a51d0ec6ec
lib/promscrape/discovery/kubernetes: load objects missing in local cache from api seriver in getObjectByRole()
...
This should fix possible race for `role: endpoints` and `role: endpointslices` service discovery,
when the referred `pod` and `service` objects aren't propagated to urlWatcher cache yet.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182#issuecomment-813353359 for details.
2021-04-05 20:31:17 +03:00
Aliaksandr Valialkin
f010d773d6
lib/promscrape/discovery/kubernetes: synchronously load Kubernetes objects on first access
...
Remove async registration of apiWatchers, since it breaks discovering `role: endpoints` and `role: endpointslices` targets,
which depend on pod and service objects.
There is no need in reloading `endpoints` and `endpointslices` targets if the referenced `pod` or `service` objects change,
since in this case the corresponding `endpoints` and `endpointslices` objects should also change because they contain
ResourceVersion of the referenced `pod` or `service` objects, which is modified on object update.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182
2021-04-05 14:20:12 +03:00
Aliaksandr Valialkin
6742839fd6
lib/promscrape: pass X-Prometheus-Scrape-Timeout-Seconds
header to scrape targets as Prometheus does
2021-04-05 12:15:24 +03:00
Aliaksandr Valialkin
500e625e8c
lib/promscrape: properly send full url in GET
request via simple HTTP proxy
...
This is a follow-up for a0ae0f86666a75ec57b45eab2429da7ab4a7b250
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1179
2021-04-04 01:20:06 +03:00
Aliaksandr Valialkin
5153410ced
lib/promscrape: support for simple HTTP proxies without CONNECT
method support such as https://github.com/prometheus-community/PushProx
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1179
2021-04-04 00:40:40 +03:00
Aliaksandr Valialkin
4c56b1a6dd
lib/promscrape: add tests for authorization
config, which has been added in df148f48b7
2021-04-03 22:13:22 +03:00
Aliaksandr Valialkin
df148f48b7
lib/promscrape: add support for authorization
config in -promscrape.config
as Prometheus 2.26 does
...
See https://github.com/prometheus/prometheus/pull/8512
2021-04-02 21:17:45 +03:00
Aliaksandr Valialkin
7f9c68cdcb
lib/promscrape: add follow_redirect
option to scrape_configs
section like Prometheus does
...
See https://github.com/prometheus/prometheus/pull/8546
2021-04-02 19:56:40 +03:00
Aliaksandr Valialkin
5b08e6fb16
lib/promscrape/discovery/kubernetes: properly track objects with the same names in multiple namespaces
...
This is a follow-up for 12e4785fe8
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
2021-04-02 14:45:32 +03:00
Aliaksandr Valialkin
12e4785fe8
lib/promscrape/discovery/kubernetes: properly discover targets in multiple namespaces
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
2021-04-02 14:28:30 +03:00
Nikolay
fdb8995642
Adds aws ECS credentials support ( #1175 )
2021-04-02 11:56:40 +03:00
Aliaksandr Valialkin
f39c84b21f
lib/promscrape/discovery/kubernetes: typo fix in error message
2021-03-26 12:46:14 +02:00
Aliaksandr Valialkin
9761ffd161
lib/promscrape/discovery/kubernetes: properly handle too old resource version
error message from Kubernetes watch API
2021-03-26 12:28:10 +02:00
Aliaksandr Valialkin
d4aadba9fa
app/vmagent: add -promscrape.consul.waitTime
command-line flag for configuring Consul service discovery wait time
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1144
2021-03-23 19:33:25 +02:00
Nikolay
29f9ef9b7f
changes consul_service label value ( #1143 )
...
according to prometheus discovery.
It should mitigate issue with case sensetive services
https://github.com/hashicorp/consul/issues/5707
2021-03-23 15:35:01 +02:00
Aliaksandr Valialkin
828669e4e1
all: make golint
happy
2021-03-17 00:49:28 +02:00
Aliaksandr Valialkin
f104f3eb2a
all: make golangci-lint happy after the commit 6378205415
2021-03-17 00:24:40 +02:00
Aliaksandr Valialkin
6378205415
lib/netutil: enable IPv6 UDP listening if -enableTCP6
command-line flag is passed to VictoriaMetrics
...
This is a follow-up for 18cfc4be7b
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1131
2021-03-17 00:16:17 +02:00
Aliaksandr Valialkin
85a95bf60c
all: various fixes in command-line flag descriptions
2021-03-15 21:59:25 +02:00
Aliaksandr Valialkin
c14dafce43
lib/promscrape: an attempt to reduce memory usage when vmagent scrapes targets with varying number of metrics
...
Do not cache too big byte buffers and too big writeRequestCtx objects,
since it is cheaper to re-create them instead of wasting RAM for their caching.
This reverts 7f6f350ee1
2021-03-15 11:45:39 +02:00
Aliaksandr Valialkin
7f6f350ee1
lib/promscrape: return back the logic for flushing big buffers to storage from the commit 3fd8653b40
...
This should reduce memory usage when vmagent scrapes targets with big number of metrics and `-promscrape.streamParse` isn't enabled
2021-03-14 22:26:00 +02:00
Aliaksandr Valialkin
b88806ecbf
lib/promscrape/discovery/kubernetes: do not start object watcher until initial objects are loaded
2021-03-14 21:55:00 +02:00
Aliaksandr Valialkin
83edbb7cab
lib/promscrape: retry service discovery in a few seconds if it starts returning 0 targets
...
This should reduce recovery time from temporary issues during service discovery
2021-03-14 21:53:23 +02:00
Aliaksandr Valialkin
bf15d6a6a2
lib/promscrape: remove duplicate target
word in error message
2021-03-14 21:52:02 +02:00
Aliaksandr Valialkin
d409898515
lib/promscrape/discovery/kubernetes: further optimize kubernetes service discovery for the case with many scrape jobs
...
Do not re-calculate labels per each scrape job - reuse them instead for scrape jobs with identical Kubernetes role
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
2021-03-14 21:14:53 +02:00
Aliaksandr Valialkin
7a16e8e3a2
lib/promscrape/discovery: fixes after 133b288681
...
- Removed a deadlock in addAPIWatcher
- Do not create unused ScrapeWork objects
- Do not spend CPU resources on creating objectByKey map in addAPIWatcher
This work is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1125
2021-03-13 15:18:51 +02:00
Aliaksandr Valialkin
def014eb75
lib/promscrape/discovery/kubernetes: remove debug lines left after the commit 133b288681
2021-03-12 11:22:33 +02:00
Aliaksandr Valialkin
a6a71ef861
lib/promscrape: add ability to configure proxy options via proxy_tls_config
, proxy_basic_auth
, proxy_bearer_token
and proxy_bearer_token_file
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1116
2021-03-12 03:36:19 +02:00
Aliaksandr Valialkin
133b288681
lib/promscrape/discovery/kubernetes: use a single watcher per apiURL
...
Previously multiple scrape jobs could create multiple watchers for the same apiURL. Now only a single watcher is used.
This should reduce load on Kubernetes API server when many scrape job configs use Kubernetes service discovery.
2021-03-11 16:43:04 +02:00
Aliaksandr Valialkin
bebcb8130c
lib/promscrape/discovery/kubernetes: localize Bookmark parsing code
...
This is a follow-up for e772d1c920
2021-03-11 13:08:08 +02:00
Aliaksandr Valialkin
e772d1c920
lib/promscrape/discovery/kubernetes: reduce load on Kubernetes API server by using watch bookmarks
...
This allows continuing object watch from the last bookbark instead of reloading all the objects
on watch errors or timeouts.
See https://kubernetes.io/docs/reference/using-api/api-concepts/#watch-bookmarks
2021-03-10 15:06:35 +02:00
Aliaksandr Valialkin
36fd007247
lib/proxy: set missing ServerName in TLS config for proxy_url
.
...
While at it, allow setting Proxy-Authorization for `proxy_url` via `basic_auth` and `bearer_token` configs.
2021-03-09 18:58:18 +02:00
Nikolay
ad34f42467
Changes tlsConfig init for proxy connections ( #1121 )
...
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1116
2021-03-09 18:51:00 +02:00
Aliaksandr Valialkin
3fd8653b40
lib/promscrape: apply sample_limit
after metric relabeling is applied as Prometheus does
...
See the description for `sample_limit` option from Prometheus docs:
Per-scrape limit on number of scraped samples that will be accepted.
If more than this number of samples are present after metric relabeling
the entire scrape will be treated as failed. 0 means no limit.
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config
2021-03-09 15:47:18 +02:00