Commit graph

207 commits

Author SHA1 Message Date
Aliaksandr Valialkin
530e9904af lib/promscrape: reduce CPU usage an memory allocations when constructing scrapeWorkKey 2021-02-28 22:29:58 +02:00
Aliaksandr Valialkin
e5ca8ac0db lib/promscrape: add ability to spread scrape targets among multiple vmagent instances
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1084
2021-02-28 18:41:08 +02:00
Aliaksandr Valialkin
e02d1ef93c lib/promscrape/discovery/kubernetes: properly account the number of objects when watcher is stopped
A follow-up for b21b110b7a
2021-02-28 17:06:02 +02:00
Aliaksandr Valialkin
b21b110b7a lib/promscrape/discovery/kubernetes: add vm_promscrape_discovery_kubernetes_* metrics for monitoring internal state of k8s service discovery 2021-02-28 16:57:40 +02:00
Aliaksandr Valialkin
c459600346 lib/promscrape/discovery/kubernetes: remove resourceVersionMatch=NotOlderThan query arg when watching for k8s object changes, since it cannot be used when watch=1 query arg is passed 2021-02-28 16:07:14 +02:00
Aliaksandr Valialkin
59a31171e3 lib/promscrape: fix possible deadlock in parallel execution of target relabeling 2021-02-28 16:05:13 +02:00
Aliaksandr Valialkin
68a0f5ce12 lib/promscrape/discovery/kubernetes: fix deadlock in startWatcherForURL
reloadObjects must be called without holding aw.mu lock
2021-02-28 15:26:30 +02:00
Aliaksandr Valialkin
b523e0369c lib/promscrape/discovery/kubernetes: typo fix after 241ffd1f3b 2021-02-28 15:12:17 +02:00
Aliaksandr Valialkin
241ffd1f3b lib/promscrape/discovery/kubernetes: pre-populate labelsByKey in reloadObject() 2021-02-28 15:09:49 +02:00
Aliaksandr Valialkin
05fb08713c lib/promscrape/discovery/kubernetes: compare sorted sets of labels in tests
This should deflake tests where the order of labels isn't stable
2021-02-28 14:10:19 +02:00
Aliaksandr Valialkin
af8b7e8391 lib/promscrape: add missing startWatchersForRole() call at the beginning of apiWatcher.getLabelsForRole 2021-02-28 14:00:17 +02:00
Aliaksandr Valialkin
422b31de40 lib/promscrape/discovery/kubernetes: reload k8s resources on every error
This is needed for obtaining fresh resourceVersion
2021-02-27 01:47:27 +02:00
Aliaksandr Valialkin
a78948ae8b lib/promscrape: yet another typo fix after ed8441ec52 2021-02-26 23:35:47 +02:00
Aliaksandr Valialkin
9fa2632ac3 lib/promscrape: typo fix after ed8441ec52 2021-02-26 23:04:05 +02:00
Aliaksandr Valialkin
ed8441ec52 lib/promscrape: cache ScrapeWork
This should reduce the time needed for updating big number of scrape targets.
2021-02-26 21:43:22 +02:00
Aliaksandr Valialkin
815666e6a6 lib/promscrape/discovery/kubernetes: cache target labels
This should reduce CPU usage on repeated SDConfig.GetLabels() calls.
2021-02-26 20:23:28 +02:00
Aliaksandr Valialkin
19712fc2bd lib/promscrape/discovery/kubernetes: errcheck fix 2021-02-26 17:00:08 +02:00
Aliaksandr Valialkin
c8f2f9b2e8 lib/promscrape: cleanup after 9b2246c29b
Main points:

* Revert changes outside lib/promscrape/discovery/kuberntes . These changes can be applied later in a separate commit
* Minimize changes in lib/promscrape/discovery/kubernetes compared to a93e644001
* Corner case fixes.
2021-02-26 16:54:05 +02:00
Nikolay
9b2246c29b
vmagent kubernetes watch stream discovery. (#1082)
* started work on sd for k8s

* continue work on watch sd

* fixes

* continue work

* continue work on sd k8s

* disable gzip

* fixes typos

* log errror

* minor fix

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-02-26 16:46:13 +02:00
Aliaksandr Valialkin
a93e644001 lib/promscrape: remove duplicate code a bit 2021-02-26 16:39:56 +02:00
Aliaksandr Valialkin
f7b242540b lib/promscrape: reduce processing time for big number of discovered targets by processing them in parallel 2021-02-26 16:39:56 +02:00
Aliaksandr Valialkin
d136081040 lib/promrelabel: add more optimizations for relabeling for common cases 2021-02-22 16:33:55 +02:00
Aliaksandr Valialkin
ff5bbc4b88 lib/promscrape: export vm_promscrape_target_relabel_duration_seconds metric 2021-02-21 23:21:42 +02:00
Aliaksandr Valialkin
2cfb376945 lib/promscrape: typo fix after the commit f26162ec99 2021-02-19 00:33:37 +02:00
Aliaksandr Valialkin
f26162ec99 lib/promscrape: add scrape_align_interval config option into scrape config
This option allows aligning scrapes to a particular intervals.
2021-02-18 23:53:44 +02:00
Aliaksandr Valialkin
38d7e96602 lib/promscrape/discovery/kubernetes: add __meta_kubernetes_endpoints_label_* and __meta_kuberntes_endpoints_annotation_* labels to role: endpoints
This syncs kubernetes SD with Prometheus 2.25
See 617c56f55a
2021-02-15 02:51:16 +02:00
Aliaksandr Valialkin
80dc74dbc1 lib/promscrape: remove vm_promscrape_scrapes_failed_per_url_total and vm_promscrape_scrapes_skipped_by_sample_limit_per_url_total metrics
These metrics may result in big number of time series when vmagent scrapes thousands of targets and these targets constantly changes.

* It is better using `up == 0` query for determining failing targets.
* It is better using the following query for determining targets with exceeded limit on the number of metrics:

  scrape_samples_scraped > 0 if up == 0
2021-02-12 05:26:04 +02:00
Nikolay
48c8c5093b
fixes dockerswarm (#1053)
fixes improper usage of host network services
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1028
2021-02-04 15:56:42 +02:00
Aliaksandr Valialkin
4068f8d590 lib/promscrape: add vm_promscrape_service_discovery_duration_seconds metric 2021-02-02 16:15:25 +02:00
Aliaksandr Valialkin
bd11fd8f1d lib/promscrape: add vm_promscrape_scrape_retries_total, vm_promscrape_discovery_retries_total and vm_promscrape_discovery_requests_total metrics 2021-02-01 20:06:27 +02:00
Aliaksandr Valialkin
fc5b26d856 lib/promscrape: export vm_promscrape_scrapes_failed_per_url_total and vm_promscrape_scrapes_skipped_by_sample_limit_per_url_total metrics
These metrics could be useful for determining imporperly working scrape targets.
Note that these metrics are exported only for failing scrape targets. They aren't exposed for normally working targets.
2021-01-27 00:39:26 +02:00
Aliaksandr Valialkin
de3c662e8a all: consistently use timers from timerpool 2021-01-27 00:39:26 +02:00
Aliaksandr Valialkin
8cea3c3cc4 lib/promscrape: retry scrape and service discovery requests when the remote server closes http keep-alive connection 2021-01-22 13:22:33 +02:00
Nikolay
7976c22797
Fixes error handling for promscrape.streamParse (#1009)
properly return error if client cannot read data,
properly suppress scraper errors
2021-01-12 13:31:47 +02:00
Aliaksandr Valialkin
2c44f9989a lib/promscrape: properly show scrape duration on /targets page
Previously it has been shown as 0.000s for any scrape duration.
2021-01-11 21:14:46 +02:00
Nikolay
b21e16ad0c
fixes kubernetes_sd (#983)
* fixes kubernetes_sd,
adds missing service metadata for pod ports without endpoint
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/982

* fix test
2020-12-24 11:26:14 +02:00
Aliaksandr Valialkin
820669da69 lib/promscrape: code prettifying for 8dd03ecf19 2020-12-24 10:56:10 +02:00
Nikolay
8dd03ecf19
adds proxy_url support, (#980)
* adds proxy_url support,
adds proxy_url to the dockerswarm, eureka, kubernetes and consul service discovery,
adds proxy_url to the scrape_config for targets scrapping,
http based proxy is supported atm,
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/503

* fixes imports
2020-12-24 10:52:37 +02:00
Aliaksandr Valialkin
2dfa746c91 lib/promscrape: remove ID field from ScrapeWork struct. Use a pointer to ScrapeWork as a key in targetStatusMap
This simplifies the code a bit.
2020-12-17 14:32:56 +02:00
Aliaksandr Valialkin
a4c7fcb5e1 lib/promscrape: properly remove deleted target from /targets page
Previously `sw` variable wasn't captured correctly by the started goroutine.
2020-12-15 20:57:09 +02:00
Aliaksandr Valialkin
b730fc2667 lib/promscrape: properly handle scrape errors when stream parsing is enabled
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/967
2020-12-15 14:08:28 +02:00
Aliaksandr Valialkin
60fcac4878 lib/promscrape: add bootstrap styles to /targets html page 2020-12-15 12:37:56 +02:00
Aliaksandr Valialkin
5af2a9ca0e lib/promscrape: formatting fixes for /tarets page 2020-12-15 11:59:04 +02:00
Aliaksandr Valialkin
020917949b lib/promscrape: formatting fixes for /targets page 2020-12-15 11:24:18 +02:00
Aliaksandr Valialkin
ae972429c7 lib/promscrape: add missing whitespace between duration and ago word at /targets page 2020-12-14 14:19:58 +02:00
Aliaksandr Valialkin
069c9ade52 app/{vmagent,vminsert}: follow-up for ce8c2dd1f1: return /targets page in HTML when requested via web browser 2020-12-14 14:06:00 +02:00
Nikolay
ce8c2dd1f1
Changes targets api (#961)
* changes /targets api
adds html response if requester accepts text/html,
adds quick template for /targets api,
fixes pathPrefix for / requests

* changes namings

* renamed targets file

* Update app/victoria-metrics/main.go

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>

* adds trimspace to qtpl,
moves content-type for targets response closer to writer

* fixes bug with prefix

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-12-14 13:36:48 +02:00
Aliaksandr Valialkin
5f9d88a3cb lib/promscrape/discovery/consul: reduce load on Consul API server by increasing timeout for blocking requests from 50 seconds to 9 minutes
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574
2020-12-11 17:24:13 +02:00
Aliaksandr Valialkin
fd9fd191b9 lib/promscrape/discovery/consul: properly pass Datacenter filter to Consul API server
Previously it has been passed as `sdc` query arg, while it should be passed as `dc` query arg.
See https://www.consul.io/api-docs/health#list-nodes-for-service for details.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574#issuecomment-740454170
2020-12-08 21:52:42 +02:00
Aliaksandr Valialkin
364f30a6e7 lib/promscrape: store ScrapeWork items by pointer in the slice returned from get*ScrapeWork()
This should prevent from possible 'memory leaks' when a pointer to ScrapeWork item stored in the slice
could prevent from releasing memory occupied by all the ScrapeWork items stored in the slice when they
are no longer used.

See the related commit e205975716 and the related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825
2020-12-08 17:50:05 +02:00