- Make sure that invalid/missing TLS CA file or TLS client certificate files at vmagent startup
don't prevent from processing the corresponding scrape targets after the file becomes correct,
without the need to restart vmagent.
Previously scrape targets with invalid TLS CA file or TLS client certificate files
were permanently dropped after the first attempt to initialize them, and they didn't
appear until the next vmagent reload or the next change in other places of the loaded scrape configs.
- Make sure that TLS CA is properly re-loaded from file after it changes without the need to restart vmagent.
Previously the old TLS CA was used until vmagent restart.
- Properly handle errors during http request creation for the second attempt to send data to remote system
at vmagent and vmalert. Previously failed request creation could result in nil pointer dereferencing,
since the returned request is nil on error.
- Add more context to the logged error during AWS sigv4 request signing before sending the data to -remoteWrite.url at vmagent.
Previously it could miss details on the source of the request.
- Do not create a new HTTP client per second when generating OAuth2 token needed to put in Authorization header
of every http request issued by vmagent during service discovery or target scraping.
Re-use the HTTP client instead until the corresponding scrape config changes.
- Cache error at lib/promauth.Config.GetAuthHeader() in the same way as the auth header is cached,
e.g. the error is cached for a second now. This should reduce load on CPU and OAuth2 server
when auth header cannot be obtained because of temporary error.
- Share tls.Config.GetClientCertificate function among multiple scrape targets with the same tls_config.
Cache the loaded certificate and the error for one second. This should significantly reduce CPU load
when scraping big number of targets with the same tls_config.
- Allow loading TLS certificates from HTTP and HTTPs urls by specifying these urls at `tls_config->cert_file` and `tls_config->key_file`.
- Improve test coverage at lib/promauth
- Skip unreachable or invalid files specified at `scrape_config_files` during vmagent startup, since these files may become valid later.
Previously vmagent was exitting in this case.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4959
* fix inconsistent behaviors with prometheus when scraping
1. address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4959. skip job with wrong syntax in `scrape_configs` with error logs instead of exiting;
2. show error messages on vmagent /targets ui if there are wrong auth configs in `scrape_configs`, previously will print error logs and do scrape without auth header;
3. don't send requests if there are wrong auth configs in:
1. vmagent remoteWrite;
2. vmalert datasource/remoteRead/remoteWrite/notifier.
* add changelogs
* address review comments
* fix ut
Initially, stream parse mode was reading data from response and parsing it on flight. This was causing longer delay to read the whole response and required increasing timeout value to allow data processing while reading. So that 908e35affd increased timeout value to fix this.
But after 74c00a8762 response in stream parse mode is saved into memory and then parsed eliminating necessity of having timeout value higher that for usual scrape.
Updates: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4847
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
- Return immediately on context cancel during the backoff sleep.
This should help with https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3747
- Add a comment describing why the second attempt to obtain the response from remote side
is perfromed immediately after the first attempt.
- Remove fasthttp dependency from lib/promscrape/discoveryutils
- Set context deadline before calling doRequestWithPossibleRetry().
This simplifies the doRequestWithPossibleRetry() a bit.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3293
* fix: do not use exponential backoff for first retry of scrape request (#3293)
* lib/promscrape: refactor `doRequestWithPossibleRetry` backoff to simplify logic
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
* Update lib/promscrape/client.go
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
* lib/promscrape: refactor `doRequestWithPossibleRetry` to make it more straightforward
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
---------
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
This fixes handling of values bigger than 2GiB for the following command-line flags:
- -storage.minFreeDiskSpaceBytes
- -remoteWrite.maxDiskUsagePerURL
- Return meta-labels for the discovered targets via promutils.Labels
instead of map[string]string. This improves the speed of generating
meta-labels for discovered targets by up to 5x.
- Remove memory allocations in hot paths during ScrapeWork generation.
The ScrapeWork contains scrape settings for a single discovered target.
This improves the service discovery speed by up to 2x.
ioutil.ReadAll is deprecated since Go1.16 - see https://tip.golang.org/doc/go1.16#ioutil
VictoriaMetrics requires at least Go1.18, so it is OK to switch from ioutil.ReadAll to io.ReadAll.
This is a follow-up for 02ca2342ab
The 429 status code means that the server is overwhelmed with requests.
The client can retry the request after some wait time.
Implement this strategy for service discovery and scrape requests.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2940
Previously ScrapeConfig.clone() was improperly copying promauth.Secret fields -
their contents was replaced with `<secret>` value.
This led to inability to use passwords and secrets in `-promscrape.config` file.
The bug has been introduced in v1.77.0 in the commit 67b10896d2
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2551
Stream parsing mode can be automatically enabled when scraping targets with big response bodies
exceeding the -promscrape.minResponseSizeForStreamParse , so it must be always initialized.