Missing cherry-picked commits:
- 30b61c6d8a
- a335ed23c7
These commits cannot be cherry-picked now because of too much churn in docs/anomaly-detection after these commits.
'authKey' is a well-known URL and form parameter used for authorizing
VictoriaMetrics components. Previously, it could be printed to stdout
via the httpserver error logger, which makes the authKey insecure and
hard to use.
This commit prevents logging of the authKey defined in PostForm or as
part of url.Query.
It's recommended to transfer the authKey via PostForm; this should be
implemented in separate PRs.
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5973
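A minimal sketch of the redaction idea, assuming a hypothetical redactAuthKey helper (illustrative only, not the actual httpserver code):
```go
package main

import (
	"fmt"
	"net/url"
)

// redactAuthKey hides the authKey query arg value so the request URI
// can be safely passed to the error logger.
func redactAuthKey(requestURI string) string {
	u, err := url.Parse(requestURI)
	if err != nil {
		return requestURI
	}
	q := u.Query()
	if q.Has("authKey") {
		q.Set("authKey", "secret")
		u.RawQuery = q.Encode()
	}
	return u.String()
}

func main() {
	fmt.Println(redactAuthKey("/snapshot/create?authKey=top-secret"))
	// Prints: /snapshot/create?authKey=secret
}
```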
---------
Signed-off-by: f41gh7 <nik@victoriametrics.com>
Previously, if ProfileName was set to an empty value (the default), the AWS S3
library ignored any profile config defined with `-configProfilePath`.
This commit correctly configures client options and sets the profile name only
if it is set to a non-empty value.
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8668
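A minimal sketch of the intended behaviour with the AWS SDK for Go v2 (the wiring below is illustrative, not the actual vmbackup code): the shared-config profile is set only when it is non-empty, so the SDK no longer ignores the config file pointed to by `-configProfilePath`.
```go
package s3config

import (
	"context"
	"log"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
)

func loadS3Config(ctx context.Context, profileName, profilePath string) aws.Config {
	var opts []func(*config.LoadOptions) error
	if profilePath != "" {
		opts = append(opts, config.WithSharedConfigFiles([]string{profilePath}))
	}
	// Set the profile name only if it is non-empty; an empty profile name
	// must not override the profile selection from the shared config files.
	if profileName != "" {
		opts = append(opts, config.WithSharedConfigProfile(profileName))
	}
	cfg, err := config.LoadDefaultConfig(ctx, opts...)
	if err != nil {
		log.Fatalf("cannot load AWS S3 config: %s", err)
	}
	return cfg
}
```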
Currently, requests failing due to a network timeout receive a "200
OK" response while only producing a warning log message about the timeout. This
behaviour is confusing and might produce unexpected issues, as it is not
possible to retry such errors properly.
Change this to return a "502 Bad Gateway" response so that the error can be
handled by the client.
See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8621
Config for testing:
```
unauthorized_user:
url_prefix: "http://example.com:9800"
```
Before the change:
```
* Trying 127.0.0.1:8427...
* Connected to 127.0.0.1 (127.0.0.1) port 8427
* using HTTP/1.x
> HEAD /api/v1/query HTTP/1.1
> Host: 127.0.0.1:8427
> User-Agent: curl/8.12.1
> Accept: */*
>
* Request completely sent off
/* NOTE: 30 seconds timeout passes */
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< Vary: Accept-Encoding
Vary: Accept-Encoding
< X-Server-Hostname: pc
X-Server-Hostname: pc
< Date: Tue, 01 Apr 2025 08:54:05 GMT
Date: Tue, 01 Apr 2025 08:54:05 GMT
<
* Connection #0 to host 127.0.0.1 left intact
```
After:
```
* Trying 127.0.0.1:8427...
* Connected to 127.0.0.1 (127.0.0.1) port 8427
* using HTTP/1.x
> HEAD /api/v1/query HTTP/1.1
> Host: 127.0.0.1:8427
> User-Agent: curl/8.12.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 502 Bad Gateway
HTTP/1.1 502 Bad Gateway
< Content-Type: text/plain; charset=utf-8
Content-Type: text/plain; charset=utf-8
< Vary: Accept-Encoding
Vary: Accept-Encoding
< X-Content-Type-Options: nosniff
X-Content-Type-Options: nosniff
< X-Server-Hostname: pc
X-Server-Hostname: pc
< Date: Tue, 01 Apr 2025 09:13:57 GMT
Date: Tue, 01 Apr 2025 09:13:57 GMT
< Content-Length: 109
Content-Length: 109
<
* Connection #0 to host 127.0.0.1 left intact
```
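A minimal sketch of the new behaviour using the standard library reverse proxy (illustrative only, not the actual vmauth code): upstream failures such as network timeouts are turned into a 502 response instead of a silent 200.
```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
)

func main() {
	upstream, err := url.Parse("http://example.com:9800")
	if err != nil {
		log.Fatalf("cannot parse upstream URL: %s", err)
	}
	proxy := httputil.NewSingleHostReverseProxy(upstream)
	// ErrorHandler is invoked when the request to the upstream fails,
	// e.g. on a network timeout. Return 502 so the client can retry.
	proxy.ErrorHandler = func(w http.ResponseWriter, r *http.Request, err error) {
		log.Printf("error proxying %q to %q: %s", r.URL.Path, upstream, err)
		http.Error(w, "remote error: "+err.Error(), http.StatusBadGateway)
	}
	log.Fatal(http.ListenAndServe(":8427", proxy))
}
```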
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
### Describe Your Changes
Log when the data response from vmselect is partial during
rule (recording rule, alerting rule) evaluations.
vmselect returns `isPartial: true` when data is not fully fetched
from the scattered vmstorage nodes. At rule evaluation time, the result may
drift from the real values due to the missing points. This is an important event
that should be logged so users can see how often it happens, since
it may lead to false positive alerts.
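A minimal sketch of the idea, with illustrative type and field names rather than the actual vmalert code:
```go
package rule

import "log"

// queryResponse mirrors the relevant part of the vmselect JSON response.
type queryResponse struct {
	IsPartial bool `json:"isPartial"`
	// ... other fields omitted ...
}

// logIfPartial warns when a rule was evaluated against partial data, since
// the result may drift from the real values and cause false positive alerts.
func logIfPartial(resp *queryResponse, ruleName string) {
	if resp.IsPartial {
		log.Printf("WARNING: partial response received when evaluating rule %q; "+
			"some vmstorage nodes did not return data", ruleName)
	}
}
```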
### Checklist
The following checks are **mandatory**:
- [x] My change adheres to the [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
---------
Signed-off-by: emreya <emre.yazici@adyen.com>
Signed-off-by: emreya <e.yazici1990@gmail.com>
Signed-off-by: Emre Yazici <e.yazici1990@gmail.com>
When creating and listing snapshots, panic instead of returning an error,
since such errors are not recoverable anyway.
Also do not clean up the filesystem on panic. Leave it as is for further
manual inspection.
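A minimal sketch of this error-handling policy (function and paths are illustrative, not the actual storage code):
```go
package snapshot

import (
	"fmt"
	"os"
)

// mustCreateSnapshotDir panics on failure instead of returning an error,
// since such failures aren't recoverable anyway.
func mustCreateSnapshotDir(dstDir string) {
	if err := os.MkdirAll(dstDir, 0o755); err != nil {
		// Intentionally no filesystem cleanup here - whatever was created
		// is left as is for further manual inspection.
		panic(fmt.Errorf("FATAL: cannot create snapshot directory %q: %w", dstDir, err))
	}
}
```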
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
This reduces CPU usage by up to 30% in exchange for a 10% increase in RAM usage
when scraping thousands of targets that expose millions of metrics in total.
This looks like a good tradeoff after the commit edac875179,
which reduced RAM usage by more than 10%, so the final RAM usage for vmagent
is still ~15% lower than at v1.114.0, while CPU usage drops by 30%.
This reduces memory usage when reading large response bodies, because the underlying buffer
doesn't need to be re-allocated while a large response body is read into it.
Also decompress the response body under processScrapedDataConcurrencyLimitCh.
This reduces CPU and RAM usage a bit when scraping thousands of targets.
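A minimal sketch of the buffer-reuse idea (illustrative only, not the actual vmagent code): the buffer is grown to the expected body size up front, so it is not re-allocated while a large response body is being read.
```go
package scrape

import (
	"bytes"
	"net/http"
)

// readBody reads the whole response body into a buffer that is pre-grown
// to the expected size, avoiding repeated re-allocations for large bodies.
func readBody(resp *http.Response) ([]byte, error) {
	var bb bytes.Buffer
	if resp.ContentLength > 0 {
		bb.Grow(int(resp.ContentLength))
	}
	if _, err := bb.ReadFrom(resp.Body); err != nil {
		return nil, err
	}
	return bb.Bytes(), nil
}
```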
This reduces memory usage for vmagent when scraping a big number of targets at the cost of slightly higher CPU usage.
The increased CPU usage can be reduced by disabling the tracking of stale markers, either via the -promscrape.noStaleMarkers
command-line flag or via the `no_stale_markers: true` option in the scrape config pointed to by the -promscrape.config command-line flag.
See https://docs.victoriametrics.com/vmagent/#prometheus-staleness-markers
The pool is used mostly for obtaining byte buffers for responses from scrape targets.
In practice there are no responses smaller than 256 bytes, so there is no sense in maintaining
pools for byte slices of up to 64 and 128 bytes.
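A minimal sketch of the pooling idea (illustrative only, not the actual pool code): the smallest pooled capacity is 256 bytes, since smaller scrape responses don't occur in practice.
```go
package bufpool

import "sync"

const minBufCap = 256

var pool = sync.Pool{
	New: func() any {
		return make([]byte, 0, minBufCap)
	},
}

// Get returns a zero-length byte slice with at least minBufCap capacity.
func Get() []byte {
	return pool.Get().([]byte)
}

// Put returns b to the pool. Buffers smaller than minBufCap are dropped,
// since there is no sense in pooling them.
func Put(b []byte) {
	if cap(b) < minBufCap {
		return
	}
	pool.Put(b[:0]) // note: a sketch; boxing the slice into an interface allocates a small header
}
```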
Previously, maxLabelsLen could be updated with a smaller value after it had already been updated to a bigger value by concurrently running goroutines.
Prevent this by loading the latest maxLabelsLen value and updating it only if it is smaller than the current len(wc.labels)
before exiting the callback passed to stream.Parse.
While at it, return early from the callback on the sample_limit exceeded error,
since the rest of the code in the callback becomes a no-op after wc.reset().
This makes the logic in the code a bit easier to follow.
Also remove the outdated, misleading comment in front of the sw.pushData() call inside the callbacks passed to stream.Parse.
This comment makes no sense now that every callback works with its own goroutine-local wc.
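A minimal sketch of the "update only if bigger" pattern for a shared maximum (names are illustrative, not the actual stream.Parse callback code):
```go
package scrape

import "sync/atomic"

var maxLabelsLen atomic.Int64

// updateMaxLabelsLen records n as the new maximum only if it is bigger than
// the currently stored value, so concurrent goroutines cannot overwrite a
// bigger maximum with a smaller one.
func updateMaxLabelsLen(n int64) {
	for {
		current := maxLabelsLen.Load()
		if n <= current {
			return
		}
		if maxLabelsLen.CompareAndSwap(current, n) {
			return
		}
		// Another goroutine updated the value concurrently - reload and retry.
	}
}
```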