- Use bytesutil.InternString() instead of strings.Clone() for inputKey and outputKey in aggregatorpushSamples().
This should reduce string allocation rate, since strings can be re-used between aggrState flushes.
- Reduce memory allocations at dedupAggrShard by storing dedupAggrSample by value in the active series map.
- Remove duplicate call to bytesutil.InternBytes() at Deduplicator, since it is already called inside dedupAggr.pushSamples().
- Add missing string interning at rateAggrState.pushSamples().
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6402
The main change is getting rid of interning of sample key. It was
discovered that for cases with many unique time series aggregated by
vmagent interned keys could grow up to hundreds of millions of objects.
This has negative impact on the following aspects:
1. It slows down garbage collection cycles, as GC has to scan all inuse
objects periodically. The higher is the number of inuse objects, the
longer it takes/the more CPU it takes.
2. It slows down the hot path of samples aggregation where each key
needs to be looked up in the map first.
The change makes code more fragile, but suppose to provide performance
optimization for heavy-loaded vmagents with stream aggregation enabled.
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
### Describe Your Changes
Added streamaggr metrics to:
- `vm_streamaggr_samples_lag_seconds` - samples lag
- `vm_streamaggr_ignored_samples_total{reason="nan"}` - ignored NaN
samples
- `vm_streamaggr_ignored_samples_total{reason="too_old"}` - ignored old
samples
(cherry picked from commit 185fac03b3)
### Describe Your Changes
Fix for issue #6396: according to rmdir manpage, ENOTEMPTY and EEXIST
should be treated equally
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6396
### Checklist
The following checks are **mandatory**:
- [x ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
---------
Co-authored-by: Ludovic Pollet <ludovic.pollet@exfo.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 3ddae77c63)
Unsupported path is already handled by `lib/httpserver`.
This prevents from misleading errors in logs caused by double-writing response headers.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit a5f81f67fd)
### Describe Your Changes
Scratch based images will be using a separate tag: "(version)-scratch"
and will be built for the same architecture as regular images.
This is useful for environments with higher security standards. In this
case using alpine as base layer requires updating images more frequently
in order to get the latest updates for the base image, even in case the
user did not need to update VictoriaMetrics version.
Tested that scratch images work for:
- vmagent - enterprise with kafka and opensource
- cluster
- single-node
No issues observed so far.
cc: @tenmozes
### Checklist
The following checks are **mandatory**:
- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
---------
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 7dc9124ba7)
* "*.idleConnTimeout" flags must reduce probability of `write: broken
pipe` and `read: connection reset by peer` errors Those errors may occur
if remote server closes TCP socket for connection, while it's still
exist at client.
* single time retries for `write: broken pipe` and `read: connection
reset by peer` must handle a case for incorrectly configured timeouts at
middleware proxies, mitigate minor network issues.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5661
### Describe Your Changes
Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.
---------
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
(cherry picked from commit b97916276f)
It allows to create alert for possible item drops at indexdb. It may
happen, if ingested metric size exceeds max indexdb item size.
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 69d244e6fb)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* It must reduce memory usage for misbehaving clients. Since
VictoriaMetrics stores sparse index inmemory.
* Reduce disk space usage for indexdb.
* Prevent possible indexDB items drops.
* It may trigger slow insert and new timeseries registration due to
default value for flag change
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6176
---------
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
This should improve precision of `restarts` and `version change` annotations when
zooming-in/zooming-out on the dashboards.
The change also makes `restarts` dashboard visible on the panels, so user can disable it from
displaying if needed. This could be useful when restarts overlap with version change events.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 9dd9b4442f)
* The new flag can be used for for skipping TLS certificates
verification when connecting to S3 endpoint. Affects vmbackup,
vmrestore, vmbackupmanager.
* replace deprecated `EndpointResolver` with `BaseEndpoint`
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1056
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit ac836bcf6c)
The error appears when executing annotations query against Prometheus backend
because the query itself hasn't specified look-behind window (which is allowed
in VictoriaMetrics query engine).
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6309
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit c746ba154d)
Prevent excessive resource usage when stream aggregation config file
contains no matchers by prevent pushing data into Aggregators object.
Before this change a lot of extra work was invoked without reason.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 7ce052b32d)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Occasionally, vmagent sends empty blocks to downstream servers. If a
downstream server returns an unexpected response, vmagent gets stuck in
a retry loop. While vmagent handles 400 and 409 errors, there are
various prometheus remote write implementations that return different
error codes. For example, vector returns a 422 error. To mitigate the
risk of vmagent getting stuck in a retry loop, it is advisable to skip
sending empty blocks to downstream servers.
Co-authored-by: hao.peng <hao.peng@smartx.com>
Co-authored-by: Zhu Jiekun <jiekun.dev@gmail.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 3661373cc2)
Allocations are reduced by implementing custom json parser via fastjson
lib.
The change also re-uses `promInstant` object in attempt to reduce number
of
allocations when parsing big responses, as usually happens with heavy
recording rules.
```
name old allocs/op new allocs/op delta
ParsePrometheusResponse/Instant-10 9.65k ± 0% 5.60k ± 0% ~ (p=1.000 n=1+1)
```
Signed-off-by: hagen1778 <roman@victoriametrics.com>
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 4f0525852f)
it's needed to remove Summary metric type from the global state of
metrics package. metrics package tracks each bucket of summary and
periodically swaps old buckets with new.
Simple set unregister is not enough to release memory used by Set
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6247
(cherry picked from commit 6a6e34ab8e)
Added `rate` and `rate_avg` output
Resource usage is the same as for increase output, tested on a benchmark
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 9c3d44c8c9)
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html):
allow configuring `-remoteWrite.disableOnDiskQueue` and
`-remoteWrite.dropSamplesOnOverload` cmd-line flags per each
`-remoteWrite.url`. See this [pull
request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6065).
Thanks to @rbizos for implementaion!
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add
labels `path` and `url` to metrics
`vmagent_remotewrite_push_failures_total` and
`vmagent_remotewrite_samples_dropped_total`. Now number of failed pushes
and dropped samples can be tracked per `-remoteWrite.url`.
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Raphael Bizos <r.bizos@criteo.com>
(cherry picked from commit 87fd400dfc)
Set correct suffix `<output>_prometheus` for aggregation outputs
`increase_prometheus` and `total_prometheus`
Before, outputs `total` and `total_prometheus` or `increase` and
`increase_prometheus` had the same suffix.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 8a03e987cb)
Though labels compressor is quite resource intensive, each aggregator
and deduplicator instance has it's own compressor. Made it shared across
all aggregators to consume less resources while using multiple
aggregators.
Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
(cherry picked from commit a9283e06a3)
### Describe Your Changes
related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6041
#### Added
- Added service discovery support for Vultr.
#### Docs
- `CHANGELOG.md`, `sd_configs.md`, `vmagent.md` are updated.
#### Note
- Useful links:
- Vultr API:
https://www.vultr.com/api/#tag/instances/operation/list-instances
- Vultr client SDK: https://github.com/vultr/govultr
- Prometheus SD:
https://github.com/prometheus/prometheus/tree/main/discovery/vultr
---
### Checklist
The following checks are mandatory:
- [X] I have read the [Contributing
Guidelines](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/CONTRIBUTING.md)
- [x] All commits are signed and include `Signed-off-by` line. Use `git
commit -s` to include `Signed-off-by` your commits. See this
[doc](https://git-scm.com/book/en/v2/Git-Tools-Signing-Your-Work) about
how to sign your commits.
- [x] Tests are passing locally. Use `make test` to run all tests
locally.
- [x] Linting is passing locally. Use `make check-all` to run all
linters locally.
Further checks are optional for External Contributions:
- [X] Include a link to the GitHub issue in the commit message, if issue
exists.
- [x] Mention the change in the
[Changelog](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/docs/CHANGELOG.md).
Explain what has changed and why. If there is a related issue or
documentation change - link them as well.
Tips for writing a good changelog message::
* Write a human-readable changelog message that describes the problem
and solution.
* Include a link to the issue or pull request in your changelog message.
* Use specific language identifying the fix, such as an error message,
metric name, or flag name.
* Provide a link to the relevant documentation for any new features you
add or modify.
- [ ] After your pull request is merged, please add a message to the
issue with instructions for how to test the fix or try the feature you
added. Here is an
[example](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4048#issuecomment-1546453726)
- [x] Do not close the original issue before the change is released.
Please note, in some cases Github can automatically close the issue once
PR is merged. Re-open the issue in such case.
- [x] If the change somehow affects public interfaces (a new flag was
added or updated, or some behavior has changed) - add the corresponding
change to documentation.
Signed-off-by: Jiekun <jiekun.dev@gmail.com>
(cherry picked from commit 17e3d019d2)
When at least one remote write has deduplication configured it cleans up
timeseries while they can be in use by another remote write without
deduplication
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6205
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 879771808b)
The panel will show how many ongoing select queries are processed by vmstorage
and should help to identify resource bottlenecks. See panel description for more details.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit d386a68b59)
Allowing to select multiple instance IPs makes it much easier to view
metrics for longer periods of time in dynamic environments such as
Kubernetes. In k8s update will also cause IP to change making it harder
to use dashboard to check the status.
See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5869
---------
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 6b493582da)