Commit graph

2130 commits

Author SHA1 Message Date
Zhu Jiekun
b37b288dce
docs: [vmagent] Add CHANGELOG for Statsd support in v1.102.0-rc1 (#6494)
### Describe Your Changes
Add CHANGELOG for Statsd support in v1.102.0-rc1
- CHANGELOG is missing for #5053.
- It would be better to include it so that users can be aware of when
this feature is released, rather than attempting to use it with lower
version VM components. See discussion
[here](https://victoriametrics.slack.com/archives/CGZF1H6L9/p1718462859065049?thread_ts=1718451117.668789&cid=CGZF1H6L9).

### Checklist

The following checks are **mandatory**:

- [X] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
2024-06-16 05:05:06 -07:00
hagen1778
da4fbf61a4
docs: update wording for 6e395048d3
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-06-14 15:45:26 +02:00
Hui Wang
6e395048d3
app/vmselect: fix the way of counting raw samples in single query (#6464)
The limit is specified with command-line flag
`-search.maxSamplesPerQuery`.
Previously, samples might be over-counted and query can't be fixed by
reducing time range.
address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5851
2024-06-14 15:40:30 +02:00
Andrii Chubatiuk
faf67aa8b5
lib/flagutil: use month limit for duration flag for parsed duration assessment (#6486)
use maxMonths limit for parsed duration flag value

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6330

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2024-06-14 15:20:21 +02:00
jackyin
5223981fed
app/vmalert: fix VMAlert oauth2 error (#6478)
Properly set ClientSecret param for notifier.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6471

---------

Co-authored-by: hagen1778 <roman@victoriametrics.com>
2024-06-14 15:06:14 +02:00
hagen1778
de07589bf1
app/vmalert: properly configure authentication with S3 when -s3.configFilePath is specified.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-06-14 14:23:01 +02:00
Andrii Chubatiuk
e678a9aa51
lib/backup/s3remote: fixed credsFilePath flag (#6488)
properly use credsFilePath flag value

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6353

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2024-06-14 14:13:02 +02:00
Andrii Chubatiuk
eea361defb
app/vmalert: fixed path prefixes for system routes (#6435)
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6433

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2024-06-14 13:34:23 +02:00
Roman Khavronenko
51d19485bb
lib/streamaggr: prevent rate_sum and rate_avg from producing NaNs (#6482)
### Describe Your Changes

* check if `lastValue` was seen at least twice with different
timestamps. Otherwise, the difference between last timestamp and
previous timestamp could be `0` and will result into `NaN` calculation
* check if there items left in lastValue map after staleness cleanup.
Otherwise, `rate_avg` could have produce `NaN` result.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-06-14 10:06:22 +02:00
Zakhar Bessarab
34071ac660
lib/promscrape: increase default value for promscrape.maxDroppedTargets to 10_000 (#6459)
### Describe Your Changes
This limit can be increased since after
4513893ead
tracking of dropped targets uses much less memory per entry.

See:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6381#issuecomment-2156708228


### Checklist

The following checks are **mandatory**:

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2024-06-12 16:34:18 +02:00
LHHDZ
3a45bbb4e0
app/vmauth: fix discovering backend IPs when url_prefix contains hostname with srv+ prefix (#6401)
This change fixes the following panic:
```
2024-06-04T11:16:52.899Z        warn    app/vmauth/auth_config.go:353   cannot discover backend SRV records for http://srv+localhost:8080: lookup localhost on 10.100.10.4:53: server misbehaving; use it literally
panic: runtime error: integer divide by zero

goroutine 9 [running]:
github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver.handlerWrapper.func1()
        /Users/lhhdz/wd/projects/go/VictoriaMetrics/lib/httpserver/httpserver.go:291 +0x58
panic({0x103115100?, 0x10338d700?})
        /Users/lhhdz/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.3.darwin-arm64/src/runtime/panic.go:770 +0x124
main.getLeastLoadedBackendURL({0x0?, 0x22?, 0x1400014757b?}, 0x1400013c120?)
        /Users/lhhdz/wd/projects/go/VictoriaMetrics/app/vmauth/auth_config.go:473 +0x210
main.(*URLPrefix).getBackendURL(0x140000aa080)
        /Users/lhhdz/wd/projects/go/VictoriaMetrics/app/vmauth/auth_config.go:312 +0xb8
```

---------

Co-authored-by: Haley Wang <haley@victoriametrics.com>
2024-06-12 12:30:44 +02:00
Nikolay
33d07e915f
follow-up docs update after 77f22fdb8d (#6454)
Signed-off-by: f41gh7 <nik@victoriametrics.com>
2024-06-11 11:27:53 +02:00
hagen1778
8d95522529
vmctl: rm --vm-disable-progress-bar flag
It is better to remove deprecated flag completely, so vmctl will
fail if this flag is used and user can immediately fix the issue.

Before, flag was ignored and it is worse then fail fast.

follow-up after 8b46bb0c41 (diff-2bfab3db5cc1baf4c6d3ff6b19901926e3bdf4411ec685dac973e5fcff1c723b)

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-06-10 14:02:46 +02:00
Nikolay
d44058bcd6
app/vmauth: adds idleConnTimeout flag, retry trivial errors (#6388)
* adds idleConnTimeout flag, which must reduce probability of `broken
pipe` and `connection reset` errors.
* one-time retry trivial network requests for the same backend

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2024-06-10 12:36:37 +02:00
Dmytro Kozlov
8b46bb0c41
vmctl: disable progress bar for prometheus snapshot migrations (#6385)
* deprecate `--vm-disable-progress-bar` in favour of `--disable-progress-bar`
* new `--disable-progress-bar` consistently disables usage of progress bar
for all migration modes.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6367

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2024-06-10 12:20:52 +02:00
Hui Wang
61dce6f2a1
lib/httpserver: allow reloadAuthKey and configAuthKey to override htt… (#6338)
…pAuth.*

address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6329, 
makes `reloadAuthKey`, `configAuthKey`, `flagsAuthKey`, `pprofAuthKey`
behavior the same way,
but keys like `-snapshotAuthKey`, `-forceMergeAuthKey` are still
protected by httpAuth.*. All the available key are listed in
https://docs.victoriametrics.com/single-server-victoriametrics/#security.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2024-06-10 12:09:47 +02:00
Andrii Chubatiuk
2da45a8368
vmagent: updated dashboard and alert for stream aggregation (#6427)
### Describe Your Changes

Added streaming aggregation section to vmagent dashboards
Added alert for streaming aggregation and deduplication flush timeouts
Removed deprecated compose versions from compose files


Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2024-06-10 11:49:00 +02:00
Aliaksandr Valialkin
e492bf0ae9
docs/CHANGELOG.md: document v1.93.15 LTS release
See https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.93.15
2024-06-07 23:43:07 +02:00
Aliaksandr Valialkin
914b9068a8
docs/CHANGELOG.md: add changelog for v1.97.5 LTS release
See https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.97.5
2024-06-07 20:03:10 +02:00
Aliaksandr Valialkin
09c7accac9
docs/CHANGELOG.md: cut v1.202.0-rc1 release 2024-06-07 16:53:42 +02:00
Aliaksandr Valialkin
36be090cd5
lib/streamaggr: follow-up for 7cb894a777
- Use bytesutil.InternString() instead of strings.Clone() for inputKey and outputKey in aggregatorpushSamples().
  This should reduce string allocation rate, since strings can be re-used between aggrState flushes.
- Reduce memory allocations at dedupAggrShard by storing dedupAggrSample by value in the active series map.
- Remove duplicate call to bytesutil.InternBytes() at Deduplicator, since it is already called inside dedupAggr.pushSamples().
- Add missing string interning at rateAggrState.pushSamples().

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6402
2024-06-07 16:27:26 +02:00
Roman Khavronenko
7cb894a777
lib/streamaggr: reduce number of inuse objects (#6402)
The main change is getting rid of interning of sample key. It was
discovered that for cases with many unique time series aggregated by
vmagent interned keys could grow up to hundreds of millions of objects.
This has negative impact on the following aspects:
1. It slows down garbage collection cycles, as GC has to scan all inuse
objects periodically. The higher is the number of inuse objects, the
longer it takes/the more CPU it takes.
2. It slows down the hot path of samples aggregation where each key
needs to be looked up in the map first.

The change makes code more fragile, but suppose to provide performance
optimization for heavy-loaded vmagents with stream aggregation enabled.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-06-07 15:45:52 +02:00
Roman Khavronenko
5f46f8a11d
lib/promrelabel: speedup label match by __name__ (#6432)
The change adds a fastpath for `equalValue` comparisons against
`__name__` label by avoiding calls to `toCanonicalLabelName` func. This
speedups matches by metric name like `'foo'`. See bench stats below:
```
benchcmp old.txt new.txt

benchmark                                           old ns/op     new ns/op     delta
BenchmarkIfExpression/equal_label:_last-10          35.6          35.1          -1.18%
BenchmarkIfExpression/equal_label:_middle-10        18.3          17.3          -5.41%
BenchmarkIfExpression/equal_label:_first-10         1.20          1.24          +2.74%
BenchmarkIfExpression/equal___name__:_last-10       10.1          4.96          -50.75%
BenchmarkIfExpression/equal___name__:_middle-10     5.79          3.16          -45.41%
BenchmarkIfExpression/equal___name__:_first-10      1.17          1.05          -9.76%
```

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-06-07 15:44:48 +02:00
Andrii Chubatiuk
185fac03b3
lib/streamaggr: metrics to track dropped, nan samples and samples lag (#6358)
### Describe Your Changes

Added streamaggr metrics to:
 - `vm_streamaggr_samples_lag_seconds` - samples lag
- `vm_streamaggr_ignored_samples_total{reason="nan"}` - ignored NaN
samples
- `vm_streamaggr_ignored_samples_total{reason="too_old"}` - ignored old
samples
2024-06-06 14:06:11 +02:00
Zhu Jiekun
215f36a402
docs: [deployment] update CHANGELOG.md to include go version change 1.22.4 (#6412)
### Describe Your Changes
Update CHANGELOG to include go version change introduced in
43cf221681

Also see:
- https://go.dev/doc/devel/release
-
https://github.com/golang/go/issues?q=milestone%3AGo1.22.4+label%3ACherryPickApproved

### Checklist

The following checks are **mandatory**:

- [X] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
2024-06-05 11:17:36 +04:00
pludov
3ddae77c63
lib/fs: support NFS implementations that return EEXIST instead of ENOTEMPTY (#6398)
### Describe Your Changes

Fix for issue #6396: according to rmdir manpage, ENOTEMPTY and EEXIST
should be treated equally

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6396

### Checklist

The following checks are **mandatory**:

- [x ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Co-authored-by: Ludovic Pollet <ludovic.pollet@exfo.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2024-06-04 15:17:38 +02:00
hagen1778
a5f81f67fd
app/vmalert: rm extra response for unsupported path
Unsupported path is already handled by `lib/httpserver`.
This prevents from misleading errors in logs caused by double-writing response headers.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-06-03 12:52:02 +02:00
Hui Wang
e3e40cb848
vmalert-tool: fix float values template in input_series (#6395)
address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6391

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2024-06-03 11:49:44 +02:00
Zakhar Bessarab
7dc9124ba7
deployment/docker: add scratch-based images (#6386)
### Describe Your Changes

Scratch based images will be using a separate tag: "(version)-scratch"
and will be built for the same architecture as regular images.
This is useful for environments with higher security standards. In this
case using alpine as base layer requires updating images more frequently
in order to get the latest updates for the base image, even in case the
user did not need to update VictoriaMetrics version.

Tested that scratch images work for:
- vmagent - enterprise with kafka and opensource
- cluster
- single-node

No issues observed so far.

cc: @tenmozes 

### Checklist

The following checks are **mandatory**:

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2024-06-03 11:43:28 +02:00
Arkadii Yakovets
c740a8042e
docs: fix docs/ and README.md spelling errors (#6362)
Fixes `docs/` and `README.md` typos and errors.

Signed-off-by: Arkadii Yakovets <ark@victoriametrics.com>
2024-06-03 10:04:13 +02:00
Nikolay
b97916276f
app/vmalert: adds idleConnTimeout flags and retry trivial network errors (#6382)
* "*.idleConnTimeout" flags must reduce probability of `write: broken
pipe` and `read: connection reset by peer` errors Those errors may occur
if remote server closes TCP socket for connection, while it's still
exist at client.
* single time retries for `write: broken pipe` and `read: connection
reset by peer` must handle a case for incorrectly configured timeouts at
middleware proxies, mitigate minor network issues.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5661

### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

---------

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2024-05-30 17:54:42 +02:00
Roman Khavronenko
b984f4672e
lib/storage: filter deleted label names and values from `/api/v1/labe… (#6342)
…ls` and `/api/v1/label/.../values`

Check for deleted metrics when `match[]` filter matches small number of
time series (optimized path).

The issue was introduced
[v1.81.0](https://docs.victoriametrics.com/changelog_2022/#v1810).

Related issue
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6300 Updates
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2978

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-05-29 14:07:44 +02:00
Alexander Marshalov
a6cc7098fe
Update base Alpine image to 3.20.0 to avoid security risks (#6370)
fixes: CVE-2023-42366, CVE-2023-42363, CVE-2024-4603, CVE-2024-2511,
CVE-2024-24788, CVE-2024-24787
2024-05-28 19:36:15 +02:00
hagen1778
1be1e9a7a4
deployment/alerts: add new alerting rules TooLongLabelValues and TooLongLabelNames to notify about truncation of label values or names respectively.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-05-24 15:09:52 +02:00
Nikolay
69d244e6fb
lib/mergeset: adds tracking for indexdb records drop (#6297)
It allows to create alert for possible item drops at indexdb. It may
happen, if ingested metric size exceeds max indexdb item size.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2024-05-24 14:55:20 +02:00
hagen1778
6272ae21eb
docs: fix changelog formatting
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-05-23 10:02:37 +02:00
Nikolay
a5d1013042
lib/storage: change default value for maxLabelValueLen to 1024 (#6313)
* It must reduce memory usage for misbehaving clients. Since
VictoriaMetrics stores sparse index inmemory.
* Reduce disk space usage for indexdb.
* Prevent possible indexDB items drops.
* It may trigger slow insert and new timeseries registration due to
default value for flag change

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6176

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-05-22 21:53:53 +02:00
hagen1778
9dd9b4442f
dashboards: use $__interval variable for offsets and look-behind windows in annotations
This should improve precision of `restarts` and `version change` annotations when
 zooming-in/zooming-out on the dashboards.

 The change also makes `restarts` dashboard visible on the panels, so user can disable it from
 displaying if needed. This could be useful when restarts overlap with version change events.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-05-22 16:32:51 +02:00
Roman Khavronenko
ac836bcf6c
lib/backup: add -s3TLSInsecureSkipVerify command-line flag (#6318)
* The new flag can be used for for skipping TLS certificates
verification when connecting to S3 endpoint. Affects vmbackup,
vmrestore, vmbackupmanager.

* replace deprecated `EndpointResolver` with `BaseEndpoint`

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1056

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-05-22 13:58:39 +02:00
Hui Wang
d7b5062917
app/vmalert: support DNS SRV record in -remoteWrite.url (#6299)
part of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6053,
supports [DNS SRV](https://en.wikipedia.org/wiki/SRV_record) address in
`-remoteWrite.url` command-line option.
2024-05-22 10:52:51 +02:00
Hui Wang
974b7783ee
vmalert: speed up reloading rules from object storage by verifying ob… (#755)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6210
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-05-21 16:34:16 +02:00
hagen1778
c746ba154d
deployment/dashboards: fix AnnotationQueryRunner error in Grafana
The error appears when executing annotations query against Prometheus backend
because the query itself hasn't specified look-behind window (which is allowed
in VictoriaMetrics query engine).

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6309
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-05-21 11:39:02 +02:00
Yury Molodov
f14497f1cd
vmui: fix URL params handling for navigation (#6284)
This PR fixes the handling of URL parameters to ensure correct browser
navigation using the back and forward buttons.

#6126

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5516#issuecomment-1867507232
2024-05-20 14:39:08 +02:00
Roman Khavronenko
7ce052b32d
lib/streamaggr: skip empty aggregators (#6307)
Prevent excessive resource usage when stream aggregation config file
contains no matchers by prevent pushing data into Aggregators object.
Before this change a lot of extra work was invoked without reason.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-05-20 14:03:28 +02:00
viperstars
3661373cc2
app/vmagent/remotewrite: skip sending empty block to downstream server (#6241)
Occasionally, vmagent sends empty blocks to downstream servers. If a
downstream server returns an unexpected response, vmagent gets stuck in
a retry loop. While vmagent handles 400 and 409 errors, there are
various prometheus remote write implementations that return different
error codes. For example, vector returns a 422 error. To mitigate the
risk of vmagent getting stuck in a retry loop, it is advisable to skip
sending empty blocks to downstream servers.

Co-authored-by: hao.peng <hao.peng@smartx.com>
Co-authored-by: Zhu Jiekun <jiekun.dev@gmail.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2024-05-17 14:55:17 +02:00
Yury Molodov
be291c36f7
vmui: remove redundant requests on the Explore Cardinality page (#6263)
Remove redundant requests on the Explore Cardinality page.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6240
2024-05-17 14:08:33 +02:00
Yury Molodov
4ad577cc6f
vmui: fix calendar display (#6255)
Fix the calendar display issue occurring with the `UTC+00:00` timezone

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6239
2024-05-17 14:06:04 +02:00
Andrii Chubatiuk
f153f54d11
app/vmagent: add global aggregator (#6268)
Add global stream aggregation for VMAgent

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5467
2024-05-17 14:00:47 +02:00
Roman Khavronenko
4f0525852f
app/vmalert/datasource: reduce number of allocations when parsing instant responses (#6272)
Allocations are reduced by implementing custom json parser via fastjson
lib.
The change also re-uses `promInstant` object in attempt to reduce number
of
allocations when parsing big responses, as usually happens with heavy
recording rules.

```
name                                old allocs/op  new allocs/op  delta
ParsePrometheusResponse/Instant-10     9.65k ± 0%     5.60k ± 0%   ~     (p=1.000 n=1+1)

```

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-05-15 15:18:33 +02:00
Nikolay
6a6e34ab8e
app/vmauth: explicitly unregister metrics set for auth config (#6252)
it's needed to remove Summary metric type from the global state of
metrics package. metrics package tracks each bucket of summary and
periodically swaps old buckets with new.

Simple set unregister is not enough to release memory used by Set

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6247
2024-05-14 09:26:50 +02:00