Commit graph

3032 commits

Author SHA1 Message Date
Alexander Marshalov
3c315a1c23
vmalert: support any status code from the range 200-299 from alertmanager as successful (#6111)
* any status code from the range 200-299 from alertmanager to vmalert is not considered an error from now on (#6110)

* add changelog
2024-04-17 09:42:53 +02:00
Aliaksandr Valialkin
26725e9a5a
deployment: update Go builder from 1.22.1 to 1.22.2
See https://github.com/golang/go/issues?q=milestone%3AGo1.22.2+label%3ACherryPickApproved
2024-04-04 02:31:43 +03:00
hagen1778
ebef0809d1
docs: follow-up after 623d257faf
623d257faf
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 2e843a8ed9)
2024-03-29 14:56:58 +01:00
Jiekun
eb4b59613b
app/vmalert: respect batch size limit for remote write on shutdown (#6039)
During shutdown period of vmalert, remotewrite client retrieve all pending time series from buffer queue, compose them into 1 batch and execute remote write.

This final batch may exceed the limit of -remoteWrite.maxBatchSize, and be rejected by the receiver (gateway, vmcluster or others).

This changes ensures that even during shutdown vmalert won't exceed the max batch size limit for remote write
destination.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6025

(cherry picked from commit 623d257faf)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-03-29 14:56:58 +01:00
Hui Wang
617653309d
vmalert: fix sending alert messages (#6028)
* vmalert: fix sending alert messages
1. fix `endsAt` field in messages that send to alertmanager, previously rule with small interval could never be triggered;
2. fix behavior of `-rule.resendDelay`, before it could prevent sending firing message when rule state is volatile.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit d7224b2d1c)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-03-28 15:09:28 +01:00
Aliaksandr Valialkin
1568445944
deployment/docker: update Go builder from Go1.21.7 to Go1.22.1
See https://github.com/golang/go/issues?q=milestone%3AGo1.22.1+label%3ACherryPickApproved
2024-03-06 21:08:11 +02:00
Aliaksandr Valialkin
2c47204378
docs: consistently use https://docs.victoriametrics.com/lts-releases/ link for LTS releases 2024-03-01 02:57:48 +02:00
Aliaksandr Valialkin
9ab4d0fd13
docs/CHANGELOG.md: cut v1.93.13 LTS release 2024-03-01 01:59:16 +02:00
Aliaksandr Valialkin
38dc127275
app/{vmagent,vminsert}: follow-up for 434a5803e7
Document the /opentelemetry/v1/metrics endpoint instead of /opentelemetry/api/v1/push,
since the /v1/metrics suffix is hardcoded in OpenTelemetry protocol specification.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5871
2024-02-29 18:10:40 +02:00
Nikolay
395c076b9f
app/{vmagent,vminsert}: adds /v1/metrics suffix for opentelemetry route path (#5871)
* app/{vmagent,vminsert}: adds /v1/metrics suffix for opentelemetry route path
it must fix compatibility with opentemetry-collector [spec](https://opentelemetry.io/docs/specs/otlp/\#otlphttp-request)
this suffix is hard-coded and cannot be changed with collector configuration

* Apply suggestions from code review

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-02-29 18:09:54 +02:00
Hui Wang
26d5054faf
metricsql: fix label_join() when dst_label is equal to one of the `… (#5886)
* metricsql: fix label_join() when `dst_label` is equal to one of the `src_label`

* Update app/vmselect/promql/transform.go

* Update docs/CHANGELOG.md

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-02-29 16:04:41 +02:00
Aliaksandr Valialkin
c432aba12f
deployment/docker: downgrade Go builder from 1.22.0 to 1.21.7
Go1.22.0 has the bug https://github.com/golang/go/issues/65705 ,
which prevents vmagent from normal operation.
2024-02-29 13:50:18 +02:00
Aliaksandr Valialkin
4530db76fd
docs/CHANGELOG.md: document d68bb658ce
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5833
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5834
2024-02-23 01:27:41 +02:00
Alexander Marshalov
4b2ba16e27
[lib/httputils] fixed floating-point error when parsing time in RFC3339 format (#5814)
* [lib/promutils, lib/httputils] fixed floating-point error when parsing time in RFC3339 format (#5801)

* fixed tests

* fixed test

* Revert "fixed test"

This reverts commit 8a29764806.

* Revert "fixed tests"

This reverts commit 9ce13d1042.

* Revert "[lib/promutils, lib/httputils] fixed floating-point error when parsing time in RFC3339 format (#5801)"

This reverts commit a7a04bd4

* [lib/httputils] fixed floating-point error when parsing time in RFC3339 format (#5801)

---------

Co-authored-by: Nikolay <nik@victoriametrics.com>
2024-02-23 00:56:45 +02:00
hagen1778
76a82cd4e5
app/vmalert: consistently sort groups by name and filename on /groups page
This should prevent non-deterministic sorting for groups with identical names.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-02-23 00:32:13 +02:00
Aliaksandr Valialkin
11e0bc88a2
docs/CHANGELOG.md: document f8207e33a2 2024-02-17 17:59:11 +02:00
Aliaksandr Valialkin
1ef66d8562
docs/CHANGELOG.md: cut v1.93.12 LTS release 2024-02-14 18:57:57 +02:00
Aliaksandr Valialkin
f2ba311c80
all: upgrade Go builder from Go1.21.7 to Go1.22.0
See https://go.dev/doc/go1.22
2024-02-12 22:46:51 +02:00
Aliaksandr Valialkin
a2a218e284
app/victoria-metrics: properly send staleness markers on victoriametrics shutdown if -selfScrapeInterval > 0
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/943
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1526
2024-02-08 15:51:50 +02:00
Aliaksandr Valialkin
b15efdc743
docs/CHANGELOG.md: clarify the bugfix description for e1bf8440eb 2024-02-08 13:02:55 +02:00
Aliaksandr Valialkin
054bff1f39
lib/mergeset: prevent from possible too big indexBlockSize panic
This panic could occur when samples with too long label values are ingested into VictoriaMetrics.
This could result in too long fistItem and commonPrefix values at blockHeader (up to 64kb each).
This may inflate the maximum index block size by 4 * maxIndexBlockSize.
2024-02-08 12:59:16 +02:00
Aliaksandr Valialkin
726751a311
app/vmselect/promql: properly handle precision errors in rollup functions
changes(), increases_over_time() and resets() shouldn't take into account
value changes, which may occur because of precision errors.

The maximum guaranteed precision for raw samples stored in VictoriaMetrics is 12 decimal digits.
So do not count relative changes for values if they are smaller than 1e-12 comparing to the value.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/767
2024-02-08 02:34:40 +02:00
Aliaksandr Valialkin
d5b9b56ae4
all: update Go builder from Go1.21.6 to Go1.21.7
See https://github.com/golang/go/issues?q=milestone%3AGo1.21.7+label%3ACherryPickApproved
2024-02-07 04:03:28 +02:00
Aliaksandr Valialkin
ae1b09e953
vendor: update github.com/VictoriaMetrics/metricsql from v0.70.0 to v0.70.1
This should help with the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5604
2024-02-06 22:16:00 +02:00
Aliaksandr Valialkin
891a9f9511
docs/CHANGELOG.md: cut v1.93.11 LTS release 2024-02-01 17:15:08 +02:00
Aliaksandr Valialkin
f3e358491c
deployment/docker: upgrade base Docker image from Alpine 3.19.0 to 3.19.1
See https://www.alpinelinux.org/posts/Alpine-3.19.1-released.html
2024-01-30 22:48:29 +02:00
Roman Khavronenko
71ade3731d
lib/promscrape: respect 0 value for series_limit param (#5663)
* lib/promscrape: respect `0` value for `series_limit` param

Respect `0` value for `series_limit` param in `scrape_config`
even if global limit was set via `-promscrape.seriesLimitPerTarget`.
Previously, `0` value will be ignored in favor of `-promscrape.seriesLimitPerTarget`.

This behavior aligns with possibility to override `series_limit` value via
relabeling with `__series_limit__` label.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Update docs/CHANGELOG.md

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-01-23 13:17:20 +02:00
Aliaksandr Valialkin
2356b02687
app/vmselect: handle negative time range start in a generic manner inside NewSearchQuery()
This is a follow-up for cf03e11d89

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5553
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5630
2024-01-22 01:42:27 +02:00
Hui Wang
52d0e776ed
lib/promscrape/discovery/kubernetes: fix watcher start order for roles endpoints and endpointslice (#5557)
* lib/promscrape/discovery/kubernetes: fix watcher start order for roles endpoints and endpointslice

Previously the groupWatcher could be mistakenly stopped when requests for pod or services resources take too long.

* remove mislead comment

* docs/sd_configs.md: mention -promscrape.kubernetes.attachNodeMetadataAll flag in the description for attach_metadata section

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4640

* wip

* lib/promscrape/kubernetes: prevent from stopping groupWatcher when there are in-flight apiWatcher.mustStart() calls

groupWatcher is stopped if it has zero registered apiWatchers during 14 seconds.
But such a groupWatcher can be still in use if apiWatcher for `role: endpoints` or `role: endpointslice`
is being registered and the discovery of the associated `pod` and/or `service` objects takes longer
than 14 seconds - see the beginning of groupWatcher.startWatchersForRole() function for details.

Track the number of in-flight calls to apiWatcher.mustStart() and prevent from stopping the associated groupWatcher
if the number of in-flight calls is non-zero.

P.S. postponing the discovery of `pod` and/or `service` objects associated with `endpoints` or `endpointslice` roles
isn't the best solution, since it slows down initial discovery of `endpoints` and `endpointslice` targets.

* typo fix

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-01-22 01:35:51 +02:00
Hui Wang
979bec75ff
app/vmselect/promql: properly handle possible negative results caused… (#5608)
* app/vmselect/promql: properly handle possible negative results caused by float operations precision error in rollup functions like rate() or increase()

* fix test
2024-01-22 01:05:46 +02:00
Nikolay
d4848f6834
app/vmselect/netstorage (#5649)
* app/vmselect/netstorage

correctly handle errGlobal set

* wip

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5649

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-01-22 01:03:35 +02:00
Roman Khavronenko
7734cbbed0
app/vmselect: properly calculate start param for queries with too big look-behind window (#5630)
Properly determine time range search for instant queries with too big look-behind window like `foo[100y]`.
 Previously, such queries could return empty responses even if `foo` is present in database.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5553

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-21 23:50:03 +02:00
Aliaksandr Valialkin
79d22739f2
docs/CHANGELOG.md: typo fix in release date for v1.93.10 2024-01-17 01:46:27 +02:00
Aliaksandr Valialkin
48d097c6b2
docs/CHANGELOG.md: cut v1.97.10 LTS release 2024-01-17 01:12:44 +02:00
Aliaksandr Valialkin
992d239745
deployment/docker: update Go builder from Go1.21.5 to Go1.21.6 2024-01-17 00:09:56 +02:00
Aliaksandr Valialkin
3023b68d42
lib/pushmetrics: wait until the background goroutines, which push metrics, are stopped at pushmetrics.Stop()
Previously the was a race condition when the background goroutine still could try collecting metrics
from already stopped resources after returning from pushmetrics.Stop().
Now the pushmetrics.Stop() waits until the background goroutine is stopped before returning.

This is a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5549
and the commit fe2d9f6646 .

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5548
2024-01-16 23:17:58 +02:00
hagen1778
f23d23bc82
app/all: follow-up after 84d710beab
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5548
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-16 23:09:27 +02:00
Aliaksandr Valialkin
d41218613b
lib/storage: follow-up for 4b8088e377
- Clarify the bugfix description at docs/CHANGELOG.md
- Simplify the code by accessing prefetchedMetricIDs struct under the lock
  instead of using lockless access to immutable struct.
  This shouldn't worsen code scalability too much on busy systems with many CPU cores,
  since the code executed under the lock is quite small and fast.
  This allows removing cloning of prefetchedMetricIDs struct every time
  new metric names are pre-fetched. This should reduce load on Go GC,
  since the cloning of uin64set.Set struct allocates many new objects.
2024-01-16 22:43:13 +02:00
Roman Khavronenko
29908eaa6c
lib/storage: properly check for storage/prefetchedMetricIDs cache expiration deadline (#5607)
Before, this cache was limited only by size.
Cache invalidation by time happens with jitter to prevent thundering herd problem.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-16 22:39:56 +02:00
Aliaksandr Valialkin
742b1b5443
app/vmselect/promql: follow-up for ce4f26db02
- Document the bugfix at docs/CHANGELOG.md
- Filter out NaN values before sorting as suggested at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5509#discussion_r1447369218
- Revert unrelated changes in lib/filestream and lib/fs
- Use simpler test at app/vmselect/promql/exec_test.go

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5509
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5506
2024-01-16 22:19:16 +02:00
Nikolay
e26a10360d
lib/awsapi: properly assume role with webIdentity token (#5495)
* lib/awsapi: properly assume role with webIdentity token
introduce new irsaRoleArn param for config. It's only needed for authorization with webIdentity token.
First credentials obtained with irsa role and the next sts assume call for an actual roleArn made with those credentials.
Common use case for it - cross AWS accounts authorization
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3822

* wip

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-12-20 19:13:47 +02:00
Roman Khavronenko
35b0db270c
vmctl: retry requests that failed in the very end for vm-native (#5475)
Before, retries happened only on writes into a network connection
between source and destination. But errors returned by server after
all the data was transmitted were logged, but not retried.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-17 19:27:02 +02:00
Aliaksandr Valialkin
bb0771754d
lib/protoparser/opentelemetry: allow ingesting metrics without resource labels
Some clients may ingest samples via OpenTelemetry protocol without Resource labels.
Previously VictoriaMetrics was silently dropping such samples.

The commit 317834f876 added vm_protoparser_rows_dropped_total{type="opentelemetry",reason="resource_not_set"}
counter for tracking of such dropped samples. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5459

It is better from usability PoV to accept such samples instead of dropping them and incrementing the corresponding counter.
2023-12-17 19:24:30 +02:00
Hui Wang
2ba7190078
vmalert: validate schema for -external.url (#5450)
Requests with wrong or no schema in  `-external.url` could be rejected by alertmanager.
So we validate schema on start up.
2023-12-17 19:21:33 +02:00
Aliaksandr Valialkin
b210da0fcd
app/vmstorage: addd missing -inmemoryDataFlushInterval command-line flag
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337
2023-12-14 21:08:20 +02:00
Aliaksandr Valialkin
6fce57713b
docs/CHANGELOG.md: document the bugfix at 66c76a4d4d
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5414
2023-12-14 12:50:35 +02:00
hagen1778
0ae61f07eb
app/vmctl: follow-up after 27668c9d01
* remove duplications in error messages
* mention the change in CHANGELOG.md

27668c9d01
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 39c405ed4d)
2023-12-12 23:52:56 +02:00
Aliaksandr Valialkin
0773f3ff65
docs/CHANGELOG.md: cut v1.93.9 release 2023-12-10 14:40:29 +02:00
Aliaksandr Valialkin
d96ceb6a2d
deployment/docker: update base Docker image from alpine:3.18.5 to alpine:3.19.0
See https://www.alpinelinux.org/posts/Alpine-3.19.0-released.html
2023-12-10 02:29:08 +02:00
Roman Khavronenko
f0215454b4
app/vmalert: sanitize label names before sending to Alertmanager (#5442)
Before, vmalert would send notifications with labels containing characters
  not supported by Alertmanager validator, resulting into validation errors
  like `msg="Failed to validate alerts" err="invalid label set: invalid name "foo.bar"`

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-08 18:09:57 +02:00