Commit graph

1777 commits

Author SHA1 Message Date
Aliaksandr Valialkin
cbd8a62ba4
docs/CHANGELOG.md: typo fix: use proper backtick closing quote instead of single quote 2023-12-17 19:28:22 +02:00
Roman Khavronenko
bcb2b8247c
vmctl: rename vm-native-disable-retries to vm-native-disable-per-metric-migration (#5476)
The change supposed to better reflect the meaning of this flag.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-17 19:19:22 +02:00
Aliaksandr Valialkin
76b120e355
lib/protoparser/opentelemetry: allow ingesting metrics without resource labels
Some clients may ingest samples via OpenTelemetry protocol without Resource labels.
Previously VictoriaMetrics was silently dropping such samples.

The commit 317834f876 added vm_protoparser_rows_dropped_total{type="opentelemetry",reason="resource_not_set"}
counter for tracking of such dropped samples. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5459

It is better from usability PoV to accept such samples instead of dropping them and incrementing the corresponding counter.
2023-12-17 19:16:43 +02:00
Aliaksandr Valialkin
bed19a0b20
docs/CHANGELOG.md: move the description of the bugfix from 9253c24dd6 into correct place
The description of the bugfix was incorrectly placed in already released v1.96.0,
while it should be placed in tip.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5450
2023-12-17 19:16:42 +02:00
Roman Khavronenko
5f14fa94dd
vmctl: retry requests that failed in the very end for vm-native (#5475)
Before, retries happened only on writes into a network connection
between source and destination. But errors returned by server after
all the data was transmitted were logged, but not retried.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 664fa5cb78)
2023-12-15 11:54:08 +01:00
Hui Wang
ed4f77575f
vmalert: validate schema for -external.url (#5450)
Requests with wrong or no schema in  `-external.url` could be rejected by alertmanager.
So we validate schema on start up.

(cherry picked from commit 9253c24dd6)
2023-12-15 11:54:07 +01:00
Aliaksandr Valialkin
42629dead1
app/vmstorage: addd missing -inmemoryDataFlushInterval command-line flag
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337
2023-12-14 20:47:49 +02:00
Aliaksandr Valialkin
8bc87acfb9
docs/CHANGELOG.md: document the bugfix at 66c76a4d4d
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5414
2023-12-14 12:49:03 +02:00
Aliaksandr Valialkin
51acf0179c
app/vmauth: add ability to route requests to different backends depending on the request host 2023-12-14 00:47:00 +02:00
Aliaksandr Valialkin
d59ee93d04
docs/CHANGELOG.md: cut v1.96.0 release 2023-12-13 00:43:04 +02:00
Yury Molodov
e76c44c5b4
vmui: autocomplete usability improvements (#5422)
* vmui: add show quick tip for autocomplete

* vmui: auto-completion usability improvements #5348

* vmui: add const for min symbols in autocomplete

* Use proper queries to VictoriaMetrics

* vmui: fix comments for autocomplete

* app/vmselect: run `make vmui-update`

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-12-13 00:33:27 +02:00
Aliaksandr Valialkin
e4bb2808f1
app/vmselect: add support for vmstorage groups with independent -replicationFactor per group
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5197

See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#vmstorage-groups-at-vmselect

Thanks to @zekker6 for the initial pull request at https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/pull/718
2023-12-13 00:14:34 +02:00
hagen1778
3b841fe9ce
app/vmctl: follow-up after 6af732b6f7
Make docs more clear about new feature.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 242472086b)
2023-12-12 13:45:35 +01:00
Dmytro Kozlov
63fc200f16
app/vmctl: enable range steps in reverse order (#5444)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5376

(cherry picked from commit 6af732b6f7)
2023-12-12 13:45:35 +01:00
hagen1778
c2ca6cb84d
docs: fix formatting after a list in CHANGELOG.md
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 1e02efd511)
2023-12-12 13:45:34 +01:00
hagen1778
307bcb8d4d
app/vmctl: follow-up after 27668c9d01
* remove duplications in error messages
* mention the change in CHANGELOG.md

27668c9d01
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 39c405ed4d)
2023-12-11 15:38:22 +01:00
hagen1778
73f843a70e
docs: mention alerts change in CHANGELOG.md
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit e13dc04fbf)
2023-12-11 15:38:18 +01:00
Aliaksandr Valialkin
050e15d85b
docs/CHANGELOG.md: document v1.93.9 LTS release 2023-12-11 10:39:38 +02:00
Aliaksandr Valialkin
09fe1255ca
docs/CHANGELOG.md: document v1.87.12 2023-12-10 14:30:30 +02:00
Aliaksandr Valialkin
842aba3f46
deployment/docker: update base Docker image from alpine:3.18.5 to alpine:3.19.0
See https://www.alpinelinux.org/posts/Alpine-3.19.0-released.html
2023-12-10 02:28:31 +02:00
Aliaksandr Valialkin
3d6517b05e
app/vmselect: add -search.maxResponseSeries command-line flag for limiting the number of time series a single response can return
This limit can be used for preventing from high memory usage at Grafana when the response returns too many series.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5372
2023-12-10 00:54:32 +02:00
Aliaksandr Valialkin
6203c1a745
docs: follow-up after 49552eaa15
Link to the related issue - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4792
Fix heading for `Modifying HTTP headers` chapter at docs/vmagent.md
2023-12-08 23:56:40 +02:00
Aliaksandr Valialkin
49552eaa15
app/vmauth: add support for hot standby mode via first_available load balancing policy
vmauth in `hot standby` mode sends requests to the first url_prefix while it is available.
If the first url_prefix becomes unavailable, then vmauth falls back to the next url_prefix.
This allows building highly available setup as described at https://docs.victoriametrics.com/vmauth.html#high-availability

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4893
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4792
2023-12-08 23:32:10 +02:00
Roman Khavronenko
276e9301f4
app/vmalert: sanitize label names before sending to Alertmanager (#5442)
Before, vmalert would send notifications with labels containing characters
  not supported by Alertmanager validator, resulting into validation errors
  like `msg="Failed to validate alerts" err="invalid label set: invalid name "foo.bar"`

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-08 18:09:07 +02:00
Alexander Marshalov
e9cf39f519
added field version to the response for /api/v1/status/buildinfo API for using more efficient API in Grafana for receiving label values, added additional info about setup Grafana datasource (#5370) (#5437) 2023-12-07 16:41:56 +02:00
Aliaksandr Valialkin
9f79342e6a
app/vmselect/prometheus: properly encode Prometheus label values at /federate endpoint
Prometheus spec says that only \, \n and " must be escaped inside label values.
See 995743836e/content/docs/instrumenting/exposition_formats.md (L90)

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5431
2023-12-07 15:36:50 +02:00
Aliaksandr Valialkin
896a0f32cd
lib/promscrape: show -promscrape.cluster.memberNum values for vmagent instances, which scrape the given dropped target at /service-discovery page
The /service-discovery page contains the list of all the discovered targets
after the commit 487f6380d0 on all the vmagent instances
in cluster mode ( https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets ).

This commit improves debuggability of targets in cluster mode by providing a list of -promscrape.cluster.memberNum
values per each target at /service-discovery page, which has been dropped becasue of sharding,
e.g. if this target is scraped by other vmagent instances in the cluster.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5389
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4018
2023-12-07 00:11:30 +02:00
Aliaksandr Valialkin
12e94f10cc
deployment/docker: update Go builder from Go1.21.4 to Go1.21.5
See https://github.com/golang/go/issues?q=milestone%3AGo1.21.5+label%3ACherryPickApproved
2023-12-06 22:33:27 +02:00
Dmytro Kozlov
6a41e1ec0c
app/vmalert: replace error metrics for gauges with counter metrics (#5217)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5160

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 935bec447b)
2023-12-06 19:41:34 +01:00
Aliaksandr Valialkin
8b6bce61e4
lib/promscrape: follow-up for 97373b7786
Substitute O(N^2) algorithm for exposing the `vm_promscrape_scrape_pool_targets` metric
with O(N) algorithm, where N is the number of scrape jobs. The previous algorithm could slow down
/metrics exposition significantly when -promscrape.config contains thousands of scrape jobs.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5311
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5335
2023-12-06 17:36:48 +02:00
Aliaksandr Valialkin
509339bf63
app/vmselect: properly adjust the lower bound for the time range where raw samples must be selected for default_rollup() function
Previously the lower bound could be too small, which could result in missing values at the beginning of the graph
for default_rollup() function. This function is automatically applied to all the series selectors if they aren't
explicitly wrapped into a rollup function - see https://docs.victoriametrics.com/MetricsQL.html#implicit-query-conversions

While at it, properly take into account `-search.minStalenessInterval` command-line flag when adjusting
the lower bound for the selected time range.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5388
2023-12-06 14:46:18 +02:00
Hui Wang
065f5a7f9e
vmagent: add vm_promscrape_scrape_pool_targets for scrape jobs like… (#5335)
* vmagent: export `vm_promscrape_scrape_pool_targets` metric to track the number of targets that each scrape_job discovers

* add extra panel for new metric
2023-12-06 14:46:02 +02:00
Aliaksandr Valialkin
61db92cdc7
Revert "lib/protoparser/datadog: follow-up after 543f218fe96574b9b2189c8350bb09afa349e3bb"
This reverts commit 73d18fbc7a.

Reason for revert: https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5094#issuecomment-1839789080
2023-12-05 02:29:00 +02:00
Aliaksandr Valialkin
bf187b2dc9
app/vmagent: add -enableMultitenantHandlers command-line flag
This flag allows converting tenant id to (vm_account_id, vm_project_id) labels.
this flag deprecates `-remoteWrite.multitenantURL` command-line flag,
because `-enableMultitenantHandlers` is easier to use and combine with multitenant url
at vminsert - https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy-via-labels

See https://docs.victoriametrics.com/vmagent.html#multitenancy

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1505
2023-12-05 01:35:59 +02:00
Dmytro Kozlov
6770bad207
app/vmalert: expose /vmalert/api/v1/rule and /api/v1/rule API which returns rule status in JSON format (#5397)
* app/vmalert: expose `/vmalert/api/v1/rule` and `/api/v1/rule` API which returns rule status in JSON format

* app/vmalert: hide updates if query param not set

* app/vmalert: fix panic (recursion call)

* app/vmalert: add needed group name and file name

* app/vmalert: fix comment, update behavior

* app/vmalert: fix description

* app/vmalert: simplify API for /api/v1/rule

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* app/vmalert: simplify API for /api/v1/rule

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* app/vmalert: simplify API for /api/v1/rule

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* app/vmalert: simplify API for /api/v1/rule

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* app/vmalert: simplify API for /api/v1/rule

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2023-12-04 22:49:39 +02:00
Aliaksandr Valialkin
a3d0bbfcda
deployment/docker: update backe Docker image from alpine 3.18.4 to 3.18.5
See https://www.alpinelinux.org/posts/Alpine-3.15.11-3.16.8-3.17.6-3.18.5-released.html
2023-12-04 18:17:07 +02:00
Aliaksandr Valialkin
d868155751
app/vmselect: do not limit concurrency for static and fast queries
Previously concurrency for static and fast queries was limited with the -search.maxConcurrentRequests
command-line flag. This could complicate identifying heavy queries via `vmui` at `Top queries` and `Active queries` pages,
since `vmui` and these pages couldn't be opened on overloaded vmselect.

Thanks to @f41gh7 for the idea.
2023-12-04 18:14:29 +02:00
Aliaksandr Valialkin
b6d6a3a530
lib/promscrape: show dropped targets because of sharding at /service-discovery page
Previously the /service-discovery page didn't show targets dropped because of sharding
( https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets ).

Show also the reason why every target is dropped at /service-discovery page.
This should improve debuging why particular targets are dropped.

While at it, do not remove dropped targets from the list at /service-discovery page
until the total number of targets exceeds the limit passed to -promscrape.maxDroppedTargets .
Previously the list was cleaned up every 10 minutes from the entries, which weren't updated
for the last minute. This could complicate debugging of dropped targets.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5389
2023-12-04 17:42:46 +02:00
Zakhar Bessarab
2992682f6c
lib/backup/s3remote: remove prev object versions for recursive delete (#719)
* lib/backup/s3remote: remove prev object versions for recursive delete

- fix error caused by sending empty objects list to be deleted. This was possible in case old versions of objects where deleted, but root-level entries where still available. This caused paginator to return an empty page which wasn't skipped.

- delete previous versions of objects recursively for S3 remote

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* docs/changelog: add vmbackupmanager fix entry

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/backup/s3remote: unify path construction for S3 objects

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-12-04 17:01:09 +02:00
Aliaksandr Valialkin
9f352f1b93
app/vminsert/newrelic: simplify the code a bit after 1fb8dc0092
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5416
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5421
2023-12-04 16:26:52 +02:00
Dmytro Kozlov
1fb8dc0092
app/vminsert: fix newrelic ingestion in cluster version (#5421)
Properly pass tenant ID to ingested data from newrelic.
Before tenant ID was mistakenly skipped.
2023-12-04 09:38:32 +01:00
Hui Wang
3507e1e27b
vmalert-tool: fix alert_rule_test case when eval_time is not multiple of evaluation_interval (#5387)
Co-authored-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 1911320c86)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-01 14:00:58 +01:00
Aliaksandr Valialkin
d1445bc0c8
all: expose additional metrics for simplifying debugging of VictoriaMetrics components
Updates https://github.com/VictoriaMetrics/metrics/issues/54

(cherry picked from commit 8eddccfbb4)
2023-12-01 14:00:28 +01:00
Aliaksandr Valialkin
f0215afee3
lib/promrelabel: add keep_if_contains and drop_if_contains relabeling actions
(cherry picked from commit ac65c6b178)
2023-12-01 14:00:20 +01:00
Nikolay
9505d48070
lib/streamaggr: properly reference slice with labels (#5406)
* lib/streamaggr: properly reference slice with labels
by limiting slice capacity. It must fix issues with slice modification, in case of append new slice will be allocated, instead of modifying refrenced slice
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5402

* Reduce memory allocations when output_relabel_configs adds new labels to output samples

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
(cherry picked from commit 41f7940f97)
2023-12-01 14:00:18 +01:00
hagen1778
73d18fbc7a
lib/protoparser/datadog: follow-up after 543f218fe9
* prevent /api/v1 from panic on parsing rows
* add tests for Extract function for v1 and v2 api's
* separate request types in different pools to prevent different objects mixing
* add changelog line

543f218fe9
Signed-off-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 98d0f81f21)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-01 13:56:23 +01:00
hagen1778
1e557b73a5
docs: mention contributor of PR 5368
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 5424632ba3)
2023-11-28 12:49:49 +01:00
luckyxiaoqiang
8ce82c5400
app/vmselect/promql: add day_of_year() function (#5368)
Co-authored-by: dingxiaoqiang <dingxiaoqiang@bytedance.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
(cherry picked from commit d7897e0d70)
2023-11-28 12:49:48 +01:00
Aliaksandr Valialkin
2f14394335
app/vmagent: follow-up for 090cb2c9de
- Add Try* prefix to functions, which return bool result in order to improve readability and reduce the probability of missing check
  for the result returned from these functions.

- Call the adjustSampleValues() only once on input samples. Previously it was called on every attempt to flush data to peristent queue.

- Properly restore the initial state of WriteRequest passed to tryPushWriteRequest() before returning from this function
  after unsuccessful push to persistent queue. Previously a part of WriteRequest samples may be lost in such case.

- Add -remoteWrite.dropSamplesOnOverload command-line flag, which can be used for dropping incoming samples instead
  of returning 429 Too Many Requests error to the client when -remoteWrite.disableOnDiskQueue is set and the remote storage
  cannot keep up with the data ingestion rate.

- Add vmagent_remotewrite_samples_dropped_total metric, which counts the number of dropped samples.

- Add vmagent_remotewrite_push_failures_total metric, which counts the number of unsuccessful attempts to push
  data to persistent queue when -remoteWrite.disableOnDiskQueue is set.

- Remove vmagent_remotewrite_aggregation_metrics_dropped_total and vm_promscrape_push_samples_dropped_total metrics,
  because they are replaced with vmagent_remotewrite_samples_dropped_total metric.

- Update 'Disabling on-disk persistence' docs at docs/vmagent.md

- Update stale comments in the code

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5088
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110
2023-11-25 12:13:39 +02:00
Nikolay
25ac2aac31
app/vmagent: allow to disabled on-disk persistence (#5088)
* app/vmagent: allow to disabled on-disk queue
Previously, it wasn't possible to build data processing pipeline with a
chain of vmagents. In case when remoteWrite for the last vmagent in the
chain wasn't accessible, it persisted data only when it has enough disk
capacity. If disk queue is full, it started to silently drop ingested
metrics.

New flags allows to disable on-disk persistent and immediatly return an
error if remoteWrite is not accessible anymore. It blocks any writes and
notify client, that data ingestion isn't possible.

Main use case for this feature - use external queue such as kafka for
data persistence.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110

* adds test, updates readme

* apply review suggestions

* update docs for vmagent

* makes linter happy

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-11-25 12:12:29 +02:00