Commit graph

2638 commits

Author SHA1 Message Date
Roman Khavronenko
5f9ad22884
vmalert: update retry policy for pushing data to -remoteWrite.url (#4504)
By default, vmalert will make multiple retry attempts with exponential delay.
The total time spent during retry attempts shouldn't exceed `-remoteWrite.retryMaxTime` (default is 30s).
When retry time is exceeded vmalert drops the data dedicated for `-remoteWrite.url`.
Before, vmalert dropped data after 5 retry attempts with 1s delay between attempts (not configurable).

See `-remoteWrite.retryMinInterval` and `-remoteWrite.retryMaxTime` cmd-line flags.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
2023-06-22 15:14:23 +02:00
Roman Khavronenko
4aad7a43df
vmalert: properly interrupt remotewrite retries on shutdown (#4505)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-06-22 15:07:32 +02:00
Aliaksandr Valialkin
83aa78dfb4
app/vlstorage: export vl_active_merges and vl_merges_total metrics 2023-06-21 20:58:57 -07:00
Aliaksandr Valialkin
57541d5cea
Revert "app/vlselect/logsql: use buffered writer in order to save syscalls when sending big amounts of data to clients"
This reverts commit c19048dc13.

Reason for revert: it has been appeared that the net/http.ResponseWriter is already buffered,
so there in no need in double bufferring
2023-06-21 20:40:01 -07:00
Aliaksandr Valialkin
c19048dc13
app/vlselect/logsql: use buffered writer in order to save syscalls when sending big amounts of data to clients 2023-06-21 20:25:32 -07:00
Aliaksandr Valialkin
3ded68d4b8
app/vmui/Makefile: consistently use tabs instead of spaces in multi-line Makefile rules 2023-06-21 19:57:48 -07:00
Aliaksandr Valialkin
8be52ef217
app/vlselect: handle vmui at /select/vmui path instead of /vmui
This simplifies routing at auth proxies such as vmauth to vlselect component,
which serves VMUI - just route all the requests, which start with /select/, to vlselect.
2023-06-21 19:52:50 -07:00
Aliaksandr Valialkin
dde9ceed07
app/vlinsert/jsonline: code prettifying 2023-06-21 19:39:22 -07:00
Aliaksandr Valialkin
83a1249299
app/vlselect/logsql: properly handle the error from ParseLogMessage 2023-06-21 10:28:47 -07:00
Dmytro Kozlov
81c1124a0f
app/victoria-logs: remove header control (#4493) 2023-06-21 18:35:42 +02:00
Alexander Marshalov
944793c0f7
removed debug message from jsonlines handler of victorialogs (#4492)
Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-06-21 09:34:12 -07:00
dmitryk-dk
58c84ad90d app/victoria-logs: add vmui dependecies 2023-06-21 09:08:03 -07:00
Yury Molodov
4a7b17ed76
vmui: logs explorer (#4484)
* feat: add a logs page

* app/vixtoria-logs: add handlers for vmui

* feat: add group logs

* feat: add logs build

* app/vixtoria-logs: update make file

* app/vixtoria-logs: cleanup make

* app/vixtoria-logs: fix description

* fix: correct url for logs

* fix: save display view in query params

* fix: change logo for logs build

* app/vixtoria-logs: remove dashboards from vlselect

* app/vixtoria-logs: enable user

---------

Co-authored-by: dmitryk-dk <kozlovdmitriyy@gmail.com>
2023-06-21 16:57:09 +02:00
Alexander Marshalov
bf081d157e
jsonline support for data ingestion in vlinsert (#4487)
added json lines / json stream format for ingestion to vlinsert
2023-06-21 15:31:28 +02:00
Aliaksandr Valialkin
7346bb4f03
app/vlselect/logsql: sort query results by _time if their summary size doesnt exceed -select.maxSortBufferSize 2023-06-21 01:11:25 -07:00
Roman Khavronenko
94f516df43
docs/vmalert: specify version requirements for new features (#4480)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-06-21 09:44:00 +02:00
Aliaksandr Valialkin
332b295268
docs/VictoriaLogs: change the structure of the docs in order to be more maintainable
The change is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4477
2023-06-20 22:23:39 -07:00
Aliaksandr Valialkin
3edc548584
app/vlinsert/elasticsearch: allow empty lines in Elasticsearch bulk protocol
Empty lines may appear there during debugging and custom client implementation
2023-06-20 21:23:38 -07:00
Aliaksandr Valialkin
07bd118b1a
app/vlinsert/elasticsearch: optimize parsing command line
Just search for "create" or "index" substrings there instead of spending CPU time on its parsing
2023-06-20 21:18:12 -07:00
Aliaksandr Valialkin
e7bf36f0b6
app/vlstorage: log -storageDataPath and basic stats for the opened storage 2023-06-20 20:47:53 -07:00
Aliaksandr Valialkin
00c3dbd15d
app/victoria-logs: add ability to debug data ingestion by passing debug query arg to data ingestion API 2023-06-20 20:02:46 -07:00
Roman Khavronenko
79a5499cb2
vmalert: retry all errors except 4XX status codes (#4461)
vmalert: retry all errors except 4XX status codes

Retry all errors except 4XX status codes while pushing via remote-write
to the remote storage. Previously, errors like broken connection could
prevent vmalert from retrying the request.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-06-20 13:24:45 +02:00
Yury Molodov
66b42a6772
vmui: memory leak fix (#4455)
* fix: optimize the preparation of data for the graph

* fix: optimize tooltip rendering

* fix: optimize re-rendering of the chart

* vmui: memory leak fix
2023-06-20 11:29:24 +02:00
Aliaksandr Valialkin
a638f5e2bf
app/vmctl/utils: properly use timezone in TestGetTime() 2023-06-19 23:16:50 -07:00
Aliaksandr Valialkin
87b66db47d
app/victoria-logs: initial code release 2023-06-19 22:55:12 -07:00
Aliaksandr Valialkin
78eaa056c0
app/vmselect: move common http functionality from app/vmselect/searchutils to lib/httputils
While at it, move app/vmselect/bufferedwriter to lib/bufferedwriter, since it is going to be used in VictoriaLogs
2023-06-19 22:34:20 -07:00
Dmytro Kozlov
7a92263459
vmctl: increase retry backoff policy delay (#4447)
vmctl: update backoff policy on retries to reduce probability of overloading for `source` or `destination` databases
2023-06-14 09:47:44 +02:00
Roman Khavronenko
28b23e7a4c
docs/vmalert: mention same labelset error in docs (#4443)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-06-13 17:03:53 +02:00
Dmytro Kozlov
ddb3ae0f00
vmctl: finish retries if context canceled (#4442)
vmctl: interrupt backoff retries if import context is cancelled

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2023-06-13 13:54:24 +02:00
Roman Khavronenko
c4be49e21b
docs: mention stream aggregation as more efficient approach for aggregation (#4429)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-06-09 13:47:54 +02:00
Dmytro Kozlov
24f34347f1
docs: clarify -retentionPeriod flag usage (#4417)
app/vmstorage: clarify the min value for `-retentionPeriod` flag

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2023-06-09 09:46:25 +02:00
Roman Khavronenko
476c7bdd6f
all: update Go builder from Go1.20.4 to Go1.20.5 (#4427)
See https://github.com/golang/go/issues?q=milestone%3AGo1.20.5+label%3ACherryPickApproved

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-06-09 09:42:55 +02:00
Zakhar Bessarab
ce7141383d
app/vmagent/remotewrite: fix vmagent panic on shutdown (#4407)
app/vmagent/remotewrite: fix vmagent panic on shutdown

Currently, when vmagent is stopping it first flushes pending series in remote write context and proceeds to stop streaming aggregation. This leads to streaming aggregation being unable to write results into pending timeseries (since it is already nil) and panic.
This can lead to losing some aggregation results being lost almost silently.

The fix is reordering flow to first stop streaming aggregation and flush all pending time series after that.

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-06-07 15:45:43 +02:00
Dmytro Kozlov
fc5292d8ed
app/vmctl: add verbose output for docker installations or when TTY isn't available (#4333)
* app/vmctl: add verbose output for docker installations or when TTY isn't available

* app/vmctl: fix tests

* app/vmctl: make vmctl interactive if no tty

* app/vmctl: cleanup

* app/vmctl: add comment

---------

Co-authored-by: Nikolay <nik@victoriametrics.com>
2023-06-02 14:57:08 +02:00
Dmytro Kozlov
c7884f8686
app/{graphite,netstorage,prometheus}: fix graphite search tags api limits, remove redudant limit from SeriesHandler handler (#4352)
* app/{graphite,netstorage,prometheus}: fix graphite search tags api limits, remove unused limit from SeriesHandler handler,

* app/{graphite,netstorage,prometheus}: use search.maxTagValues for Graphite

* app/{graphite,netstorage,prometheus}: update CHANGELOG.md

* app/{graphite,netstorage,prometheus}: use own flags for Graphite API

* app/{graphite,netstorage,prometheus}: cleanup

* app/{graphite,netstorage,prometheus}: cleanup

* app/{graphite,netstorage,prometheus}: update docs

---------

Co-authored-by: Nikolay <nik@victoriametrics.com>
2023-06-02 14:34:04 +02:00
Roman Khavronenko
de94812088
vmalert: fix nil map assignment (#4392)
* vmalert: fix nil map assignment

The storage instance with nil map params was created for remote-read purposes.
And before change 7a9ae9de0d this map was ignored in ApplyParams.
Now, it started to be used and vmalert panics in runtime.

The fix properly inits map for at `NewVMStorage` and verifies it is not nil
on assignment in `ApplyParams`.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: add to changelog

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: properly clone Storage params

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: properly clone Storage params

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: properly clone Storage params

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-06-02 11:38:55 +02:00
Roman Khavronenko
eccecdf177
app/vmalert: follow-up after 7a9ae9de0d (#4381)
7a9ae9de0d

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-06-01 11:38:48 +02:00
Dmytro Kozlov
9843ec0e1d
app/vmui: fix behavior when changing url in global settings (#4332)
* app/vmui: fix behavior when changing url in global settings

* app/vmctl: minor fix

* app/vmui: fix behavior when changing url in global settings
2023-06-01 12:19:03 +03:00
Roman Khavronenko
8185c2466c
docs: clarify deduplication docs (#4371)
The purpose of the change is too highlight what HA pair is
and how deduplication needs identical labels to be present
in raw samples.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4367

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-06-01 10:28:21 +02:00
gsakun
20dc3db71e
app/vmalert: fix datasource.roundDigits Parameter (#4341)
app/vmalert: fix querybuild clone and extraParams merge logic

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4340
2023-06-01 09:44:11 +02:00
Roman Khavronenko
1b24f4729b
vmalert: mention default value for external.url flag (#4365)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-30 12:33:17 +02:00
Artem Navoiev
83c1944184
docs: fix markdown for title
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2023-05-29 22:22:25 +02:00
Artem Navoiev
6967c3cb95 fix image path
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2023-05-29 07:11:15 -07:00
Artem Navoiev
8a4c89ea22 docs: change images from markdown tag to html for migration
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2023-05-29 07:11:15 -07:00
Roman Khavronenko
51cea6cad4
vmalert: properly form assets address if httpPrefix set (#4351)
Properly form path to static assets in WEB UI
 if `http.pathPrefix` set.

 https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4349

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-29 07:38:13 +02:00
Nikolay
228ea03bda
app/vmselect/graphite: fixes tests for arm (#4348)
at arm based CPUs only 9 digits after comma matches for tests.
Especially at holtWinters functions. Since it only takes effect at tests
it makes no sense for changing float prescision at actual functions
2023-05-26 09:34:15 +02:00
Dmytro Kozlov
4060f3f261
app/vmctl: fix tests (#4345) 2023-05-25 17:04:04 +02:00
Roman Khavronenko
66ed6fe62f
vmalert: do not return nil rules for /api/v1/rules (#4344)
The fix addresses a case when vmalert is configured with a group
which has `name`, but doesn't have `rules` configured. In this
case it still returns a `nil` instead of `[]` slice.

Fixing this via current commit.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4221

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-25 16:56:54 +02:00
Roman Khavronenko
76f7e66d8e
vmalert: fix the typo in popup (#4331)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-19 12:45:15 +02:00
Aliaksandr Valialkin
2b53ff774b
app/vmselect: log locations of sendPrometheusError() calls
Previously the location inside the sendPrometheusError() was logged.
This could make hard investigating error locations via `vm_log_messages_total` metric.
2023-05-18 20:39:53 -07:00
Aliaksandr Valialkin
d9b3a92348
app/vmselect/vmui: run make vmui-update after 39c1b0f8d1 2023-05-18 12:15:12 -07:00
Aliaksandr Valialkin
0645688f32
app/vmauth: allow -auth.config without users section of unauthorized_user section is present here
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4083
2023-05-18 10:41:34 -07:00
Dmytro Kozlov
7cbda6796c
app/vmctl: set default value for --vm-native-step-interval flag (#4327)
* app/vmctl: set default value for `--vm-native-step-interval` flag

* app/vmctl: update CHANGELOG.md

* app/vmctl: update CHANGELOG.md, fix docs

* app/vmctl: fix typo

* app/vmctl: fix typo
2023-05-18 13:43:35 +02:00
Aliaksandr Valialkin
63b1cab454
app/vmauth: simplify the code after 4a1d29126c
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4242
2023-05-17 00:37:05 -07:00
Nikolay
4a1d29126c
app/vmauth: retry common network dial errors (#4280)
with tracking request body read calls
it allows us to retry POST and PUT requests
2023-05-17 00:19:33 -07:00
Nikolay
16df18ec14
app/vmauth: do not return invalid credentials (#4288)
at http response by default
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4188

based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4190
Thanks @raj-kumar-j  for init implementation
2023-05-17 00:09:47 -07:00
Alexander Marshalov
7b15834cbe
added backup locking/unlocking against retention policy to vmbackupmanager (#558)
* added backup locking/unlocking against retention policy to vmbackupmanager

Signed-off-by: Alexander Marshalov <_@marshalov.org>

* added docs for new commands

Signed-off-by: Alexander Marshalov <_@marshalov.org>

* fix review comments

Signed-off-by: Alexander Marshalov <_@marshalov.org>

---------

Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-05-16 11:23:36 -07:00
Roman Khavronenko
f68d93cca2
vmalert: follow-up after 669becd011 (#4318)
* vmalert: follow-up after 669becd011

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: follow-up after 669becd011

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: follow-up after 669becd011

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-16 18:51:38 +02:00
Michael Hoffmann
3a65f4a733
vmalert: improve retry logic for remote write (#4134)
vmalert should not retry on 4xx status codes
according to https://prometheus.io/docs/concepts/remote_write_spec/
2023-05-16 16:30:03 +02:00
Yury Molodov
39c1b0f8d1
vmui: refactor code using custom hooks (#4145)
* refactor: replace boolean useState with useBoolean

* refactor: replace useResize with useWindowSize/useElementSize

* refactor: replace addEventListener with useEventListener

* refactor: replace navigator.clipboard.writeText with useCopyToClipboard

* fix: prevent redirect loop
2023-05-16 16:41:06 +03:00
Roman Khavronenko
686bf6c5ad
vmctl: update VictoriaMetrics migration section (#4310)
Remove unnecessary information to simplify the description and tips.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-15 16:15:01 +02:00
Aliaksandr Valialkin
2613110f75
deployment/docker: update base docker image from 3.17.3 to 3.18.0
See https://www.alpinelinux.org/posts/Alpine-3.18.0-released.html
2023-05-12 17:31:21 -07:00
Yury Molodov
f0fad01e8a
vmui: add notification for non-matching queries (#4301)
vmui: add notification for non-matching queries (#4211)

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4211
2023-05-12 13:55:47 +02:00
Yury Molodov
1e4a9a8dfe
vmui: enhancements to top queries page (#4299)
* feat: improvement of the top queries page

* vmui/docs: enhancements to top queries page

* Apply suggestions from code review

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-05-11 13:47:32 -07:00
Roman Khavronenko
adc5635a07
vmalert: add hints to filter buttons (#4296)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-11 16:38:08 +02:00
Denys Holius
bb7a295146 app/vmctl/vm_native.go: fixed a typo in error message 2023-05-11 16:36:13 +02:00
Yury Molodov
a55d3f6882
vmui: increase font-size and fix the text display (#4273)
vmui: change default font size to 14px for better readability
vmui: fix bug with missing text on buttons in safari

---------

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2023-05-11 14:50:09 +02:00
Dmytro Kozlov
7ea2531db0
app/vmui: added table where Labels with the highest number of unique values show (#4271)
* app/vmui: added Labels with the highest number of unique values

* app/vmui: cleanup

* app/vmui: cleanup

* app/vmui: add table description

* app/vmui: fix comment, updated CHANGELOG.md

* app/vmui: disable links

* app/vmui: added actions to the table, it will show values for selected label with the highest number of series

* app/vmui: fix comment
2023-05-11 15:19:36 +03:00
Roman Khavronenko
2caf0b05c6
vmalert: correctly update seriesFetched metric for const exprs (#4287)
Previously, metric `vmalert_alerting_rules_last_evaluation_series_fetched`
would be set to 0 for const expressions, because const expression do not match
any series. This may result into a confusion: no series were matched but response isn't empty.
The change updates the logic behind metric: if no series were matched but there are samples
in response - use amount of samples as number of series.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-10 15:04:05 +02:00
Roman Khavronenko
51196739af
Vmalert UI updates (#4276)
* vmalert: expand rule groups on anchor click

before, anchor click was only updating the URL.
To expand the group, user had to click on rule's block.
Now, group will toggle automatically.

* vmalert: allow filtering group in web UI

The new filter allows to filter groups and rules within
groups by: errors only or noMatch only.

The filtering supposed to help navigating big numbers of groups/rules.
Filtering is reflected in URL, so can be shared as a link.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-10 14:38:13 +02:00
Alexander Marshalov
2e494e2375
fixed typos in documentation and commandline flags descriptions (#4275) 2023-05-10 09:50:41 +02:00
Roman Khavronenko
491831df49
vminsert: properly reset labels object on aggregation (#4278)
Without reset, labels duplicates could have been added during stream aggregation.
Since `ctx.Labels` is reused during processing of many series, each series will
add its labels to the context. Even if the same labels were already addeded on prev
iteration. Now, we reset `ctx.Labels` on each iteration to contain so labels from
different series didn't interfere.

This could have cause exceeding of the limit on number of labels per pushed time series.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4277

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-09 08:33:58 -07:00
Aliaksandr Valialkin
3c0470f91e
docs/vmalert.md: clarify docs regarding the support of recursive globs
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4041
2023-05-08 16:21:44 -07:00
Aliaksandr Valialkin
185894fc5a
app/vmagent/remotewrite: make more user-friendly the warning message about too small -remoteWrite.maxdiskUsagePerURL value
This is a follow-up for bc17f4828c .
While at it, document the change at docs/CHANGELOG.md .

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4195
2023-05-08 15:42:30 -07:00
Aliaksandr Valialkin
d906e83e5e
app/vmauth: merge default_url example into multi-url example in order to reduce the amounts of text to read for the user
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4084

This is a follow-up for 041e188df8
2023-05-08 15:12:23 -07:00
Aliaksandr Valialkin
ec3943d14a
app/vmselect: small cleanup after 4f3f9950d0
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3807
2023-05-08 14:57:11 -07:00
Aliaksandr Valialkin
1db9b78b88
app/vmselect: small cleanup after 68e31a6000
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3811
2023-05-08 14:34:37 -07:00
Aliaksandr Valialkin
80946f06c2
app/{vmselect,vmctl}: move ParseTime() to lib/promutils
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4091

This is a follow-up for e2053baf32
2023-05-08 14:17:57 -07:00
Aliaksandr Valialkin
92a549bccb
app/vmauth/README.md: mention about ip filters and concurrency limiter at Security chapter 2023-05-08 13:35:58 -07:00
Aliaksandr Valialkin
23595465b8
app/vmauth: refer ip_filters option in example auth config
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3491
2023-05-08 13:29:18 -07:00
Aliaksandr Valialkin
8f43f496d7
docs: document IP filters functionality in vmauth
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3491

This is a follow-up for 2f08ed3be2
2023-05-08 12:12:16 -07:00
Aliaksandr Valialkin
09268d41ed
app/vmauth: remove duplicate mentioning of -auth.config value in error message in logs on usuccessful load of -auth.config
This is a follow-up for 25759082f4
2023-05-08 10:16:16 -07:00
Aliaksandr Valialkin
1b288e0a05
all: update Go builder from Go1.20.3 to Go1.20.4
See https://github.com/golang/go/issues?q=milestone%3AGo1.20.4+label%3ACherryPickApproved
2023-05-08 09:40:55 -07:00
Roman Khavronenko
fa3a17938e
vmalert: follow-up after cae87da (#4269)
* vmalert: follow-up after cae87da

cae87da4bb
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: update struct comments

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: rm typo

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-08 13:31:54 +02:00
Haleygo
cae87da4bb
vmalert: support reading rule from http url (#4212)
vmalert: support reading rule's config from HTTP URL
2023-05-08 09:52:57 +02:00
Roman Khavronenko
5b8450fc1b
app/vmalert: detect alerting rules which don't match any series at all (#4198)
app/vmalert: detect alerting rules which don't match any series at all

vmalert starts to understand /query responses which contain object:
```
"stats":{"seriesFetched": "42"}
```
If object is present, vmalert parses it and populates a new field
`SeriesFetched`. This field is then used to populate the new metric
`vmalert_alerting_rules_last_evaluation_series_fetched` and to
display warnings in the vmalert's UI.

If response doesn't contain the new object (Prometheus or
VictoriaMetrics earlier than v1.90), then `SeriesFetched=nil`.
In this case, UI will contain no additional warnings.
And `vmalert_alerting_rules_last_evaluation_series_fetched` will
be set to `-1`. Negative value of the metric will help to compile
correct alerting rule in follow-up.

Thanks for the initial implementation to @Haleygo
See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4056

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4039

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-08 09:36:39 +02:00
Alexander Marshalov
56b84140a9
added new consulagent service discovery (#3953) (#4217) 2023-05-04 11:36:21 +02:00
Nikolay
4786f036de
lib/backup: fixes path generation for windows (#4133)
replaces custom fsync function with standard Fsync methods for files.
fixes pattern matching for parts and properly generate backup path for local fs.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70
2023-05-03 10:48:53 +02:00
Roman Khavronenko
baf456978d
vmselect: exit early from queue on context cancel (#4223)
* vmselect: exit early from queue on context cancel

When `-search.maxConcurrentRequests` is reached, vmselect puts
request in the queue. It is expected, that requests in the queue
will be processed as soon as it would be enough capacity to do so.

However, it could happen that while request was waiting its turn,
the client could have already cancel it (close the connection,
or just close the tab with UI). In this case, we should de-queue
such requests to avoid spending extra resources on them.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* app/vmselect: address review comments

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-03 10:42:17 +02:00
Roman Khavronenko
3383f12a4b
vmalert: fix API to return non-nil values (#4222)
Properly return empty slices instead of nil for `/api/v1/rules` and `/api/v1/alerts` API handlers.
This improves compatibility with Grafana.

 https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4221

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-04-28 10:08:29 +02:00
Roman Khavronenko
eb746a4dab
Revert "http server: limit max concurrent requests (#4185)" (#4215)
This reverts commit 77f76371

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-04-27 13:02:47 +02:00
Roman Khavronenko
29e059e49c
app/vmalert: follow-up after 6c322b4a00 (#4214)
6c322b4a00

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-04-27 13:02:21 +02:00
Haleygo
6c322b4a00
vmalert: allow configuring custom notifier headers per group (#4088)
vmalert: allow configuring custom notifier headers per group
2023-04-27 12:17:26 +02:00
Zakhar Bessarab
b21a55febf
app/vmalert: add support of recursive path globs for rules and templates (#4148)
Supports using `**` for `-rule` and `-rule.templates`: `dir/**/*.tpl` loads contents of dir and all subdirectories recursively.

See: #4041

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Artem Navoiev <tenmozes@gmail.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
2023-04-26 19:20:22 +02:00
Zakhar Bessarab
89a1c941c2
app/vmalert: return an error when using query function in -external.alert.source flag (#4191)
Templating of `-external.alert.source` is not expected to have access to the query which was causing runtime error when query function was passed as nil.
See: #4181

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-04-26 15:31:14 +02:00
Dmytro Kozlov
bc17f4828c
app/vmagent,lib/persistentqueue: show warning message if --remoteWrite.maxDiskUsagePerURL flag lower than 500MB (#4196)
* app/vmagent,lib/persistentqueue: show warning message if `--remoteWrite.maxDiskUsagePerURL` flag lower than 500MB

* app/vmagent,lib/persistentqueue: linter fix

* app/vmagent,lib/persistentqueue: fix comment
2023-04-26 13:23:01 +03:00
Alexander Marshalov
041e188df8
added default_url field in vmauth users config (#4084) (#4156)
* added default url field in vmauth users config (#4084)

---------

Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-04-26 11:04:35 +02:00
Yury Molodov
4f3f9950d0
vmui: add metric relabel debug (#3889)
* feat: add metric relabel debug (#3807)

* fix: add link to relabeling cookbook

* lib/promrelabel: merge, fix conflicts

* lib/promrelabel: fix diff

* docs/vmui: add metric relabel playground

---------

Co-authored-by: dmitryk-dk <kozlovdmitriyy@gmail.com>
2023-04-26 11:53:29 +03:00
Yury Molodov
cf567badcf
vmui: display heatmap in the Explore Metrics (#4124)
* feat: display heatmap in the explore metrics (#4111)

* fix: correct calc step for heatmap

* fix: remove spaces in the result of getDurationFromMilliseconds
2023-04-25 13:14:57 +03:00
Yury Molodov
a80b0aebe8
vmui: add a comparison of data to the Cardinality Explorer (#4123)
* feat: add button "show today" to date picker

* feat: add comparison with the prev day (#3967)

* vmui/docs: add comparison of data to cardinality page
2023-04-25 12:21:57 +03:00
Yury Molodov
68e31a6000
vmui: Integrate WITH template playground (#3831)
* feat: add WithTemplate page

* app/vmselect/prometheus: enable json mode for expand with expr API

* app/vmselect/prometheus: enable CORS and add content type

* feat: add api for expand with templates

* fix: remove console from useExpandWithExprs

* app/vmselect/prometheus: fix escaping

* vmui:  integrate WITH template

* app/vmctl: check content type instead of form param

* fix: add content-type for fetch with-exprs

* fix: add a header to the server's response that allows the "Content-Type" header

* app/vmctl: added comment and cleanup

* app/vmctl: use format query param

---------

Co-authored-by: dmitryk-dk <kozlovdmitriyy@gmail.com>
2023-04-25 11:40:01 +03:00
Dmytro Kozlov
e2053baf32
app/vmctl: add support for the different time format in the native binary protocol (#4189)
* app/vmctl: add support for the different time format in the native binary protocol

* app/vmctl: update flag description, update CHANGELOG.md

* app/vmctl: add comment to exported function
2023-04-24 18:33:30 +02:00
Alexander Marshalov
73e22dcf81
added unauthorized_user field in vmauth users config (#4083) (#4157)
added `unauthorized_user` field in vmauth users config (#4083)

---------

Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-04-24 14:57:13 +02:00
Roman Khavronenko
77f76371d0
http server: limit max concurrent requests (#4185)
* lib/httpserver: introduce `-http.maxConcurrentRequests` command-line flag

Introduce `-http.maxConcurrentRequests` command-line flag to protect
VM components from resource exhaustion during unexpected spikes of HTTP requests.
By default, the new flag's value is set to 0 which means no limits are applied.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/httpserver: mention http.maxConcurrentRequests in docs

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-04-24 14:52:06 +02:00
Yury Molodov
05ab34f2c8
vmui: fix freeze when query regular with heatmap query (#4093)
* fix: fix freeze when query regular with heatmap query

* vmui/docs: fix freeze when query regular with heatmap query
2023-04-21 11:59:09 +03:00
Yury Molodov
3140aa34de
vmui: fix bug where tenant list was not displayed (#4162)
* fix: modify the condition for querying tenants

* fix: change getTenantIdFromUrl output to string
2023-04-21 11:56:08 +03:00
Alexander Marshalov
25759082f4
vmauth ip filters (refactoring) (#4059)
Added ip filters (allow_list and deny_list) for enterprise-version of vmauth (#3491)

---------

Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-04-20 19:08:27 +02:00
Roman Khavronenko
de61a73c63
vmalert: retry datasource requests with EOF or unexpected EOF errors (#4146)
* vmalert: retry datasource requests with EOF or unexpected EOF errors

Retry failed read request on the closed connection one more time.
This may improve rules execution reliability when connection
between vmalert and datasource closes unexpectedly.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: fix old tests

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-04-19 10:18:32 +02:00
Aliaksandr Valialkin
52006149b2
lib/storage: replace OpenStorage() with MustOpenStorage()
Callers of OpenStorage() log the returned error and exit.
The error logging and exit can be performed inside MustOpenStorage()
alongside with printing the stack trace for better debuggability.
This simplifies the code at caller side.
2023-04-14 23:02:40 -07:00
Aliaksandr Valialkin
3727251910
lib/fs: add MustReadDir() function
Use fs.MustReadDir() instead of os.ReadDir() across the code in order to reduce the code verbosity.
The fs.MustReadDir() logs the error with the directory name and the call stack on error
before exit. This information should be enough for debugging the cause of the error.
2023-04-14 22:10:46 -07:00
Dmytro Kozlov
2a5b9ff782
app/vmctl: fix performance degradation, add flag to disable backoff policy (#4097)
* app/vmctl: change api for getting metric names

* app/vmctl: fix tests

* app/vmctl: add flag to enable backoff policy, fix test, performance improvements

* app/vmctl: use one http client

* app/vmctl: made linter happy

* app/vmctl: updated documentation and CHANGELOG.md

* app/vmctl: cleanup

* app/vmctl: rename flag

* app/vmctl: cleanup

* app/vmctl: fix comments

* app/vmctl: fix metrics parser problem, improve tests
2023-04-14 09:34:54 +03:00
Aliaksandr Valialkin
30425ca81a
lib/fs: rename WriteFileAtomically to MustWriteAtomic
Callers of this function log the returned error and exit.
So let's just log the error with the given filepath and the call stack
inside the function itself and then exit. This simplifies the code
at callers' place while leaves the same level of debuggability in case of errors.
2023-04-13 22:41:15 -07:00
Aliaksandr Valialkin
036a7b7365
lib/fs: replace MkdirAllIfNotExist->MustMkdirIfNotExist and MkdirAllFailIfExist->MustMkdirFailIfExist
Callers of these functions log the returned error and then exit. The returned error already contains the path
to directory, which was failed to be created. So let's just log the error together with the call stack
inside these functions. This leaves the debuggability of the returned error at the same level
while allows simplifying the code at callers' side.

While at it, properly use MustMkdirFailIfExist instead of MustMkdirIfNotExist inside inmemoryPart.MustStoreToDisk().
It is expected that the inmemoryPart.MustStoreToDick() must fail if there is already a directory under the given path.
2023-04-13 22:11:59 -07:00
Aliaksandr Valialkin
e1211a1187
app/vmstorage: deprecate -bigMergeConcurrency command-line flag
Improperly configured -bigMergeConcurrency command-line flag usually leads to uncontrolled
growth of unmerged parts, which, in turn, increases CPU usage and query durations.

So it is better deprecating this flag. In rare cases -smallMergeConcurrency command-line flag
can be used instead for controlling the concurrency of background merges.
2023-04-13 20:40:24 -07:00
Aliaksandr Valialkin
90b876cd1e
app/vmbackupmanager: sync with enterprise-single-node branch after 41a54c775891c87e3d5ed59ff0769c869dd2fe71 2023-04-13 19:29:06 -07:00
Aliaksandr Valialkin
1fe87691ec
app/vmbackupmanager/README.md: sync with docs/vmbackupmanager.md after 4b2cc1b32c 2023-04-10 10:51:49 -07:00
Aliaksandr Valialkin
b5d18c0d28
app/vmctl/terminal: fix builds for GOOS=freebsd and GOOS=openbsd
This is a follow-up for 8da9502df6
2023-04-06 17:09:07 -07:00
Aliaksandr Valialkin
a3eebf118e
app/vmselect/vmui: run make vmui-update after 01fc228fb0 2023-04-06 15:07:41 -07:00
Dmytro Kozlov
244c18fa38
app/vmctl: add multiple filters defined in --vm-native-filter-match flag to discovered metric names (#4063)
* app/vmctl: add multiple filters defined in `--vm-native-filter-match` flag to discovered metric names

* app/vmctl: fix comments

* app/vmctl: move function buildMatchWithFilter to the correct place

* app/vmctl: update CHANGELOG.md

* app/vmctl: fix CI, remove error wrapping

* app/vmctl: fix CI, simplify `Set()`
2023-04-06 15:06:52 -07:00
Yury Molodov
01fc228fb0
vmui: heatmap fixes (#4086)
* fix: correct display of errors for query

* fix: change the logic of histogram detection

* feat: hide empty buckets from the graph

* fix: revert server url
2023-04-06 15:02:44 -07:00
Aliaksandr Valialkin
4770377fb3
app/vmselect/vmui: run make vmui-update after a1601929ec 2023-04-06 03:20:13 -07:00
Yury Molodov
a1601929ec
fix: correct the description of shortcut keys (#4057) 2023-04-05 22:19:36 -07:00
Yury Molodov
74eea53dee
vmui: implement heatmap improvements (#4078)
* fix: disabled limits for histogram

* fix: add sorted buckets by upper bound

* refactor: move line chart components to folder

* feat: implement heatmap improvements (https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3384#issuecomment-1484023162)

* app/vmselect/vmui: `make vmui-update`

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-04-05 22:13:57 -07:00
Aliaksandr Valialkin
5074cc672a
all: update Go builder from Go1.20.2 to Go1.20.3
See https://github.com/golang/go/issues?q=milestone%3AGo1.20.3+label%3ACherryPickApproved
2023-04-05 13:37:22 -07:00
Zakhar Bessarab
a8d1497024
app/vmalert: update Grafana URLs to match latest format (#4061)
See: #4019

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-04-04 15:25:29 +04:00
Aliaksandr Valialkin
de0fe02f6e
app/vmselect/vmui: run make vmui-update after edb45d7fc1 2023-04-02 21:21:51 -07:00
Yury Molodov
edb45d7fc1
feat: add accept/cancel buttons for settings (#4013) (#4052) 2023-04-02 21:20:10 -07:00
Aliaksandr Valialkin
06b721dd07
app/vmselect/vmui: run make vmui-update after 42087518ba 2023-04-01 00:40:49 -07:00
Yury Molodov
42087518ba
vmui: tips for working with the graph and legend (#4045)
* feat: add tips for working with the graph and legend

* feat: add the ability to collapse the legend

* vmui/docs: add the ability to collapse the legend

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-04-01 00:38:18 -07:00
Yury Molodov
dd200409d9
vmui: add a tip for JSON and Table tabs (#4000)
* feat: add a tip for JSON and Table tabs

* feat: add Hyperlink component

* fix: update Hyperlink

* fix: update link to instant query
2023-03-31 23:53:06 -07:00
Aliaksandr Valialkin
ffdf430be0
app/vmselect/graphite: open source Graphite Render API 2023-03-31 23:25:04 -07:00
Aliaksandr Valialkin
d577657fb7
lib/streamaggr: follow-up for ff72ca14b9
- Make sure that the last successfully loaded config is used on hot-reload failure
- Properly cleanup resources occupied by already initialized aggregators
  when the current aggregator fails to be initialized
- Expose distinct vmagent_streamaggr_config_reload* metrics per each -remoteWrite.streamAggr.config
  This should simplify monitoring and debugging failed reloads
- Remove race condition at app/vminsert/common.MustStopStreamAggr when calling sa.MustStop() while sa
  could be in use at realoadSaConfig()
- Remove lib/streamaggr.aggregator.hasState global variable, since it may negatively impact scalability
  on system with big number of CPU cores at hasState.Store(true) call inside aggregator.Push().
- Remove fine-grained aggregator reload - reload all the aggregators on config change instead.
  This simplifies the code a bit. The fine-grained aggregator reload may be returned back
  if there will be demand from real users for it.
- Check -relabelConfig and -streamAggr.config files when single-node VictoriaMetrics runs with -dryRun flag
- Return back accidentally removed changelog for v1.87.4 at docs/CHANGELOG.md

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3639
2023-03-31 22:30:38 -07:00
Max Golionko
59c350d0d2
fix: app/vmui/Dockerfile-web to reduce vulnerabilities (#4044)
The following vulnerabilities are fixed with an upgrade:
- https://snyk.io/vuln/SNYK-ALPINE317-OPENSSL-3368755
- https://snyk.io/vuln/SNYK-ALPINE317-OPENSSL-3368755
- https://snyk.io/vuln/SNYK-ALPINE317-OPENSSL-5291795
- https://snyk.io/vuln/SNYK-ALPINE317-OPENSSL-5291795

Co-authored-by: snyk-bot <snyk-bot@snyk.io>
2023-03-31 16:29:44 +02:00
Zakhar Bessarab
5e5fc66e3b
docs/vmctl: add examples of URLs used for migration in different modes (#4042)
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-03-30 17:21:36 +02:00
Roman Khavronenko
4a49577028
vmalert: use missingkey=zero for templating (#4040)
Replace empty labels with "" instead of "<no value>"
during templating, as Prometheus does.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4012

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-30 16:57:00 +04:00
Alexander Marshalov
ff72ca14b9
added hot reload support for stream aggregation configs (#3969) (#3970)
added hot reload support for stream aggregation configs (#3969)

Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-03-29 18:05:58 +02:00
Aliaksandr Valialkin
aea6df8197
app/vmagent/remotewrite: cosmetic updates after f3a51e8b1d
- Compare directory names instead of paths to directory when determining which persistent queues must be deleted
  This is less error-prone solution, since paths to the same directory can differ, which could lead
  to accidental directory removal for the existing -remoteWrite.url

- Log the `removed %d dangling queues` message when at least a single queue has been removed

- Consistently use filepath.Join() for creating paths to persistent queues.
  This is needed for Windows support (see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70 )

- Clarify the description of the change at docs/CHANGELOG.md

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4014
2023-03-27 18:33:07 -07:00
Zakhar Bessarab
f3a51e8b1d
app/vmagent: add -remoteWrite.removeDanglingQueues flag (#4017)
* app/vmagent: add `-remoteWrite.removeDanglingQueues` flag which allows to automatically remove dangling persistent queue contents

Related issue: #4014

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* app/vmagent: address review feedback

- remove persistent queues files by default
- rename `remoteWrite.removeDanglingQueues` to `remoteWrite.keepDanglingQueues`
- update docs to reflect changed behaviour

Related issue: #4014

* Apply suggestions from code review

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-03-27 18:15:28 -07:00
Nikolay
9b1e002287
app/vmselect: properly remove temp files at windows system (#4020)
With non-posix compliant systems it's not possible to remove unclosed files.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70
2023-03-27 18:10:15 -07:00
Aliaksandr Valialkin
02ee4ffd4d
app/vmselect/promql: follow-up for 79e1c6a6fc
- Document the fix at docs/CHANGELOG.md
- Add tests with multiple adjancent zero buckets
- Simplify the fix a bit

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/296
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4021
2023-03-27 18:03:36 -07:00
Ze'ev Klapow
79e1c6a6fc
fix le buckets when adjacent vmrange is empty (#4021)
There is a bug here where if you have a single bucket like:

foo{vmrange="4.084e+02...4.642e+02"} 2 123

The expected output is three le encoded buckets like:

foo{le="4.084e+02"} 0 123
foo{le="4.642e+02"} 2 123
foo{le="+Inf"} 2 123

This correctly encodes the start and end of the vmrange.
If however, the input contains the previous bucket, and that bucket is
empty then you only get the end le and +Inf out currently, i.e:

foo{vmrange="7.743e+05...8.799e+05"} 5 123
foo{vmrange="6.813e+05...7.743e+05"} 0 123

results in:

foo{le="8.799e+05"} 5 123
foo{le="+Inf"} 5 123

This causes issues when you go to compute a quantile because this means
that the assumed lower bound of the buckets is 0 and this we interpolate
between 0->end rather than the vmrange start->end as expected.
2023-03-27 17:54:19 -07:00
Aliaksandr Valialkin
622000797a
app/vmselect: follow-up for 10ab086366
- Expose stats.seriesFetched at `/api/v1/query_range` responses too
  for the sake of consistency.

- Initialize QueryStats when it is needed and pass it to EvalConfig then.
  This guarantees that the QueryStats is properly collected when the query
  contains some subqueries.
2023-03-27 15:22:00 -07:00
Roman Khavronenko
4021aa11b5
app/vmselect: export seriesFetched stat for /query responses (#3925)
The change adds a new field `seriesFetched` to EvalConfig object.
Since EvalConfig object can be copied inside `Exec`,
`seriesFetched` is a pointer which can be updated by all copied
objects.

The reason for having stats is that other components, like vmalert,
could benefit from this information.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-03-27 15:18:25 -07:00
Yury Molodov
3214b1c315
vmui: heatmap (#3780)
* fix: add stroke and font for all axes

* feat: add util for generate gradient

* feat: add heatmap plugin

* feat: add heatmap legend

* feat: add heatmap graph (#3384)

* vmui: add heatmap graph (#3384)

* feat: add convert Prometheus to VictoriaMetrics histogram

* fix: prevent re-render graph

* feat: reset step for heatmap

* feat: normalize heatmap data

* fix: format heatmap legend

* wip

* app/vmselect/vmui: run `make vmui-update`

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-03-26 00:30:02 -07:00
Aliaksandr Valialkin
5832242b44
app/vmselect/netstorage: reduce the contention at fs.ReaderAt stats collection on systems with big number of CPU cores
This optimization is based on the profile provided at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966#issuecomment-1483208419
2023-03-25 16:37:07 -07:00
Aliaksandr Valialkin
a1e496ced6
app/vmselect/netstorage: document why runtime.Gosched() is removed at 28f054bb00
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966
2023-03-25 16:36:51 -07:00
Zakhar Bessarab
28f054bb00
vmselect/netstorage: remove direct calls to Gosched to reduce amount of locks for global scope
using `runtime.Gosched` requires acquiring global lock to check if there are any other goroutines to perform tasks. with the latest versions of runtime it can pause running goroutines automatically without requiring to call `Gosched` directly.

Updates #3966

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-03-25 16:34:03 -07:00
Aliaksandr Valialkin
811f4a9380
app/{vmbackup,vmrestore}: publish vmbackup and vmrestore binaries for Windows
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70
2023-03-25 15:08:21 -07:00
Aliaksandr Valialkin
2b851e69d2
app/vmselect/promql: typo fix after e7f46a0aab 2023-03-24 23:46:30 -07:00
Aliaksandr Valialkin
e7f46a0aab
app/vmselect/promql: follow-up for 7205c79c5a
- Allocate and initialize seriesByWorkerID slice in a single go instead
  of initializing every item in the list separately.
  This should reduce CPU usage a bit.
- Properly set anti-false sharing padding at timeseriesWithPadding structure
- Document the change at docs/CHANGELOG.md

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966
2023-03-24 23:34:37 -07:00
Zakhar Bessarab
7205c79c5a
app/vmselect/promql: use lock-less approach to gather results of parallel processing for evalRollup* funcs (#4004)
* vmselect/promql: refactor `evalRollupNoIncrementalAggregate` to use lock-less approach for parallel workers computation

Locking there is causing issues when running on highly multi-core system as it introduces lock contention during results merge.

New implementation uses lock less approach to store results per workerID and merges final result in the end, this is expected to significantly reduce lock contention and CPU usage for systems with high number of cores.

Related: #3966
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* vmselect/promql: add pooling for `timeseriesWithPadding` to reduce allocations

Related: #3966
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* vmselect/promql: refactor `evalRollupFuncWithSubquery` to avoid using locks

Uses same approach as `evalRollupNoIncrementalAggregate` to remove locking between workers and reduce lock contention.

Related: #3966
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-03-24 23:07:12 -07:00
Aliaksandr Valialkin
9436ae3b07
app/vmbackup: simplify code a bit after 5ba347bd2c
Unconditionally call deleteSnapshot() func just after making the snapshot, either successful or unsuccessful

Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2055
2023-03-24 22:03:14 -07:00
Zakhar Bessarab
5ba347bd2c
app/vmbackup: delete created snapshot in case of error during backup (#4008)
Related issue: #2055

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-03-24 21:49:58 -07:00
Aliaksandr Valialkin
ebc1caa5dc
app/vmselect/vmui: run make vmui-update after dc2c712a29 2023-03-24 18:01:39 -07:00
Aliaksandr Valialkin
c1d871a45a
docs/vmauth.md: follow-up for 36edba9bfb
- Document `-configCheckInterval` command-line flag in `quick start` section
- Clarify the addition of `-configCheckInterval` at docs/CHANGELOG.md

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3990
2023-03-24 13:22:37 -07:00
Aliaksandr Valialkin
54796f69db
docs/vmagent.md: clarify that there is no need to specify multiple -remoteWrite.url options when writing data to a single VictoriaMetrics cluster when data replication is needed
Also add a link to https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format from `getting started` section,
so users could quickly find how to write data to VictoriaMetrics cluster
2023-03-24 13:07:33 -07:00
Roman Khavronenko
365d2ff0bf
vmalert: add anchor char to Group's link (#4006)
This should help users to see that Group's name is clickable
and used for anchoring.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-24 09:48:43 +01:00
Roman Khavronenko
8db6a71f83
vmalert: mention VMUI example for alert's source (#4005)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-24 09:40:55 +01:00
Dmytro Kozlov
dc2c712a29
app/vmui: update cardinality page (#3986)
vmui: update cardinality page

---------

Co-authored-by: Yury Moladau <yurymolodov@gmail.com>
2023-03-23 18:18:02 +01:00
Yury Molodov
023c65968f
vmui: display errors for each query individually (#3987) (#3994) 2023-03-23 13:10:59 +01:00
Alexander Marshalov
36edba9bfb
added configCheckInterval flag for vmauth (#3990) (#3991)
* added configCheckInterval flag for vmauth (#3990)
Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-03-23 09:34:12 +01:00
Dmytro Kozlov
f0b09a1382
app/vmctl: follow up after aed59b9029 (#3983) 2023-03-21 15:53:53 +01:00
Aliaksandr Valialkin
6a78755b66
docs/vmagent.md: mention in docs that the target relabel debug page shows target url now 2023-03-20 22:20:03 -07:00
Aliaksandr Valialkin
e480b9881e
app/vmselect/promql: pass workerID to the callback inside doParallel()
This opens the possibility to remove tssLock from evalRollupFuncWithSubquery()
in the follow-up commit from @zekker6 in order to speed up the code
for systems with many CPU cores.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966
2023-03-20 20:54:57 -07:00
Aliaksandr Valialkin
9e16329b2f
app/vmselect/promql: fix TestIncrementalAggr test on systems less than 3 CPU cores
This is a follow-up for 4856a4cf5a
2023-03-20 20:37:18 -07:00
Aliaksandr Valialkin
70959d5dab
app/vmselect/netstorage: reduce the number of calls to runtime.Gosched() at timeseriesWorker() and unpackWorker()
Call runtime.Gosched() only when there is a work to steal from other workers.
Simplify the timeseriesWorker() and unpackWroker() code a bit by inlining stealTimeseriesWork() and stealUnpackWork().

This should reduce CPU usage when processing queries on systems with big number of CPU cores.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966
2023-03-20 20:31:02 -07:00
Aliaksandr Valialkin
4856a4cf5a
app/vmselect: optimize incremental aggregates a bit
Substitute sync.Map with an ordinary slice indexed by workerID.
This should reduce the overhead when updating the incremental aggregate state
2023-03-20 15:37:06 -07:00
Aliaksandr Valialkin
8622dee4b5
app/vmselect/vmui: make vmui-update after d4525bd2d0 2023-03-20 14:35:03 -07:00
Roman Khavronenko
3283f0dae4
vmalert: support logs suppressing during config reloads (#3973)
* vmalert: support logs suppressing during config reloads

The change is mostly required for ENT version of vmalert,
since it supports object-storage for config files.
Reading data from object storage could be time-consuming,
so vmalert emits logs to track the progress.

However, these logs are mostly needed on start or on
manual config reload. Printing these logs each time
`rule.configCheckInterval` is triggered would too verbose.
So the change allows to control logs emitting during
config reloads.

Now, logs are emitted during start up or when SIGHUP is receieved.
For periodicall config checks logs emitted by config pkg are suppressed.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: review fixes

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-20 16:08:30 +01:00
Dmytro Kozlov
8da9502df6
app/vmctl: automatically check tty (#3938)
app/vmctl: automatically detect if TTY is available
2023-03-20 11:16:08 +01:00
Yury Molodov
d4525bd2d0
vmui: support for drag'n'drop in the "Trace analyzer" page (#3971)
vmui: add drag-and-drop support for the trace analyzer page
2023-03-20 11:07:18 +01:00
Yury Molodov
a2af2e5a1b
vmui: improve usability of date/time picker (#3968)
* vmui: allow manually set input date and time
* vmui/docs: improve usability of date/time picker
2023-03-20 09:22:49 +01:00
Aliaksandr Valialkin
43b24164ef
all: add Windows build for VictoriaMetrics
This commit changes background merge algorithm, so it becomes compatible with Windows file semantics.

The previous algorithm for background merge:

1. Merge source parts into a destination part inside tmp directory.
2. Create a file in txn directory with instructions on how to atomically
   swap source parts with the destination part.
3. Perform instructions from the file.
4. Delete the file with instructions.

This algorithm guarantees that either source parts or destination part
is visible in the partition after unclean shutdown at any step above,
since the remaining files with instructions is replayed on the next restart,
after that the remaining contents of the tmp directory is deleted.

Unfortunately this algorithm doesn't work under Windows because
it disallows removing and moving files, which are in use.

So the new algorithm for background merge has been implemented:

1. Merge source parts into a destination part inside the partition directory itself.
   E.g. now the partition directory may contain both complete and incomplete parts.
2. Atomically update the parts.json file with the new list of parts after the merge,
   e.g. remove the source parts from the list and add the destination part to the list
   before storing it to parts.json file.
3. Remove the source parts from disk when they are no longer used.

This algorithm guarantees that either source parts or destination part
is visible in the partition after unclean shutdown at any step above,
since incomplete partitions from step 1 or old source parts from step 3 are removed
on the next startup by inspecting parts.json file.

This algorithm should work under Windows, since it doesn't remove or move files in use.
This algorithm has also the following benefits:

- It should work better for NFS.
- It fits object storage semantics.

The new algorithm changes data storage format, so it is impossible to downgrade
to the previous versions of VictoriaMetrics after upgrading to this algorithm.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3236
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3821
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70
2023-03-19 01:36:51 -07:00
Roman Khavronenko
8fdd613f25
Vmalert tests (#3975)
* vmalert: add tests for notifier pkg

* vmalert: add tests for remotewrite pkg

* vmalert: add tests for template functions

* vmalert: add tests for web pages

* vmalert: fix int overflow in tests

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-17 15:57:24 +01:00
oliverpool
fbefc940ef
app/vmselect/promql: add test to ensure 8-byte alignment (#3948)
See 0af9e2b693
2023-03-16 09:01:42 -07:00
Dmytro Kozlov
4d68f5b1fc
app/vmctl: integration test for native protocol (#3947)
* app/vmctl: integration test for native protocol

* app/vmctl: implemented two integration tests

* app/vmctl: cleanup

* app/vmctl: split storage init and filling data logic

* app/vmctl: cleanup

* app/vmctl: remove storage from server, used initialization process

* app/vmctl: prepare for parallel run, code cleanup

* app/vmctl: code cleanup

* app/vmctl: remove unused field
2023-03-14 09:55:49 +01:00
Aliaksandr Valialkin
e8225d7d6b
app/vmselect/promql: prevent from cannot unmarshal timeseries from rollupResultCache panic after the upgrade to v1.89.0
The issue has been introduced in 0af9e2b693
2023-03-12 19:09:39 -07:00
Aliaksandr Valialkin
1428aa2c22
app/vmselect/vmui: make vmui-update after 00a0816ab1 2023-03-12 17:19:19 -07:00
Yury Molodov
00a0816ab1
vmui: predefined dashboards docs (#3895)
* fix: correct display predefined panels

* docs: update the documentation for predefined dashboards
2023-03-12 17:16:26 -07:00
Aliaksandr Valialkin
0af9e2b693
app/vmselect/promql: prevent from SIGBUS crash on architecures, which deny unaligned access to 8-byte words (e.g. ARM)
Thanks to @oliverpool for nailing down the root cause of the issue and for the initial attempt to fix it
at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3927
2023-03-12 16:32:08 -07:00
Yury Molodov
01367faa39
vmui: remove send step param for instant queries (#3931)
* fix: remove step param for instant queries (#3896)

* vmui: remove send step param for instant queries

* Update docs/CHANGELOG.md

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-03-12 03:09:56 -07:00
Aliaksandr Valialkin
b5db69fe05
app/vmselect/netstorage: do not intern string representation of MetricName for time series received from vmstorage
It has been appeared that this interning may lead to increased memory usage and increased CPU usage
when vmselect performs queries, which select big number of time series.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3692
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3863
2023-03-12 00:52:35 -08:00
Aliaksandr Valialkin
d95037a175
app/vmctl/README.md: remove trailing space from the line added at 4c3bc04efa 2023-03-12 00:11:51 -08:00
Zakhar Bessarab
f7a7eb8f3e
docs: add a note about cache reset for vmalert backfilling docs (#3940)
docs: add a note about cache reset for vmalert backfilling docs
2023-03-10 13:45:11 +01:00
Dmytro Kozlov
4c3bc04efa
app/vmctl: update importing tips when migrating data with overlapping time range (#3941)
app/vmctl: update importing tips when migrating data with overlapping time range
2023-03-10 11:29:08 +01:00
Dmytro Kozlov
3c9058c168
app/vmctl: add support of basic auth and barer token (#3921)
app/vmctl: add support of basic auth and bearer token
2023-03-09 14:53:29 +01:00
Roman Khavronenko
d66bae212b
app/vmalert: log number of configration files found for each specified -rule (#3936)
The change also introduces `List` method to `FS` interface.
The `List` method can be used for wildcard support in object storage FS.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
2023-03-09 14:46:19 +01:00
Dmytro Kozlov
7f54c181bb
app/vmctl: follow up after 09e3742a82 (#3937)
app/vmctl: follow up after 09e3742a82
2023-03-09 13:28:55 +01:00
Roman Khavronenko
3de7fc5c71
security: bump go version to 1.20.2 (#3935)
upgrade Go builder from Go1.20.1 to Go1.20.2
See the list of issues addressed in Go1.20.2 here (https://github.com/golang/go/issues?q=milestone%3AGo1.20.2+label%3ACherryPickApproved).

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-09 13:20:54 +01:00
Gowtam Lal
09e3742a82
app/vmctl: Allow vmnative exports to skip HTTP keepalive. (#3909)
app/vmctl: support HTTP keepalive disabling for vm-native mode
2023-03-09 09:47:46 +01:00
Aliaksandr Valialkin
1b5dc9f91d
all: follow-up for 7a3e16e774
- Sync the description for -httpListenAddr.useProxyProtocol command-line flag at vmagent and vmauth,
  so it is consistent with the description at vmauth and victoria-metrics
- Add a sample of panic text to docs/CHANGELOG.md, so it could be googled
- Mention the -httpListenAddr.useProxyProtocol command-line flag in the description for the bugfix

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3335
2023-03-08 01:26:55 -08:00
Aliaksandr Valialkin
05709bdfae
app/vmselect/vmui: make vmui-update after bbf8e459a0 2023-03-08 01:15:52 -08:00
Aliaksandr Valialkin
138757629b
app/vmctl/README.md: remove trailing space after cc5b916237 2023-03-08 00:32:00 -08:00
Aliaksandr Valialkin
9e71462cee
all: typo fixes of the same type as in the d056be710b 2023-03-08 00:30:21 -08:00
Aliaksandr Valialkin
1bb529e23e
app/vmagent/remotewrite: follow-up for e3a756d82869f8c357b072f6e635ebfc7d65dd2c
- Document the fix
- Move the detection of VictoriaMetrics remoteWrite protocol from client.init() to newHTTPClient()
  This simplifies the fix to the following diff:

diff --git a/app/vmagent/remotewrite/client.go b/app/vmagent/remotewrite/client.go
index 099899c19..70b904af4 100644
--- a/app/vmagent/remotewrite/client.go
+++ b/app/vmagent/remotewrite/client.go
@@ -151,10 +151,6 @@ func newHTTPClient(argIdx int, remoteWriteURL, sanitizedURL string, fq *persiste
        }
        c.sendBlock = c.sendBlockHTTP

-       return c
-}
-
-func (c *client) init(argIdx, concurrency int, sanitizedURL string) {
        useVMProto := forceVMProto.GetOptionalArg(argIdx)
        usePromProto := forcePromProto.GetOptionalArg(argIdx)
        if useVMProto && usePromProto {
@@ -173,6 +169,10 @@ func (c *client) init(argIdx, concurrency int, sanitizedURL string) {
        }
        c.useVMProto = useVMProto

+       return c
+}
+
+func (c *client) init(argIdx, concurrency int, sanitizedURL string) {
2023-03-07 23:54:24 -08:00
Dmytro Kozlov
66bf1987bf
app/vmagent: fix panic if auth config not defined (#530) 2023-03-07 23:51:30 -08:00
Nikolay
7a3e16e774
lib/netutil: fixes panic at proxy protocol (#3905)
it may occur if non proxy protocol message received by tcp server.
Listener Accept method must return only non-recoverable errors.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3335
2023-03-07 08:50:18 -08:00
Yury Molodov
bbf8e459a0
vmui: fix display of selected value in the selector (#3919)
vmui: fix selected value in dropdowns for Explore page
2023-03-07 16:23:02 +01:00
Roman Khavronenko
2472baa934
app/vmalert: do not wait for group start on removal (#3891)
Each group in vmalert starts with an artifical delay to avoid
thundering herd problem. For some groups with high evaluation
intervals, the delay could be significant.
If during this delay user will remove the group from the config
and hot-reload it - vmalert will have to wait until the delay
ends. This results into slow config reloading and UI hang.

The change moves the start-delay logic back to the group's
`start` method. Now, group can immediately exit from the
delay when `group.close()` method is called.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-06 14:04:43 +01:00
Dmytro Kozlov
cc5b916237
docs: follow up after 4b136abff8 (#3918)
docs: follow up after 4b136abff8
2023-03-06 12:41:48 +01:00