808 commits

Author SHA1 Message Date
Aliaksandr Valialkin
994c06cb84
docs/CHANGELOG.md: document e726340914 2022-05-06 18:11:16 +03:00
Dmytro Kozlov
8b5a819266
vmbackup: Prevent save backups to the same folder where TSDB data is (#2547)
* {vmbackup, vmbackup/snapshot}: validate snapshot name

* vmbackup/snapshot: added another checks

* backup/actions: added check that we ignore backup_complete.ignore file

* vmbackup: moved snapshot to lib directory

* lib/snapshot: added functions description

* lib/snapshot: fixed typo

* vmbackup: code cleanup

* wip

* vmbackup: Prevent save backups to the same folder where TSDB data is

* Apply suggestions from code review

* wip

* wip

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-05-06 18:04:58 +03:00
Aliaksandr Valialkin
983b465757
app/vmagent: add missing _total suffix to vmagent_remotewrite_global_rows_pushed_before_relabel_total counter
This is a follow up for c536139d0b
2022-05-06 15:51:34 +03:00
Aliaksandr Valialkin
af0da45d3e
lib/promscrape: rename promscrape_stale_samples_created_total metric to vm_promscrape_stale_samples_created_total, so its name is consistent with the rest of vm_promscrape_ metrics 2022-05-06 15:33:43 +03:00
Aliaksandr Valialkin
2baa9e8f48
app/vmagent: expose vmagent_remotewrite_global_rows_pushed_before_relabel and vmagent_remotewrite_rows_pushed_after_relabel_total metrics 2022-05-06 15:30:10 +03:00
Aliaksandr Valialkin
9c89ea1c90
app/vmagent: rename vmagent_remote_write_rate_limit_reached_total to vmagent_remotewrite_rate_limit_reached_total for the sake of consistency with other vmagent_remotewrite_ metrics 2022-05-06 15:02:18 +03:00
Aliaksandr Valialkin
9d40bb7137
lib/promscrape/discovery/ec2: add ability to filter Availability Zones in ec2_sd_config via az_filters section 2022-05-06 12:44:01 +03:00
Aliaksandr Valialkin
8babb4aebc
app/vmselect/vmui: make vmui-update after 450d879eaa 2022-05-05 22:11:41 +03:00
Aliaksandr Valialkin
0634a894a9
docs/CHANGELOG.md: document bf5e3774cc 2022-05-05 13:38:44 +03:00
Aliaksandr Valialkin
0ad5b64930
docs/CHANGELOG.md: cut v1.77.0 2022-05-05 00:16:31 +03:00
Aliaksandr Valialkin
358fa99af2
app/vmalert: run make quicktemplate-gen from the root directory after the commit f6dcfbcdd6 2022-05-04 20:28:37 +03:00
Nikolay
7e58cba6cf
{lib/promscrape,app/vmagent}: adds sigv4 support for vmagent remoteWrite (#2458)
* {lib/promscrape,app/vmagent}: adds sigv4 support for vmagent remoteWrite
moves aws related code into separate lib from lib/promscrape
it allows to write data from vmagent to the AWS managed prometheus (cortex)

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1287

* Apply suggestions from code review

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-05-04 20:28:37 +03:00
Dmytro Kozlov
0aeefeb5f1
vmalert/tpl: fixed truncating alerts expression in table (#2494)
vmalert: improve `/groups` UI visual 

The change also fixes truncated rules expressions in UI
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2484
2022-05-04 20:28:37 +03:00
Aliaksandr Valialkin
9dc9e03a14
docs/CHANGELOG.md: yet another typo fix: present -> pressed 2022-05-04 18:20:58 +03:00
Aliaksandr Valialkin
c2610e4186
docs/CHANGELOG.md: typo fixes 2022-05-04 18:19:05 +03:00
Aliaksandr Valialkin
47010a9875
docs/CHANGELOG.md: document 8639e79d38 2022-05-04 10:46:32 +03:00
Aliaksandr Valialkin
25266d2194
docs/CHANGELOG.md: document 3575aabeaf 2022-05-03 14:01:58 +03:00
Aliaksandr Valialkin
ec3a37896f
all: add -cluster.tlsInsecureSkipVerify command-line option to vminsert, vmselect and vmstorage components in order to be able to disable TLS certificate verification in mTLS mode
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2490
2022-05-03 13:13:43 +03:00
Aliaksandr Valialkin
ccf44e810c
docs/CHANGELOG.md: document 488c34f5e1 2022-05-03 11:01:27 +03:00
Aliaksandr Valialkin
d384997657
docs/CHANGELOG.md: document d0706c8c95 2022-05-02 22:25:47 +03:00
Aliaksandr Valialkin
361b08c30e
lib/storage: leave the last sample per each discrete interval during the deduplication
This aligns better with staleness logic in Prometheus - https://prometheus.io/docs/prometheus/latest/querying/basics/#staleness
2022-05-02 21:59:31 +03:00
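The deduplication rule described in commit 361b08c30e (keep the last sample in each discrete interval) can be illustrated with a minimal Go sketch. This is not the code from lib/storage: the `Sample` type, the `dedupLast` function and the bucket boundaries (integer division of the timestamp by the interval) are assumptions made for illustration only.

```go
package main

import "fmt"

// Sample is a hypothetical (timestamp, value) pair; timestamps are in milliseconds.
type Sample struct {
	TimestampMs int64
	Value       float64
}

// dedupLast keeps only the last sample in every discrete interval of intervalMs.
// It assumes that samples are sorted by timestamp and that timestamps are non-negative.
// The bucket of a sample is assumed to be TimestampMs / intervalMs.
func dedupLast(samples []Sample, intervalMs int64) []Sample {
	if intervalMs <= 0 || len(samples) == 0 {
		return samples
	}
	out := make([]Sample, 0, len(samples))
	for i, s := range samples {
		isLast := i == len(samples)-1
		// A sample is kept when the next sample falls into a different bucket.
		if isLast || samples[i+1].TimestampMs/intervalMs != s.TimestampMs/intervalMs {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	in := []Sample{{1000, 1}, {4000, 2}, {9000, 3}, {12000, 4}, {31000, 5}}
	// With a 10s interval the first three samples share one bucket,
	// so only the sample at 9000ms survives from that bucket.
	fmt.Println(dedupLast(in, 10000))
}
```

Keeping the last sample rather than the first means a staleness marker written at the end of an interval is not dropped, which is presumably the alignment with Prometheus staleness the commit message refers to.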
Aliaksandr Valialkin
7ca32c21c8
app/vmui: execute query by pressing enter in the same way as Prometheus does
Multi-line query can be entered via `shift-enter` in the query input field
2022-05-02 20:24:43 +03:00
Aliaksandr Valialkin
a70ac35ac7
docs/CHANGELOG.md: document 3616337812
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2514
2022-05-02 15:37:54 +03:00
Aliaksandr Valialkin
75f4adab40
docs/CHANGELOG.md: document 32a6b67e6c
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1761
2022-05-02 15:37:54 +03:00
Aliaksandr Valialkin
693e1838b3
docs/CHANGELOG.md: document b2294d1cf1 2022-05-02 15:37:54 +03:00
Aliaksandr Valialkin
190c8b463c
lib/netutil: close connections in ConnPool if they are idle for more than 30 seconds
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2508
2022-05-02 15:01:52 +03:00
Artem Navoiev
11db05a4ff
lib/{storage,flagutil} - Add option for snapshot autoremoval (#2487)
* lib/{storage,flagutil} - Add option for snapshot autoremoval

- add prometheus-like duration as command flag
- add option to delete stale snapshots
- update duration.go flag to re-use own code

* wip

* lib/flagutil: re-use Duration.Set() call in NewDuration

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-05-02 11:24:12 +03:00
Aliaksandr Valialkin
eaae544238
docs/CHANGELOG.md: document c7aad8d441 2022-04-29 13:02:43 +03:00
Dima Lazerka
e6ee235707
Export "null" in jsonl instead of NaN (#2518)
* Export "null" in jsonl instead of NaN

The NaN appeared because of staleness markers that were added for compatibility. I think it's better to use json `null`, implemented here.

Also maybe it also makes sense to add a flag like `?skip-staleness-markers=true` to `/export`, to skip nulls at all?

* Update app/vmselect/prometheus/export.qtpl

* app/vmselect/prometheus/export.qtpl.go: `make quicktemplate-gen`

* docs/CHANGELOG.md: document the change

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-04-29 12:51:03 +03:00
Aliaksandr Valialkin
6eb1580158
app/vmselect/promql: add tlast_change_over_time(m[d]) function, which returns the timestamp for the last change of m on the given lookbehind window d 2022-04-27 10:58:40 +03:00
Yury Molodov
eae6f68be2
vmui: add support relative time (#2504)
* feat: add support relative time

* app/vmselect: `make vmui-update`

* docs/CHANGELOG.md: document the change

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-04-26 15:46:47 +03:00
Aliaksandr Valialkin
cd4d1599cb
docs/CHANGELOG.md: document 4c1fbcd6b0
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2368
2022-04-26 15:09:57 +03:00
Aliaksandr Valialkin
374212333f
docs/CHANGELOG.md: typo fix: may result -> could result 2022-04-23 00:31:47 +03:00
Aliaksandr Valialkin
4c3cd96db5
lib/promauth: add support for min_version option at tls_config section in the same way as Prometheus does 2022-04-23 00:24:11 +03:00
Aliaksandr Valialkin
808a2f3b61
lib/promauth: add support for proxy_url option at oauth2 section in the same way as Prometheus does 2022-04-23 00:01:53 +03:00
Aliaksandr Valialkin
4ade8511e2
lib/promauth: add support for tls_config section at oauth2 config in the same way as Prometheus does 2022-04-23 00:01:52 +03:00
Aliaksandr Valialkin
a89e31b304
lib/promscrape/discovery/kubernetes: allow attaching node-level labels and annotations to discovered pod targets in the same way as Prometheus 2.35 does
See https://github.com/prometheus/prometheus/issues/9510
and https://github.com/prometheus/prometheus/pull/10080
2022-04-22 20:15:34 +03:00
Aliaksandr Valialkin
dac24aa342
app/vmselect/promql: properly handle scalar default vector, scalar if vector and scalar ifnot vector queries
Previously `vector` time series could be unexpectedly returned from such queries
2022-04-21 15:34:14 +03:00
Aliaksandr Valialkin
a5cfe5d13e
app/vmselect/promql: add drop_common_labels() function 2022-04-21 14:20:36 +03:00
Aliaksandr Valialkin
ed1b394a1a
app/vmstorage: expose vm_indexdb_items_added_total and vm_indexdb_items_added_size_bytes_total counters at /metrics page
These counters can be used for monitoring the rate of addition of new entries in indexdb (aka inverted index).

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2471
2022-04-21 13:19:42 +03:00
Aliaksandr Valialkin
25b841c6ed
app/vmselect/promql: fix duplicate time series error on joins against time series filtered by values
This should prevent from `duplicate time series` errors when executing the following query:

kube_pod_container_resource_requests{resource="cpu"} * on (namespace,pod) group_left() (kube_pod_status_phase{phase=~"Pending|Running"}==1)

where `kube_pod_status_phase{phase=~"Pending|Running"}==1` filters out duplicate time series
2022-04-20 22:21:20 +03:00
Aliaksandr Valialkin
90db923662
docs/CHANGELOG.md: document that the service discovery speed now scales with the number of CPU cores 2022-04-20 16:22:50 +03:00
Aliaksandr Valialkin
45385a5dc6
lib/promscrape: optimize getScrapeWork() function
Reduce the number of memory allocations in this function. This improves its performance by up to 50%.
This should improve service discovery speed when service discovery generates a large number
of potential targets with a large number of meta-labels.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2270
2022-04-20 15:34:18 +03:00
Aliaksandr Valialkin
d0bac8e224
all: typo fix: Kuberntes -> Kubernetes 2022-04-20 10:51:41 +03:00
Dmytro Kozlov
17552dba8b
lib/promscrape: Enable filters for endpoint and labels (#2466)
* lib/promscrape: Enable filters for endpoint and labels

* lib/promscrape: cleanup

* lib/promscrape: update template

* lib/promscrape: move filter logic to backend

* lib/promscrape: updated placeholder

* lib/promscrape: updated placeholder

* lib/promscrape: use two different fields for filters, updated form, added error on parsing queries

* lib/promscrape: rename functions

* lib/promscrape: removed unused values

* wip

* wip

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-04-19 18:27:44 +03:00
Nikolay
628905f080
lib/promscrape: adds job restart method (#2455)
* lib/promscrape: adds job restart method
it must restart only the ScrapeConfig entries with changed content
this change greatly reduces the time needed for job restart
and it should decrease possible data loss when the config changes frequently in Kubernetes-based deployments

Apply suggestions from code review

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

* wip

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-04-16 20:29:33 +03:00
Yury Molodov
60c4e0022a
fix: prevent graph hiding without data (#2456)
* fix: prevent graph hiding without data

* fix: add yaxis labels default

* app/vmselect: `make vmui-update`

* docs/CHANGELOG.md: document the change

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-04-16 17:16:17 +03:00
Aliaksandr Valialkin
a7689e1b0c
app/vmstorage: add support for mTLS cipher suites via -cluster.tlsCipherSuites command-line flag
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2404
2022-04-16 16:36:38 +03:00
Aliaksandr Valialkin
27e74f25d6
lib/httpserver: follow up after def0032c7d 2022-04-16 15:52:44 +03:00
Aliaksandr Valialkin
c50e48a74c
lib/promscrape: follow-up after baa1c24b36 2022-04-16 14:26:38 +03:00
Aliaksandr Valialkin
564996da14
docs/CHANGELOG.md: document 45fcaa33e8 2022-04-13 14:14:25 +03:00
Aliaksandr Valialkin
951b2a0067
docs/CHANGELOG.md: document f7e4c5a628 2022-04-13 14:14:25 +03:00
Anton Bystrov
af96e6594c
Update CHANGELOG.md (#2463)
Maybe there is a misprint here?
2022-04-13 14:14:25 +03:00
Aliaksandr Valialkin
4a6b56e5be
docs/CHANGELOG.md: cut v1.76.1 2022-04-12 16:21:06 +03:00
Aliaksandr Valialkin
3e2a4c07cd
app/vmui: further improvements for number display on graphs
This is a follow-up for c4d2cd8336

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2409
2022-04-12 16:02:37 +03:00
Aliaksandr Valialkin
50a6354a03
docs/CHANGELOG.md: link to the bug related to improper handling of maxSeries limit passed from vmselect to vmstorage 2022-04-12 16:02:37 +03:00
Aliaksandr Valialkin
70ad171070
lib/promscrape: follow-up after 7e79adfb55 2022-04-12 12:37:03 +03:00
Aliaksandr Valialkin
81b7a31cb1
app/vmstorage: properly handle maxSeries limit passed from vmselect to vmstorage 2022-04-12 11:19:07 +03:00
Aliaksandr Valialkin
e3bf464f11
lib/protoparser/native: follow-up after fe01f4803d 2022-04-11 19:27:53 +03:00
Aliaksandr Valialkin
b3a3e9990f
vendor: update github.com/VictoriaMetrics/metricsql from v0.40.0 to v0.41.0
This allows using built-in function names as with template names
2022-04-11 18:32:18 +03:00
Aliaksandr Valialkin
3c27bde77e
docs/CHANGELOG.md: document ed364a42e3 2022-04-11 12:12:07 +03:00
Aliaksandr Valialkin
ce02d086d0
docs/CHANGELOG.md: document backwards-incompatible changes in cluster version of v1.76.0 2022-04-08 12:06:49 +03:00
Aliaksandr Valialkin
d7557b12ab
docs/CHANGELOG.md: document the bugfix in hitCount function 2022-04-08 11:48:36 +03:00
Aliaksandr Valialkin
949bf25a87
docs/CHANGELOG.md: typo fix 2022-04-07 17:20:20 +03:00
Aliaksandr Valialkin
e59019d088
docs/CHANGELOG.md: cut v1.76.0 2022-04-07 15:34:12 +03:00
Aliaksandr Valialkin
f082e64e0c
app/vmagent: reduce the probability of TLS handshake timeout when dialing the remote storage
The following actions are taken:

- Increase the TLS handshake timeout from 5 seconds to 10 seconds
- Increase dial timeout from 5 seconds to 30 seconds
- Specify DialContext instead of Dial in http.Transport. This allows properly handling
  the Context arg during dialing the remote storage

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1699
2022-04-06 12:35:14 +03:00
Aliaksandr Valialkin
c24c5b8926
docs/CHANGELOG.md: document 0c0efc7781 2022-04-05 19:22:08 +03:00
Aliaksandr Valialkin
8752cce157
app/vminsert: reduce the max packet size, which vminsert can send to vmstorage
This reduces the max memory usage for vminsert and vmstorage under heavy ingestion rate
by up to 50% on production workload
2022-04-05 15:39:58 +03:00
Aliaksandr Valialkin
816085a652
docs/CHANGELOG.md: document 70bb0d2708 2022-04-04 13:02:41 +03:00
Aliaksandr Valialkin
676c1f4fd7
docs/CHANGELOG.md: document 173073364e1bb1e0259ddc873dbd96ce62b07543 2022-04-04 12:56:11 +03:00
Aliaksandr Valialkin
157555b022
docs/CHANGELOG.md: document a57e3807537914396ee3eb378648a464fa9e1b97 2022-04-01 12:25:54 +03:00
Aliaksandr Valialkin
151a00189e
docs/CHANGELOG.md: document 0989649ad0 2022-04-01 12:03:41 +03:00
Aliaksandr Valialkin
7c81ce296d
docs/CHANGELOG.md: cut v1.75.1 2022-03-28 12:28:27 +03:00
Yury Molodov
7cbf19812d
vmui: predefined panels (#2243)
* feat: add basic components for predefined dashboards

* fix: change display alert

* feat: add autosize and unit for axes

* feat: add component for CircularProgress

* feat: change layout for predefined dashboards

* feat: add override step for predefined panels

* feat: add override step for predefined panels

* feat: change yaxis limits for predefined panels

* fix: rename flag for hide legend

* feat: add formatted panel description

* feat: add README.md for dashboard setup

* feat: validate dashboard settings

* feat: add unit for y-ticks

* fix: correct display error for dashboards

* fix: disable auto refresh after route change

* update package-lock.json

* fix: add basename for BrowserRouter

* fix: add dynamic basename for routing

* update packages

* feat: add a pre-defined dashboard "per-job resource usage"

* feat: display unit in the hover-tooltip

* fix: change routing and home layout

* fix: change axis width calc

* updated packages

* app/vmselect: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-03-26 13:05:21 +02:00
Aliaksandr Valialkin
b843f0e229
app/vmselect: add fine-grained limits for the number of returned/scanned time series for various APIs 2022-03-26 11:28:14 +02:00
Dima Lazerka
6e8e385375
VMAnomaly docs fixes (#2361)
* Added docs for vmanomaly

* Add example images

* Stylistic fixes

* Move images to root

* Update docs/vmanomaly.md

* Update docs/vmanomaly.md

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

* Squeeze vmanomaly after vmbackupmanager before Case Studies

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2022-03-25 12:09:05 +02:00
Aliaksandr Valialkin
b7b06fb50e
Revert "docs/CHANGELOG.md: mention a bugfix in Graphite render API in v1.75.0"
This reverts commit 1e18c1c1ae.

The bugfix has already been mentioned in the commit 9290605891
2022-03-24 19:21:15 +02:00
Aliaksandr Valialkin
1e18c1c1ae
docs/CHANGELOG.md: mention a bugfix in Graphite render API in v1.75.0 2022-03-24 19:20:42 +02:00
Arash Hatami
2e01691d5d
A good change for MD files (#2353)
* Lint YAML

* Remove extra comment

* Fix command problem

* Format MD files

* Format & fix problem of MD files for docs

* Another fix for MD files
2022-03-22 14:01:04 +02:00
Roman Khavronenko
bb594b34b8
docs: update release notes (#2349)
Warn about memory issue introduced in releases 1.73 - 1.74
2022-03-21 15:41:25 +02:00
Aliaksandr Valialkin
84b3234f3d
docs/CHANGELOG.md: document a1e17e91f8 2022-03-21 15:37:03 +02:00
hagen1778
7c7436b584
docs: add update note to v1.75.0 release note
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-03-21 15:37:01 +02:00
Aliaksandr Valialkin
7e67d61fdf
docs/CHANGELOG.md: cut v1.75.0 2022-03-18 19:54:13 +02:00
Aliaksandr Valialkin
85695faa94
app/vmselect/bufferedwriter: suppress trivial network errors, which can be generated by remote side
These errors include `broken pipe` and `reset by peer`.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2334
2022-03-18 19:27:33 +02:00
Aliaksandr Valialkin
9d91ad7124
app/vmagent/remotewrite: prevent from infinite recursion panic when pushing a time series with big number of samples to remote storage
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2335
2022-03-18 19:07:27 +02:00
Aliaksandr Valialkin
ab966f8a7a
docs: document 20bb5e703c 2022-03-18 18:42:09 +02:00
Aliaksandr Valialkin
e35c9124b7
lib/storage: reduce the interval for checking for free disk space from 30 seconds to 1 second
This should reduce the probability of out of disk space panics when -storage.minFreeDiskSpaceBytes is set to low values.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2305
2022-03-18 16:53:19 +02:00
Aliaksandr Valialkin
7c92aaeaa4
lib/blockcache: properly release memory occupied by deleted entries
Previously the deleted entries could remain referenced via lastAccessHeap for a long time.
This could lead to increased memory usage for the following caches starting from v1.73.0:

* indexdb/indexBlocks
* indexdb/dataBlocks
* storage/indexBlocks

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2242
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-03-18 16:53:19 +02:00
Aliaksandr Valialkin
85a4b805e1
docs/CHANGELOG.md: document e5868b9c29
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/546
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2255
2022-03-18 13:08:56 +02:00
Aliaksandr Valialkin
e1f70af14e
docs/CHANGELOG.md: document 11ae1ae924 2022-03-17 20:09:18 +02:00
Aliaksandr Valialkin
f0e96a84db
docs: document the addition of mTLS communication between cluster components 2022-03-17 20:00:14 +02:00
Aliaksandr Valialkin
9290605891
docs/CHANGELOG.md: document c1d07e7c52f0a2ab892921b0639cd42677aa33a8 2022-03-16 14:25:38 +02:00
Aliaksandr Valialkin
c175afbe02
docs/CHANGELOG.md: document changes from fb6eab03a2 2022-03-16 13:28:29 +02:00
Aliaksandr Valialkin
5b53373154
docs/CHANGELOG.md: document 565bd08c43
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1824
2022-03-16 13:28:29 +02:00
hagen1778
fd0f521bb9
docs: add update details for some releases
Some of the releases could negatively affect performance for a limited
period of time due to some changes in core. Update details are meant to
warn users about expected changes in performance after the update.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-03-16 13:03:36 +02:00
Roman Khavronenko
ec48d022df
docs: fix broken links (#2303)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-03-16 13:01:46 +02:00
Aliaksandr Valialkin
e7a0b4e095
docs/CHANGELOG.md: cut v1.74.0 2022-03-03 19:30:53 +02:00
Aliaksandr Valialkin
0b5ec1780d
docs/CHANGELOG.md: document performance improvements when registering new time series 2022-03-03 17:12:25 +02:00
Nikolay
d5ba1249f8
fixes incorrect step for calculation for MovingWindow functions (#283)
* fixes incorrect step for calculation for MovingWindow functions
https://victoriametrics.zendesk.com/agent/tickets/99

* wip

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-25 13:53:48 +02:00
Aliaksandr Valialkin
1bde8b4e22
docs/CHANGELOG.md: document cfc6c14dc48ae9dd35e65f1a6e5c7af8ccb9f029 2022-02-25 13:53:09 +02:00
Aliaksandr Valialkin
02a922b53f
lib/storage: properly handle series selector matching multiple metric names plus a negative filter
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2238

This is a follow-up for 00cbb099b6
2022-02-24 12:11:53 +02:00
Aliaksandr Valialkin
2431c9cf81
lib/promrelabel: add support for conditional relabeling via if filter
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1998
2022-02-24 02:35:13 +02:00
Nikolay
54221297d1
fixes jwt token parse with correct base64Url decoding (#281)
* fixes jwt token parse with correct base64Url decoding
it must be applied according to jwt RFC that requires token to be URL safe

added slow path for decoding tokens with std base64 decoding

adds error logging for vmgateway

* docs/CHANGELOG.md: document the bugfix

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-23 13:57:54 +02:00
Aliaksandr Valialkin
244c23ea2c
lib/workingsetcache: reduce the default cache rotation period from hour to 20 minutes
This should reduce memory usage under high time series churn rate
2022-02-23 13:42:27 +02:00
Aliaksandr Valialkin
cb100ce2af
docs/CHANGELOG.md: cut v1.73.1 2022-02-22 21:11:55 +02:00
Aliaksandr Valialkin
1ff42a0080
docs/CHANGELOG.md: link to the feature request for X-Influxdb-Version response header
Follow-up for 71ef3155c8

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2209
2022-02-22 20:34:03 +02:00
Aliaksandr Valialkin
89ead3daca
app/vmselect/netstorage: report vmstorage errors to vmselect clients even if partial responses are allowed
If a vmstorage is reachable and returns an application-level error to vmselect,
then such error must be returned to the caller even if partial responses are allowed,
since it usually means cluster mis-configuration.

Partial responses may be returned only if some vmstorage nodes are temporarily unavailable.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1941
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/678
2022-02-21 21:17:05 +02:00
Roman Khavronenko
5a4b16794d
Consul SD - update services on the watcher's start (#2202)
* lib/discovery/consul: update services on the watcher's start

Previously, watcher's start was only initing goroutines for discovery
but not waiting for the first iteration to end. It means first Consul
discovery wasn't returning discovered targets until the next iteration.

The change makes the watcher's start blocking until we get first discovery
iteration done and all registries updated.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: remove workarounds for consul SD

Now when consul SD lib properly updates services
on the first start, we don't need workarounds in vmalert.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/discovery/consul: update after review

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-21 15:33:33 +02:00
Roman Khavronenko
bd7837d524
lib: allow to configure cache size by type (#2206)
* lib: allow to configure cache size by type

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1940
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Apply suggestions from code review

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-21 13:55:51 +02:00
Aliaksandr Valialkin
ce91456982
app/vminsert: add X-Influxdb-Version response header for InfluxDB API requests
This is needed for some clients, which expect this header.
See https://github.com/ntop/ntopng/issues/5449#issuecomment-1005347597
2022-02-17 12:49:24 +02:00
Aliaksandr Valialkin
8aaeffa7b4
docs: document 3d19fa6932 2022-02-16 23:30:52 +02:00
Aliaksandr Valialkin
ee066aa0d5
lib/storage: use binary search instead of full scan for skipping artificial tags when searching for tag names or tag values
This should improve performance for /api/v1/labels and /api/v1/label/<label_name>/values

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2200
2022-02-16 18:17:27 +02:00
Aliaksandr Valialkin
bb113efeb4
docs/CHANGELOG.md: document 2efa46a11c 2022-02-15 21:13:32 +02:00
Aliaksandr Valialkin
e2be41f4dd
docs/CHANGELOG.md: document ad6bdd78d0 2022-02-15 12:48:23 +02:00
Aliaksandr Valialkin
38c73a00db
docs/CHANGELOG.md: cut v1.73.0 2022-02-14 17:54:53 +02:00
Aliaksandr Valialkin
5d8ea8c918
docs/CHANGELOG.md: document 3d890e89f1 2022-02-14 17:42:33 +02:00
Nikolay
48a9e068be
adds release build for macos darwin amd64 and arm64 (#2185)
* adds release build for macos darwin amd64 and arm64

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1896
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1851

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-14 17:42:33 +02:00
Aliaksandr Valialkin
fc0771a888
docs/CHANGELOG.md: document c90c1c4d54 2022-02-14 13:13:03 +02:00
Aliaksandr Valialkin
31b42e9c57
lib/promscrape: add expand all and collapse all buttons to /targets page
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2021
2022-02-12 18:42:01 +02:00
Aliaksandr Valialkin
989668beba
app/vmselect/promql: return at most one time series from absent_over_time() in the same way as Prometheus does
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2130
2022-02-12 15:46:43 +02:00
Aliaksandr Valialkin
3ab4aef140
docs/CHANGELOG.md: document ea153e5f90 2022-02-12 00:48:40 +02:00
Roman Khavronenko
d107f86fbc
lib/index: reduce read/write load after indexDB rotation (#2177)
* lib/index: reduce read/write load after indexDB rotation

IndexDB in VM is responsible for storing TSID - ID's used for identifying
time series. The index is stored on disk and used by both ingestion and read path.

IndexDB is stored separately to data parts and is global for all stored data.
It can't be deleted partially as VM deletes data parts. Instead, indexDB is
rotated once in `retention` interval.

The rotation procedure means that `current` indexDB becomes `previous`,
and new freshly created indexDB struct becomes `current`. So in any time,
VM holds indexDB for current and previous retention periods.
When time series is ingested or queried, VM checks if its TSID is present
in `current` indexDB. If it is missing, it checks the `previous` indexDB.
If TSID was found, it gets copied to the `current` indexDB. In this way
`current` indexDB stores only series which were active during the retention
period.

To improve indexDB lookups, VM uses a cache layer called `tsidCache`. Both
write and read paths consult `tsidCache`, and on a miss the real lookup happens.

When rotation happens, VM resets the `tsidCache`. This is needed for ingestion
path to trigger `current` indexDB re-population. Since index re-population
requires additional resources, every index rotation event may cause some extra
load on CPU and disk. While it may be unnoticeable for most of the cases,
for systems with very high number of unique series each rotation may lead
to performance degradation for some period of time.

This PR makes an attempt to smooth out resource usage after the rotation.
The changes are following:
1. `tsidCache` is no longer reset after the rotation;
2. Instead, each entry in `tsidCache` gains a notion of indexDB to which
they belong;
3. On ingestion path after the rotation we check if requested TSID was
found in `tsidCache`. Then we have 3 branches:
3.1 Fast path. It was found, and belongs to the `current` indexDB. Return TSID.
3.2 Slow path. It wasn't found, so we generate it from scratch,
add to `current` indexDB, add it to `tsidCache`.
3.3 Smooth path. It was found but does not belong to the `current` indexDB.
In this case, we add it to the `current` indexDB with some probability.
The probability is based on time passed since the last rotation with some threshold.
The more time has passed since rotation the higher is chance to re-populate `current` indexDB.
The default re-population interval in this PR is set to `1h`, during which entries from
the `previous` index are supposed to slowly re-populate the `current` index.

The new metric `vm_timeseries_repopulated_total` was added to identify how many TSIDs
were moved from `previous` indexDB to the `current` indexDB. This metric is supposed to
grow only during the first `1h` after the last rotation.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* wip

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-12 00:34:44 +02:00
Roman Khavronenko
791cad8c2e
lib/promscrape: support prometheus-like duration in scrape configs (#2169)
* lib/promscrape: support prometheus-like duration in scrape configs

The change allows to specify duration values like `1d`, `1w`
for fields `scrape_interval`, `scrape_timeout`, etc.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/817#issuecomment-1033384766
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/blockcache: make linter happy

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/promscrape: support prometheus-like duration in scrape configs

* add support for extra fields `scrape_align_interval` and `scrape_offset`;
* support Prometheus duration parsing for `__scrape_interval__`
and `__scrape_duration__` labels;

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* wip

* wip

* docs/CHANGELOG.md: document the feature

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-11 16:17:51 +02:00
Aliaksandr Valialkin
727b29d4a3
lib/promscrape/discovery/kubernetes: add __meta_kubernetes_endpointslice_{label,annotation}* labels to be consistent with other role values for Kubernetes service discovery 2022-02-11 14:56:10 +02:00
Nikolay
265938a385
fixes service discovery for kubernetes (#2173)
* fixes service discovery for kubernetes
now it must take into account all pods that belong to the discovered endpoint and endpointslice
adds simple test for endpoints
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2134

* wip

* docs/CHANGELOG.md: document the change

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-11 13:35:34 +02:00
Aliaksandr Valialkin
6cb2954612
docs/CHANGELOG.md: document 4e722c459b
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2167
2022-02-10 12:21:05 +02:00
Aliaksandr Valialkin
b81e6ba403
docs/CHANGELOG.md: add instructions on how to build VictoriaMetrics components from source code in order to test tip changes 2022-02-08 16:44:20 +02:00
Nikolay
feefbbab48
adds CGO build for arm64 (#2102)
* adds CGO build for arm64
it should improve performance for arm64-based deployments of vmstorage and
vmsingle by 15-20%

it depends on gozstd package update for correct musl gozstd vendoring

* typo fixes

* docs/CHANGELOG.md: document the change

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-08 16:26:55 +02:00
Aliaksandr Valialkin
eed66b6640
lib/promscrape: set -promscrape.config.strictParse to true by default
This allows detecting long-living silent errors in -promscrape.config
2022-02-08 15:42:33 +02:00
Aliaksandr Valialkin
2149805886
docs/CHANGELOG.md: add links to issues, which could benefit from improved re-routing algorithm 2022-02-07 16:54:27 +02:00
Aliaksandr Valialkin
c738739494
app/vminsert: add -dropSamplesOnOverload command-line flag
Drop incoming samples if the destination vmstorage node is unavailable
and/or accepts data at slower rate than other vmstorage nodes
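
A rough sketch of the drop-on-overload idea, assuming each storage node is fed through a bounded buffer. This is not the vminsert implementation: the `node` type, its fields and `trySend` are hypothetical names, and a full buffer stands in for "the node is unavailable or slow".

```go
package main

import "fmt"

// node is a hypothetical stand-in for a connection to one storage node.
type node struct {
	name    string
	queue   chan int // bounded buffer of pending samples
	dropped int      // samples dropped because the buffer was full
}

// trySend enqueues a sample without blocking.
// If the buffer is full, the sample is dropped and counted instead of
// stalling the caller. The counter is not safe for concurrent use;
// this sketch calls trySend from a single goroutine only.
func (n *node) trySend(sample int) bool {
	select {
	case n.queue <- sample:
		return true
	default:
		n.dropped++
		return false
	}
}

func main() {
	// The node never drains its queue here, which simulates an overloaded node.
	slow := &node{name: "vmstorage-1", queue: make(chan int, 2)}
	for sample := 1; sample <= 3; sample++ {
		if !slow.trySend(sample) {
			fmt.Printf("dropped sample %d for %s\n", sample, slow.name)
		}
	}
	fmt.Println("queued:", len(slow.queue), "dropped:", slow.dropped)
}
```

The trade-off in this sketch is data loss in exchange for keeping ingestion from blocking on a single slow destination.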
2022-02-07 12:32:18 +02:00
Aliaksandr Valialkin
021ee53ba8
app/vminsert: improve re-routing logic in order to spread rows more evenly among the available storage nodes 2022-02-06 20:20:02 +02:00
Aliaksandr Valialkin
d24e5d9efd
lib/promscrape: show the total number of scrapes and the total number of scrape errors per target at /targets page
This information may be useful when debugging unreliable scrape targets
2022-02-03 20:23:27 +02:00
Aliaksandr Valialkin
678b3e71db
lib/promscrape: provide the ability to fetch target responses on behalf of vmagent or single-node VictoriaMetrics
This feature may be useful when debugging metrics for the given target located in an isolated environment
2022-02-03 19:02:12 +02:00
Aliaksandr Valialkin
10367d7e4c
app/vmselect/promql: do not push down filters, which enumerate more than 10k unique values
Such filters may slow down time series search, so just skip them.

This is a follow-up for e7f1ceeb84

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1827
2022-02-02 23:42:25 +02:00
Aliaksandr Valialkin
97b7b94f91
docs/CHANGELOG.md: document 55e3bbd4cc
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1567
2022-02-02 23:42:25 +02:00
Aliaksandr Valialkin
dbead4813e
docs: updates after 5da71eb685
* Mention the ability to configure vmalert notifiers via files in docs/CHANGELOG.md
* Mention the ability to use Consul service discovery for vmalert notifiers in docs/CHANGELOG.md
* Run `make docs-sync` in order to sync app/vmalert/README.md to docs/vmalert.md
2022-02-02 23:42:25 +02:00
Aliaksandr Valialkin
566e12874d
lib/cgroup: expose process_cpu_cores_available metric
This metric shows the number of CPU cores available to the process.
This allows creating alerting rules on CPU saturation with the following query:

    rate(process_cpu_seconds_total[5m]) / process_cpu_cores_available > 0.9

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2107
2022-01-31 20:25:15 +02:00
Aliaksandr Valialkin
04d6596298
app/vmselect/promql: optimize queries, which join on _info metrics.
Automatically add common filters from one side of binary operation
to the other side before sending the query to storage subsystem.

See https://grafana.com/blog/2021/08/04/how-to-use-promql-joins-for-more-effective-queries-of-prometheus-metrics-at-scale/
and https://www.robustperception.io/exposing-the-software-version-to-prometheus
2022-01-31 20:25:15 +02:00
Aliaksandr Valialkin
35164d4dcf
docs/CHANGELOG.md: document 6a519896db 2022-01-31 12:42:27 +02:00
Aliaksandr Valialkin
0ac2a51682
vendor: update github.com/VictoriaMetrics/metricsql from v0.37.0 to v0.38.0
This adds more optimization cases for https://utcc.utoronto.ca/~cks/space/blog/sysadmin/PrometheusLabelNonOptimization

For example:

* Multi-level transform functions. For example, abs(round(foo{a="b"})) + bar{x="y"}
  is now optimized to abs(round(foo{a="b",x="y"})) + bar{a="b",x="y"}
* Binary operations with `on()`, `without()`, `group_left()` and `group_right()` modifiers.
  For example, foo{a="b"} on (a) + bar is now optimized to foo{a="b"} on (a) + bar{a="b"}
* Multi-level binary operations. For example, foo{a="b"} + bar{x="y"} + baz{z="q"}
  is now optimized to foo{a="b",x="y",z="q"} + bar{a="b",x="y",z="q"} + baz{a="b",x="y",z="q"}
* Aggregate functions. For example, sum(foo{a="b"}) by (c) + bar{c="d"}
  is now optimized to sum(foo{a="b",c="d"}) by (c) + bar{c="d"}
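
The simplest case above, a binary operation without modifiers, can be sketched as follows. This is a simplified illustration and not the metricsql code: the `filter` type, `mergeFilters` and `selector` are hypothetical, and only equality filters are handled. The reasoning is that such an operation matches series with identical label sets, so a filter from one side can be copied to the other side without changing the result.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// filter is a hypothetical equality label filter such as a="b".
type filter struct {
	label, value string
}

// mergeFilters returns the deduplicated, sorted union of the filters
// from both sides of a binary operation.
func mergeFilters(left, right []filter) []filter {
	seen := make(map[filter]bool)
	var merged []filter
	for _, f := range append(append([]filter{}, left...), right...) {
		if !seen[f] {
			seen[f] = true
			merged = append(merged, f)
		}
	}
	sort.Slice(merged, func(i, j int) bool {
		if merged[i].label != merged[j].label {
			return merged[i].label < merged[j].label
		}
		return merged[i].value < merged[j].value
	})
	return merged
}

// selector renders a metric name with its label filters.
func selector(metric string, filters []filter) string {
	parts := make([]string, 0, len(filters))
	for _, f := range filters {
		parts = append(parts, fmt.Sprintf("%s=%q", f.label, f.value))
	}
	return metric + "{" + strings.Join(parts, ",") + "}"
}

func main() {
	left := []filter{{"a", "b"}}  // foo{a="b"}
	right := []filter{{"x", "y"}} // bar{x="y"}
	common := mergeFilters(left, right)
	fmt.Println(selector("foo", common), "+", selector("bar", common))
}
```

Cases with `on()`, `group_left()` or aggregate functions need more care, because only some labels survive on each side; this sketch does not cover them.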
2022-01-27 19:04:45 +02:00
Aliaksandr Valialkin
49650fe6aa
lib/logger/throttler.go: show the original location of the error and warning message
Previously the location inside LogThrottler implementation was shown. This could complicate debugging.
2022-01-23 13:55:48 +02:00
Yury Molodov
196bef8348
vmui: fixed display type switching (#2088)
* fix: correct switch display type

* docs/CHANGELOG.md: document the bugfix

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-01-21 16:57:14 +02:00
Yury Molodov
034012c80f
vmui: fix time range selector (#2085)
* fix: add date validate for time range

* app/vmselect/vmui: `make vmui-update`

* docs/CHANGELOG.md: document the bugfix

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-01-21 12:03:41 +02:00
Aliaksandr Valialkin
6ae584b9b3
lib/{mergeset,storage}: properly limit cache sizes for indexdb
Previously these caches could exceed limits set via `-memory.allowedPercent` and/or `-memory.allowedBytes`,
since limits were set independently per each data part. If the number of data parts was big, then limits could be exceeded,
which could result in out-of-memory errors.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-01-20 18:45:03 +02:00
Aliaksandr Valialkin
c2abd6a702
docs/CHANGELOG.md: document the bugfix for highestMax() function in Graphite render API 2022-01-20 12:16:12 +02:00
Aliaksandr Valialkin
6233e7c40c
docs/CHANGELOG.md: add missing parens in example for @ modifier 2022-01-19 13:05:29 +02:00
Aliaksandr Valialkin
2eaf7a7c46
docs/CHANGELOG.md: fix incorrect link to the issue
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1911
2022-01-19 00:08:31 +02:00
Aliaksandr Valialkin
f3196e48e1
docs/CHANGELOG.md: cut v1.72.0 2022-01-18 22:43:34 +02:00
Yury Molodov
1d19303f35
fix: remove buffer period (#2078)
* fix: remove buffer period

* app/vmselect/vmui: `make vmui-update`

* docs/CHANGELOG.md: document the implemented feature

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2064

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-01-18 22:23:40 +02:00