Commit graph

1208 commits

Author SHA1 Message Date
Roman Khavronenko
43a7984cd8
vmalert: correctly calculate alert ID including extra labels (#1734)
Previously, ID for alert entity was generated without alertname or groupname.
This led to collision, when multiple alerting rules within the same group
producing same labelsets. E.g. expr: `sum(metric1) by (job) > 0` and
expr: `sum(metric2) by (job) > 0` could result into same labelset `job: "job"`.

The issue affects only UI and Web API parts of vmalert, because alert ID is used
only for displaying and finding active alerts. It does not affect state restore
procedure, since this label was added right before pushing to remote storage.

The change now adds all extra labels right after receiving response from the datasource.
And removes adding extra labels before pushing to remote storage.

Additionally, change introduces a new flag `Restored` which will be displayed in UI
for alerts which have been restored from remote storage on restart.
2021-10-22 12:30:38 +03:00
Yury Molodov
2b266cb87e
vmui: query history (#1732)
* feat: add query history

* fix: change detect keyUp for nav query history

* feat: set default query history

* app/vmselect/vmui: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-10-22 12:21:22 +03:00
Aliaksandr Valialkin
8ad95f0db7
lib/httpserver: expose command-line flags at /flags page
This should simplify debugging.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1695
2021-10-20 00:45:09 +03:00
Roman Khavronenko
bdfac4ff53
vmalert: make group.ID() thread-safe (#1726)
Commit fixes potential race condition when group update
and generating of ID() happens simultaneously.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-10-19 16:44:13 +03:00
Roman Khavronenko
dcd881bb7a
vmalert: properly init SIGHUP listener before starting group manager (#1725)
Regression was introduced during code refactoring. It potentially
could lead to situation when SIGHUP signals were ignored while
vmalert was still busy with initing group manager.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-10-19 16:35:27 +03:00
Aliaksandr Valialkin
b8123b862a
app/vmauth: fix metric name prefixes: vmagent -> vmauth 2021-10-19 15:29:07 +03:00
Yury Molodov
a3e09a57c2
vmui: features (#1711)
* feat: initial uPlot graph

* feat: add zoom/pan for graph

* fix: add zoom by ctrl/mac

* fix: remove unused code

* feat: add toggle cache for fetch

* feat: add fix y-axis limits

* fix: stop point events while panning

* fix: change getting cursor position when scaling

* feat: add cursor tooltip to graph

* fix: uninstall chart.js

* fix: change link for create an issue

* fix: set default cache value to true

* app/vmalert: follow-up after 0e2486df56

* docs/CHANGELOG.md: document 5416e18007

* app/vmui: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-10-18 15:16:57 +03:00
Roman Khavronenko
146a5b504c
vmalert: remove extra / from path in WEB interface (#1717)
The extra `/` may cause issues when additional path prefixes
are configured. Also, removing it makes it consistent
with the rest of declarations.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-10-18 15:12:47 +03:00
Roman Khavronenko
478854d36d
vmctl: follow-up after 95d1d38595 (#1718)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-10-18 15:10:44 +03:00
Miro Prasil
5416e18007
vmctl influx convert bool to number (#1714)
vmctl: properly convert influx bools into integer representation

When using vmctl influx, the import would fail importing boolean fields
with:

```
failed to convert value "some".0 to float64: unexpected value type true
```

This converts `true` to `1` and `false` to `0`.

Fixes #1709
2021-10-18 10:29:34 +03:00
Alexander Rickardsson
0e2486df56
vmalert: add disablePathAppend to remote read (#1712)
* vmalert: add disablePathAppend to remoteRead

* docs: add docs for remoteRead.disablePathAppend
2021-10-18 10:24:52 +03:00
Alexander Rickardsson
c0e58ade45
vmalert: Redact passwords from error messages (#1713) 2021-10-18 10:20:26 +03:00
Aliaksandr Valialkin
da97e58979
app/vmselect/promql: randomize the static selection of time series returned from limitk()
Sort series by a hash calculated from the series labels. This should guarantee "random" selection of the returned time series.
Previously the selection could be biased, since time series were sorted alphabetically by label names and label values.
2021-10-16 21:16:49 +03:00
Aliaksandr Valialkin
cae174b11c
app/vmselect/promql: typo fix in comment: didsn't -> didn't 2021-10-16 13:00:34 +03:00
Aliaksandr Valialkin
9866dd95c1
lib/promscrape: store the full response in stream parsing mode in scrapeWork.lastScrape byte slice
This allows sending staleness marks and properly calculate scrape_series_added metric in stream parsing mode
at the cost of the increased memory usage, since now the potentially big response is kept
in the lastScrape byte slice per each scrapeWork.

In practice the memory usage increase shouldn't be big, since the response size
is usually much smaller than the parsed metrics from this response after the relabeling,
which usually adds a big pile of target-specific labels per each metric.
2021-10-15 15:39:23 +03:00
Aliaksandr Valialkin
bbd34fa15e
lib/promscrape: add -promscrape.minResponseSizeForStreamParse command-line option for automatic switching to stream parsing mode when scraping targets with big responses
This should reduce memory usage when vmagent scrapes targets with non-uniform response sizes.
This is common case in Kubernetes monitoring.
2021-10-14 12:29:35 +03:00
Aliaksandr Valialkin
1a7287c408
lib/promscrape: return error if sample_limit or series_limit options are set when stream parsing mode is enabled 2021-10-14 12:11:23 +03:00
Roman Khavronenko
7fcbd3fa4b
Adjust http.Transport.MaxIdleConns setting for vmauth/vmalert services (#1704)
* vmalert: adjust `http.Transport.MaxIdleConns` value accordingly to `http.Transport.MaxIdleConnsPerHost`

`http.Transport.MaxIdleConnsPerHost` setting is controlled by `datasource.maxIdleConnections` flag,
while `http.Transport.MaxIdleConns` is inherited from DefaultTransport and is equal to `100`.
The fix adjusts `http.Transport.MaxIdleConns` value if it is lower than `http.Transport.MaxIdleConnsPerHost`.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmauth: adjust `http.Transport.MaxIdleConns` value accordingly to `http.Transport.MaxIdleConnsPerHost`

`http.Transport.MaxIdleConnsPerHost` setting is controlled by `maxIdleConnsPerBackend` flag,
while `http.Transport.MaxIdleConns` is inherited from DefaultTransport and is equal to `100`.
The fix adjusts `http.Transport.MaxIdleConns` value if it is lower than `http.Transport.MaxIdleConnsPerHost`.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-10-13 17:29:28 +03:00
Roman Khavronenko
8df3c569c7
vmalert: add Source link to alerts UI (#1701)
The source link is controlled by `external.url` and `external.alert.source`
flags, in the same way as for alertmanager notifications.
The source link is added to Alerts list view, and specific Alert view.
2021-10-13 15:25:11 +03:00
Aliaksandr Valialkin
5a58c041c2
app/vmagent: expose -promscrape.config contents at /config page as Prometheus does
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1695
2021-10-12 16:25:37 +03:00
Aliaksandr Valialkin
a5001b9c20
app/vmselect/promql: add atan2 binary operator, which is going to be added in Prometheus 2.31
See https://github.com/prometheus/prometheus/pull/9248
2021-10-11 21:15:53 +03:00
Aliaksandr Valialkin
81c6720392
app/vmselect/promql: add missing trigonometric functions, which are going to be added in Prometheus 2.31
See https://github.com/prometheus/prometheus/issues/9233
2021-10-11 21:01:33 +03:00
Aliaksandr Valialkin
92b92d4d2c
app/vmselect/promql: consistently return the same set of time series from limitk() function
This is the expected behaviour by most users.
2021-10-08 19:53:52 +03:00
Aliaksandr Valialkin
0ff8fcac6a
app/vmui: follow-up after 7bfb44113e
* Run `vmui-update`
* Document the changes in README.md and CHANGELOG.md
2021-10-08 15:09:29 +03:00
Yury Molodov
7bfb44113e
vmui: use uPlot as default engine for graph (#1683)
* feat: initial uPlot graph

* feat: add zoom/pan for graph

* fix: add zoom by ctrl/mac

* fix: remove unused code
2021-10-08 15:07:35 +03:00
Aliaksandr Valialkin
cf5cbd1c70
app/{vminsert,vmstorage}: follow-up after a171916ef5
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269
2021-10-08 14:35:49 +03:00
Nikolay
4290b46e8c
Adds read-only mode for vmstorage node (#1680)
* adds read-only mode for vmstorage
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269

* changes order a bit

* moves isFreeDiskLimitReached var to storage struct
renames functions to be consistent
change protoparser api - with optional storage limit check for given openned storage

* renames freeSpaceLimit to ReadOnly
2021-10-08 14:35:48 +03:00
Aliaksandr Valialkin
2748255c8b
app/vmselect/promql: substitute rollupFuncsCannotAdjustWindow with rollupFuncsCanAdjustWindow
The list of functions, which can adjust lookbehind window is more limited than the rest of functions,
so it is better from maintainability and readability PoV using the allowlist instead of blocklist.
2021-10-07 13:18:42 +03:00
Aliaksandr Valialkin
c45210a6f9
app/vmselect/promql: return back the behaviour for deriv() function when the lookbehind window doesnt contain enough points
It is expected that the `deriv(m[d])` returns non-empty value if the lookbehind window `d`
contains less than 2 samples in the same way as `rate()` does.

This is a follow-up after 3e084be06b .
2021-10-07 12:52:27 +03:00
Roman Khavronenko
3e084be06b
app/vmselect: make predict_linear and deriv compatible with Prometheus (#1681)
Previously, `predict_linear` returned slightly different results comparing
to Prometheus. The change makes linear regression algorithm compatible
with Prometheus.

`deriv` was excluded from the list of functions which can adjust the time
window for the same reasons.
2021-10-07 12:50:49 +03:00
Aliaksandr Valialkin
c7c966d0e9
docs/vmagent.md: update docs after 3e9a939a990c8b608414388c96f68eb062364ae7 2021-10-05 10:23:33 +03:00
Aliaksandr Valialkin
9515e58e28
docs/vmagent.md: document how to write data to Kafka 2021-09-30 17:45:53 +03:00
Aliaksandr Valialkin
0e3de5a0cc
app/vmselect/promql: add topk_last and bottomk_last functions 2021-09-30 13:22:52 +03:00
Roman Khavronenko
a31407006c
app/vmselect: fix binary comparison func (#1667)
The fix makes the binary comparison func to check for NaNs
before executing the actual comparison. This prevents VM
to return values for non-existing samples for expressions
which contain bool comparisons. Please see added test
for example.
2021-09-30 12:24:17 +03:00
Roman Khavronenko
344490d89b
app/vmselect: fix testRowsEqual func NaN checks (#1666)
It appeared, that `testRowsEqual` NaN comparison was incorrect.
The fix caused some tests to fail. Please see the change and
tests updated.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-09-30 12:08:47 +03:00
Aliaksandr Valialkin
6d9f1d4227
app/vminsert: document that -relabelConfig is reloaded on SIGHUP signal 2021-09-29 21:18:58 +03:00
Aliaksandr Valialkin
d80d72efec
app/{vmbackup,vmrestore}: switch from gcs://... to gs://... urls for backups to GCS
The `gs://` urls are commonly used, so prefer them instead of `gcs://` urls,
while leaving support for `gcs://` urls for backwards compatibility.
2021-09-29 12:10:29 +03:00
Aliaksandr Valialkin
396e233ac1
docs/vmagent.md: update Telegraf config in the section about Kafka 2021-09-29 11:21:15 +03:00
Aliaksandr Valialkin
0e5ab52908
docs/vmagent.md: add docs about reading metrics from Kafka 2021-09-29 01:46:12 +03:00
Yury Molodov
893af0a92c
vmui: fixed bug with time range (time zone) (#1661)
* fix: set date in query string in utc format

* app/vmselect: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-09-29 01:00:44 +03:00
Nikolay
cc72f9428d
changes vmagent api (#1656)
* changes vmagent api
adds auth.Token to promremotewrite InsertHandlerReader
changes remoteWrite client constructor, allows to use multiple remoteWriteUrl schemes, like kafka://
changes url path concatenation for tenant remoteWrite

Update app/vmagent/remotewrite/client.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>

* Update app/vmagent/remotewrite/remotewrite.go

* Apply suggestions from code review

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-09-29 00:52:07 +03:00
Roman Khavronenko
5dc84bf210
app/vmselect: disable time-window adjustment for min/max_over_time funcs (#1658)
Adjustment results into discrepancy between Prometheus and VM on time windows
smaller than scrape interval.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-09-29 00:43:21 +03:00
Roman Khavronenko
de810031bf
app/vmselect: always return zero for stddev func if there is only one value (#1659)
The fix will always return zero if received set of items consists of one
element only, which also means no deviation.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-09-29 00:38:55 +03:00
Roman Khavronenko
dd536b475c
app/vmselect: return NaN instead of 0 for empty value sets (#1660)
The change affects `count/stddev/stdvar_over_time` funcs and makes
them to return NaN instead of zero when there is no datapoints
in a time window.
This is needed for improving compatibility with Prometheus.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-09-29 00:37:04 +03:00
Roman Khavronenko
03cd93bf1a
app/vmselect: rm quantile_over_time fast-path optimisations (#1662)
The removed fast path optimisations weren't consistent with
`quantile` function behavior and results into discrepancy.
Specifically, results didn't match in cases when:
* 0 < phi > 1;
* values contain only one element.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-09-29 00:35:14 +03:00
Aliaksandr Valialkin
91b3c601bc
app/{vminsert,vmagent}: add ability to ingest data via DataDog "submit metrics" API
See https://docs.datadoghq.com/api/latest/metrics/#submit-metrics

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/206
2021-09-29 00:13:08 +03:00
Yury Molodov
a64155d91e
vmui: use Chart.js as default engine for graph (#1634)
* feat: add Plotly as default engine for graph

* fix: remove unused components

* feat: use Chart.js as default engine graph

* fix: correct styles for loader

* feat: add zoom/pan for chart

* feat: add height for chart

* fix: remove unused code

* fix: remove empty units from duration

* fix: change debounce for pan to 500ms

* fix: add utility for plugins register globally

* fix: optimize render graph

* feat: add buffer data for zoom

* fix: add limits for zoom in/out

* fix: change update data while zooming

* app/vmselect: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-09-27 22:26:14 +03:00
Aliaksandr Valialkin
8c0283381d
app/victoria-metrics/testdata/graphite/max_lookback_unset.json: fix the test after c4c77aa2dd
The commit c4c77aa2dd slightly changed how scrape_interval is detected per-time series,
so the max_lookback_unset test should be updated accordingly.
2021-09-27 21:41:14 +03:00
Aliaksandr Valialkin
2efe0acfc9
app/vmselect/promql: add rollup_scrape_interval(m[d]) function
It calculates the min, max and avg scrape intervals for m over the given lookbehind window d
2021-09-27 19:21:24 +03:00
Aliaksandr Valialkin
c4c77aa2dd
app/vmselect/promql: follow-up after 526dd93b32
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1625
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1612
2021-09-27 18:55:39 +03:00