Commit graph

3340 commits

Author SHA1 Message Date
Aliaksandr Valialkin
6bc10f0623
lib/promscrape: do not sort original labels and do not intern label string for the original labels before the sharding code is executed
This should reduce CPU and memory usage in shard mode when service discovery finds big number of scrape targets with many long labels.
See https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets

This is a follow-up after 9882cda8b9

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1728
2021-10-22 13:55:39 +03:00
Aliaksandr Valialkin
a8bcc3c276
lib/promscrape: reduce memory usage if -promscrape.noStaleMarkers command-line flag is passed
Do not store in memory the response from the last scrape per each target if -promscrape.noStaleMarkers option is enabled.
This should reduce memory usage when the scraped targets return large responses.
2021-10-22 13:22:08 +03:00
Roman Khavronenko
e0f21d6000
vmalert: correctly calculate alert ID including extra labels (#1734)
Previously, ID for alert entity was generated without alertname or groupname.
This led to collision, when multiple alerting rules within the same group
producing same labelsets. E.g. expr: `sum(metric1) by (job) > 0` and
expr: `sum(metric2) by (job) > 0` could result into same labelset `job: "job"`.

The issue affects only UI and Web API parts of vmalert, because alert ID is used
only for displaying and finding active alerts. It does not affect state restore
procedure, since this label was added right before pushing to remote storage.

The change now adds all extra labels right after receiving response from the datasource.
And removes adding extra labels before pushing to remote storage.

Additionally, change introduces a new flag `Restored` which will be displayed in UI
for alerts which have been restored from remote storage on restart.
2021-10-22 12:31:35 +03:00
Aliaksandr Valialkin
ad40e70d39
docs/CHANGELOG.md: document a3684fe3de 2021-10-22 12:29:27 +03:00
Nikolay
83e1dfccba
adds tab as second separator for graphite text protocol (#1733)
* adds tab as second separator for graphite text protocol

* changes indexFunc for indexAny

* Update lib/protoparser/graphite/parser_test.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-10-22 12:29:27 +03:00
Yury Molodov
3ccc37a98b
vmui: query history (#1732)
* feat: add query history

* fix: change detect keyUp for nav query history

* feat: set default query history

* app/vmselect/vmui: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-10-22 12:29:27 +03:00
Aliaksandr Valialkin
d56e676d71
lib/flagutil: do not expose sensitive info (passwords, keys and urls) at /flags page 2021-10-20 00:51:15 +03:00
Aliaksandr Valialkin
5705f4b6d1
lib/httpserver: expose command-line flags at /flags page
This should simplify debugging.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1695
2021-10-20 00:46:54 +03:00
Aliaksandr Valialkin
a105b71116
lib/envflag: use flag.Set for setting the flags from env vars
This should make visible the set flags at flag.Visit(), which is used later for logging
and exporting the `is_set` label for these flags at /metrics page
2021-10-20 00:46:53 +03:00
Aliaksandr Valialkin
93511b4be7
lib/storage: log a warning when the -storageDataPath has less than -storage.minFreeDiskSpaceBytes
This should improve the debuggability of the readonly feature.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1727
2021-10-19 23:58:09 +03:00
Roman Khavronenko
ca49853664
vmalert: make group.ID() thread-safe (#1726)
Commit fixes potential race condition when group update
and generating of ID() happens simultaneously.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-10-19 16:45:52 +03:00
Roman Khavronenko
627224d493
vmalert: properly init SIGHUP listener before starting group manager (#1725)
Regression was introduced during code refactoring. It potentially
could lead to situation when SIGHUP signals were ignored while
vmalert was still busy with initing group manager.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-10-19 16:36:08 +03:00
Aliaksandr Valialkin
0a1982b294
app/vmauth: fix metric name prefixes: vmagent -> vmauth 2021-10-19 15:30:16 +03:00
Aliaksandr Valialkin
5df7ef323f
docs/Single-server-VictoriaMetrics.md: add a link to VMUI at VictoriaMetrics playground 2021-10-19 14:42:35 +03:00
Aliaksandr Valialkin
ea69eef375
lib/promscrape/discovery/kubernetes: log a warning if role: endpoints discovers more than 1000 targets per a single endpoint
In this case `role: endpointslice` must be used instead.

See the following references:

* https://kubernetes.io/docs/reference/labels-annotations-taints/#endpoints-kubernetes-io-over-capacity
* https://github.com/kubernetes/kubernetes/pull/99975
* https://github.com/prometheus/prometheus/issues/7572#issuecomment-934779398
2021-10-19 13:22:28 +03:00
Aliaksandr Valialkin
948e9b8f50
docs/CHANGELOG.md: document 146a5b504c 2021-10-19 11:24:44 +03:00
Aliaksandr Valialkin
a7de3712e6
docs/CHANGELOG.md: document cbcc622786 2021-10-19 09:00:05 +03:00
Aliaksandr Valialkin
e4ebcebc8a
deployment/docker/alerts.yml: formatting fixes after 865a60f13e 2021-10-19 09:00:05 +03:00
Nikolay
e84a063209
changes job source for /target api (#1723)
use jobNameOriginal instead of relabeled as prometheus does

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1707
2021-10-19 09:00:05 +03:00
Roman Khavronenko
d763837130
dashboards: add cardnilaity limiter panels for vmagent (#1720)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-10-19 09:00:05 +03:00
Yurii Kravets
34f52de3a5
Update alerts.yml
Added Series Limit day\hour alerts
2021-10-19 09:00:05 +03:00
Aliaksandr Valialkin
79709e2586
vendor: return back the previous google.golang.org/genproto version, since the latest version leads to compile errors
The following errors:

    vendor/cloud.google.com/go/storage/storage.go:1447:53: o.GetCustomerEncryption().GetKeySha256 undefined (type *"google.golang.org/genproto/googleapis/storage/v2".Object_CustomerEncryption has no field or method GetKeySha256)
    vendor/cloud.google.com/go/storage/writer.go:439:10: q.GetCommittedSize undefined (type *"google.golang.org/genproto/googleapis/storage/v2".QueryWriteStatusResponse has no field or method GetCommittedSize)
2021-10-18 15:37:29 +03:00
Aliaksandr Valialkin
c92746bb01
vendor: make vendor-update 2021-10-18 15:23:46 +03:00
Yury Molodov
744f6a98eb
vmui: features (#1711)
* feat: initial uPlot graph

* feat: add zoom/pan for graph

* fix: add zoom by ctrl/mac

* fix: remove unused code

* feat: add toggle cache for fetch

* feat: add fix y-axis limits

* fix: stop point events while panning

* fix: change getting cursor position when scaling

* feat: add cursor tooltip to graph

* fix: uninstall chart.js

* fix: change link for create an issue

* fix: set default cache value to true

* app/vmalert: follow-up after 0e2486df56

* docs/CHANGELOG.md: document 5416e18007

* app/vmui: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-10-18 15:19:01 +03:00
Roman Khavronenko
44a88b7458
vmalert: remove extra / from path in WEB interface (#1717)
The extra `/` may cause issues when additional path prefixes
are configured. Also, removing it makes it consistent
with the rest of declarations.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-10-18 15:19:01 +03:00
Roman Khavronenko
93d4ef1239
vmctl: follow-up after 95d1d38595 (#1718)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-10-18 15:19:01 +03:00
Aliaksandr Valialkin
f87d43a5e4
docs/CHANGELOG.md: document 5416e18007 2021-10-18 15:08:44 +03:00
Aliaksandr Valialkin
9273b83cd0
app/vmalert: follow-up after 0e2486df56 2021-10-18 15:08:42 +03:00
Miro Prasil
c0853d4bf8
vmctl influx convert bool to number (#1714)
vmctl: properly convert influx bools into integer representation

When using vmctl influx, the import would fail importing boolean fields
with:

```
failed to convert value "some".0 to float64: unexpected value type true
```

This converts `true` to `1` and `false` to `0`.

Fixes #1709
2021-10-18 14:59:17 +03:00
Alexander Rickardsson
0e1dbcd039
vmalert: add disablePathAppend to remote read (#1712)
* vmalert: add disablePathAppend to remoteRead

* docs: add docs for remoteRead.disablePathAppend
2021-10-18 14:59:17 +03:00
Alexander Rickardsson
63571e1334
vmalert: Redact passwords from error messages (#1713) 2021-10-18 14:59:17 +03:00
Aliaksandr Valialkin
ed994873fd
app/vmselect/promql: randomize the static selection of time series returned from limitk()
Sort series by a hash calculated from the series labels. This should guarantee "random" selection of the returned time series.
Previously the selection could be biased, since time series were sorted alphabetically by label names and label values.
2021-10-16 21:16:34 +03:00
Aliaksandr Valialkin
fbcc8b5c7d
lib/promscrape: set honor_timestamps: true by default if this option isnt set explicitly in scrape configs
This aligns the behavior to Prometheus - see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config
2021-10-16 20:48:53 +03:00
Aliaksandr Valialkin
102dd91ddd
docs/CaseStudies.md: add Smarkets case study 2021-10-16 20:00:32 +03:00
Aliaksandr Valialkin
6e1d7d3c9c
docs/CaseStudies.md: add Fly.io case study 2021-10-16 20:00:32 +03:00
Aliaksandr Valialkin
043c7119c8
docs/CaseStudies.md: add a case study for Razorpay 2021-10-16 20:00:32 +03:00
Aliaksandr Valialkin
3e12eeaa3b
docs/CaseStudies.md: add AbiosGaming 2021-10-16 20:00:32 +03:00
Aliaksandr Valialkin
24ffb606be
docs/CaseStudies.md: add Percona case study 2021-10-16 19:14:15 +03:00
Aliaksandr Valialkin
0a9be5ef9d
lib/promscrape: expose promscrape_series_limit_max_series and promscrape_series_limit_current_series metrics per each scrape target with the enabled unique series limiter 2021-10-16 19:14:13 +03:00
Aliaksandr Valialkin
f1803573c3
vendor: update github.com/valyala/gozstd from v1.13.0 to v1.14.1
This should reduce memory usage in vmagent when compressing large scrape responses in stream parsing mode
2021-10-16 19:14:07 +03:00
Aliaksandr Valialkin
99011c6b63
lib/promscrape: always initialize http client for stream parsing mode
Stream parsing mode can be automatically enabled when scraping targets with big response bodies
exceeding the -promscrape.minResponseSizeForStreamParse , so it must be always initialized.
2021-10-16 13:19:48 +03:00
Aliaksandr Valialkin
0f4fda1bda
lib/promscrape: store the last scraped response in compressed form if its size exceeds -promscrape.minResponseSizeForStreamParse
This should reduce memory usage when scraping targets with big response bodies.
2021-10-16 13:00:11 +03:00
Aliaksandr Valialkin
da0adf446d
app/vmselect/promql: typo fix in comment: didsn't -> didn't 2021-10-16 13:00:10 +03:00
Aliaksandr Valialkin
0452a8d4e8
lib/promscrape: store the full response in stream parsing mode in scrapeWork.lastScrape byte slice
This allows sending staleness marks and properly calculate scrape_series_added metric in stream parsing mode
at the cost of the increased memory usage, since now the potentially big response is kept
in the lastScrape byte slice per each scrapeWork.

In practice the memory usage increase shouldn't be big, since the response size
is usually much smaller than the parsed metrics from this response after the relabeling,
which usually adds a big pile of target-specific labels per each metric.
2021-10-15 15:26:24 +03:00
Aliaksandr Valialkin
3e9beb0f8d
lib/promscrape/discovery/kubernetes: rename endpointslices.go -> endpointslice.go in order to be consistent with EndpointSlice struct name
This is a follow-up for 31b42b30b6
2021-10-15 12:27:31 +03:00
Aliaksandr Valialkin
72c79d7689
docs/FAQ.md: improve wording on why MetricsQL isnt 100% compatible with PromQL 2021-10-14 16:23:17 +03:00
Aliaksandr Valialkin
ae1ee5ba4a
docs/CHANGELOG.md: document the change at 7fcbd3fa4b 2021-10-14 14:38:14 +03:00
Aliaksandr Valialkin
1beb38d476
docs/FAQ.md: add an entry explaining why MetricsQL isn't 100% compatible with PromQL 2021-10-14 12:50:24 +03:00
Aliaksandr Valialkin
25421fa2ae
lib/promscrape: add -promscrape.minResponseSizeForStreamParse command-line option for automatic switching to stream parsing mode when scraping targets with big responses
This should reduce memory usage when vmagent scrapes targets with non-uniform response sizes.
This is common case in Kubernetes monitoring.
2021-10-14 12:30:55 +03:00
Aliaksandr Valialkin
bee130cc78
lib/promscrape: return error if sample_limit or series_limit options are set when stream parsing mode is enabled 2021-10-14 12:30:54 +03:00