Commit graph

3498 commits

Author SHA1 Message Date
Roman Khavronenko
eff940aa76
Vmalert metrics update (#1580)
* vmalert: remove `vmalert_execution_duration_seconds` metric

The summary for `vmalert_execution_duration_seconds` metric gives no additional
value comparing to `vmalert_iteration_duration_seconds` metric.

* vmalert: update config reload success metric properly

Previously, if there was unsuccessfull attempt to reload config and then
rollback to previous version - the metric remained set to 0.

* vmalert: add Grafana dashboard to overview application metrics

* docker: include vmalert target into list for scraping

* vmalert: extend notifier metrics with addr label

The change adds an `addr` label to metrics for alerts_sent and alerts_send_errors
to identify which exact address is having issues.
The according change was made to vmalert dashboard.

* vmalert: update documentation and docker environment for vmalert's dashboard

Mention Grafana's dashboard in vmalert's README in a new section #Monitoring.

Update docker-compose env to automatically add vmalert's dashboard.
Update docker-compose README with additional info about services.
2021-08-31 12:28:02 +03:00
Aliaksandr Valialkin
f41b3d6118 vendor: make vendor-update 2021-08-31 12:03:21 +03:00
Aliaksandr Valialkin
6e085e6dac docs/Single-server-VictoriaMetrics.md: remove outdated link to VictoriaMetrics wiki
VictoriaMetrics wiki became outdated after publishing all the docs at https://docs.victoriametrics.com
2021-08-31 11:48:32 +03:00
Aliaksandr Valialkin
8b228a5873 docs/CHANGELOG.md: add a link to Prometheus staleness tracking 2021-08-31 11:48:32 +03:00
dependabot[bot]
6c388f63b3
build(deps): bump github.com/aws/aws-sdk-go from 1.40.30 to 1.40.33 (#1582)
Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.40.30 to 1.40.33.
- [Release notes](https://github.com/aws/aws-sdk-go/releases)
- [Changelog](https://github.com/aws/aws-sdk-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-go/compare/v1.40.30...v1.40.33)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-08-31 11:10:31 +03:00
dependabot[bot]
525e6ae1b8
build(deps): bump google.golang.org/api from 0.54.0 to 0.55.0 (#1583)
Bumps [google.golang.org/api](https://github.com/googleapis/google-api-go-client) from 0.54.0 to 0.55.0.
- [Release notes](https://github.com/googleapis/google-api-go-client/releases)
- [Changelog](https://github.com/googleapis/google-api-go-client/blob/master/CHANGES.md)
- [Commits](https://github.com/googleapis/google-api-go-client/compare/v0.54.0...v0.55.0)

---
updated-dependencies:
- dependency-name: google.golang.org/api
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-08-31 11:09:55 +03:00
dependabot[bot]
462fa70967
build(deps): bump github.com/klauspost/compress from 1.13.4 to 1.13.5 (#1584)
Bumps [github.com/klauspost/compress](https://github.com/klauspost/compress) from 1.13.4 to 1.13.5.
- [Release notes](https://github.com/klauspost/compress/releases)
- [Changelog](https://github.com/klauspost/compress/blob/master/.goreleaser.yml)
- [Commits](https://github.com/klauspost/compress/compare/v1.13.4...v1.13.5)

---
updated-dependencies:
- dependency-name: github.com/klauspost/compress
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-08-31 11:01:58 +03:00
Aliaksandr Valialkin
5c63d69454 lib/promscrape/discovery/kubernetes: return back support role: endpointslices, since it is used by VictoriaMetrics operator
This is a follow up commit after 31b42b30b6
2021-08-29 12:37:03 +03:00
Aliaksandr Valialkin
db330232ac lib/protoparser/opentsdb: follow-up after 8ee75ca45a 2021-08-29 11:49:21 +03:00
envzhu
8ee75ca45a
lib/protoparser/opentsdb: accept multiple spaces between fields in a row as a deliminator. (#1575) 2021-08-29 11:38:32 +03:00
Aliaksandr Valialkin
31b42b30b6 lib/promscrape/discovery/kubernetes: rename role: endpointslices to role: endpointslice to be consistent with Prometheus
See 2ec6c7dbb8/discovery/kubernetes/kubernetes.go (L99)
2021-08-29 11:23:08 +03:00
Aliaksandr Valialkin
2e001db4de lib/promscrape/discovery/kubernetes: use v1 API instead of v1beta1 API for role: ingress and role: endpointslices
This should fix service discovery for these roles in Kubernetes v1.22 and newer versions.
See https://kubernetes.io/docs/reference/using-api/deprecation-guide/#ingress-v122

The corresponding change in Prometheus - https://github.com/prometheus/prometheus/pull/9205
2021-08-29 11:16:59 +03:00
Aliaksandr Valialkin
189507d9d0 docs/Single-server-VictoriaMetrics.md: mention that downsampling doesnt improve query performance on high churn rate 2021-08-27 18:50:26 +03:00
Aliaksandr Valialkin
5ea689d61b app/vmselect/promql: add quantile("phiLabel", phi1, ..., phiN, q) aggregate function to MetricsQL
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1573
2021-08-27 18:37:20 +03:00
Aliaksandr Valialkin
bec18e4fe9 app/vmselect: add -search.disableAutoCacheReset command-line option for disabling automatic cache reset when a sample with old timestamp outside -search.cacheTimestampOffset is inserted
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1570
2021-08-27 17:15:31 +03:00
Aliaksandr Valialkin
67189be1cb docs/{vmgateway,vmbackupmanager}: mention that enterprise binaries are free for download and evaluation 2021-08-27 14:54:09 +03:00
Aliaksandr Valialkin
321da535fa docs: link to active time series, churn rate and high cardinality questions 2021-08-27 14:44:53 +03:00
Aliaksandr Valialkin
c8c153fb91 docs/CHANGELOG.md: document the bugfix for possible timeout error in vmbackupmanager when making snapshots
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1571
2021-08-27 13:04:06 +03:00
Aliaksandr Valialkin
2bc79042f6 docs: mention that enterprise binaries can be downloaded and evaluated for free 2021-08-27 12:48:14 +03:00
Aliaksandr Valialkin
4b3877b798 vendor: make vendor-update 2021-08-26 09:42:23 +03:00
Aliaksandr Valialkin
2bef940add docs/vmagent.md: document the ability to load scrape configs from multiple files
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1559
2021-08-26 09:13:14 +03:00
Aliaksandr Valialkin
10f960fa0c lib/promscrape: add ability to load scrape configs from multiple files
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1559
2021-08-26 08:51:16 +03:00
Aliaksandr Valialkin
25ee4a3644 vendor: make vendor-update 2021-08-25 13:41:02 +03:00
dependabot[bot]
66626db92f
build(deps): bump codecov/codecov-action from 2.0.2 to 2.0.3 (#1563)
Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 2.0.2 to 2.0.3.
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Changelog](https://github.com/codecov/codecov-action/blob/master/CHANGELOG.md)
- [Commits](https://github.com/codecov/codecov-action/compare/v2.0.2...v2.0.3)

---
updated-dependencies:
- dependency-name: codecov/codecov-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-08-25 13:40:21 +03:00
Aliaksandr Valialkin
e24203cde8 docs/CHANGELOG.md: document 48f33d098b 2021-08-25 13:31:35 +03:00
benclive
48f33d098b
Remove trailing slash for URLPrefixes with specific path (#1554) 2021-08-25 13:28:50 +03:00
Aliaksandr Valialkin
9fc9d76a7f docs/Cluster-VictoriaMetrics.md: mention that the -replicationFactor at vmselect is an optional parameter 2021-08-25 13:10:57 +03:00
Aliaksandr Valialkin
c27ee35c5c lib/promscrape: expose promscrape_discovery_http_errors_total metric for tracking errors per each http_sd config 2021-08-25 13:05:49 +03:00
Aliaksandr Valialkin
ffc0ab1774 lib/{mergeset,storage}: improve the detection of the needed free space for background merge
This should prevent from possible out of disk space crashes during big merges.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1560
2021-08-25 09:35:44 +03:00
Aliaksandr Valialkin
a287a48634 docs/FAQ.md: add more entries for frequently asked questions
The following topics are covered:

* Active time series
* High cardinality
* High churn rate
* Slow inserts
2021-08-24 11:34:44 +03:00
Aliaksandr Valialkin
8358890e33 docs/MetricsQL.md: typo fix: histogram_qunatile -> histogram_quantile 2021-08-23 23:08:16 +03:00
Aliaksandr Valialkin
6c5760db9c app/vmselect/promql: make fmt after 0078486ea7 2021-08-23 23:06:00 +03:00
Aliaksandr Valialkin
7c57745f40 docs/MetricsQL.md: fix the indentation for median function 2021-08-23 12:04:31 +03:00
Aliaksandr Valialkin
a78672f95a docs/MetricsQL.md: typo fix: convesions->conversions 2021-08-23 12:02:03 +03:00
Aliaksandr Valialkin
17a1241022 docs/MetricsQL.md: typo fixes 2021-08-23 11:59:12 +03:00
Aliaksandr Valialkin
60ac3e1e46 docs/MetricsQL.md: rehaul the documentation on MetricsQL
* Document all the functions supported by MetricsQL, including PromQL functions
* Group functions by their type: rollup functions, transform functions, label manipulation functions and aggregate functions.
* Document implicit query transformations.
2021-08-23 11:45:52 +03:00
Aliaksandr Valialkin
0078486ea7 app/vmselect/promql: rename sign() function to sgn() in order to be consistent with Prometheus
See https://github.com/prometheus/prometheus/pull/8457 for details.
2021-08-23 11:45:51 +03:00
Aliaksandr Valialkin
69c291353b deployment/docker: update Go builder from Go1.16.0 to Go1.17.0
This improves data ingestion and query performance by up to 5% according to benchmarks.

See https://go.dev/blog/go1.17
2021-08-21 22:20:49 +03:00
Aliaksandr Valialkin
d5622b32e2 lib/promscrape: reduce memory and CPU usage when Prometheus staleness tracking is enabled for metrics from deleted / disappeared scrape targets
Store the scraped response body instead of storing the parsed and relabeld metrics.
This should reduce memory usage, since the response body takes less memory than the parsed and relabeled metrics.
This is especially true for Kubernetes service discovery, which adds many long labels for all the scraped metrics.

This should also reduce CPU usage, since the marshaling of the parsed
and relabeld metrics has been substituted by response body copying.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1526
2021-08-21 21:17:26 +03:00
Aliaksandr Valialkin
2288e75f03 docs/vmalert.md: run make docs-sync after 9ee3d0378f 2021-08-21 20:24:56 +03:00
Aliaksandr Valialkin
6ffbb46aef docs/CHANGELOG.md: document 9ee3d0378f 2021-08-21 20:20:08 +03:00
Aliaksandr Valialkin
89a4e8fd9b vendor: make vendor-update 2021-08-21 20:16:19 +03:00
Roman Khavronenko
9ee3d0378f
vmalert: add flag disableAlertgroupLabel for disabling extra label added to series (#1534)
The new label added in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/611
may negatively impact deduplication in Alertmanager. The new flag supposed to give
an option to disable adding this label.

To enable flag just add `-disableAlertgroupLabel` to binary execution command.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1532
2021-08-21 20:08:55 +03:00
Aliaksandr Valialkin
4f3a5742eb app/vmselect/prometheus: do not extend [d] to the detected interval between samples for first_over_time(m[d])
This is for the sake of consistency with similar change for the last_over_time(m[d]) at a724229b5d
2021-08-21 19:56:14 +03:00
dependabot[bot]
41fdfdb895
build(deps): bump github.com/aws/aws-sdk-go from 1.40.25 to 1.40.26 (#1551)
Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.40.25 to 1.40.26.
- [Release notes](https://github.com/aws/aws-sdk-go/releases)
- [Changelog](https://github.com/aws/aws-sdk-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-go/compare/v1.40.25...v1.40.26)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-08-21 19:52:46 +03:00
Alexander Rickardsson
f4cecaf296
vmalert: accept http.StatusOK for remotewrite (#1550) 2021-08-20 11:58:32 +03:00
Aliaksandr Valialkin
f46a73dcdd lib/promscrape: use scrapeTimestamp when storing stale markers for failed scrape
This will make timestamps for stale markers more consistent for timestamps for other samples
2021-08-19 14:18:05 +03:00
Aliaksandr Valialkin
c14edc860b docs/CHANGELOG.md: document b5d6a0e499 2021-08-19 14:03:20 +03:00
Roman Khavronenko
b5d6a0e499
vmselect: update vm_request_duration_seconds value when request fails (#1537)
Before, metric `vm_request_duration_seconds` was update only on successful
attempts which could be misleading. For example, timeout errors on netstorage
request may be not accounted in the metric and won't be visible on dashboards.
Using `defer` statement to update the metric after query arguments validation
may improve the situation.
2021-08-19 13:58:54 +03:00
dependabot[bot]
cbab5f3b42
build(deps): bump github.com/aws/aws-sdk-go from 1.40.22 to 1.40.25 (#1548)
Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.40.22 to 1.40.25.
- [Release notes](https://github.com/aws/aws-sdk-go/releases)
- [Changelog](https://github.com/aws/aws-sdk-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-go/compare/v1.40.22...v1.40.25)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-08-19 13:56:42 +03:00