Commit graph

1449 commits

Author SHA1 Message Date
Aliaksandr Valialkin
2bcb960f17
all: improve query tracing coverage for indexdb search
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1403
2022-06-09 20:07:07 +03:00
Howie
76f05f8670
feat: rule limit (#2676)
vmalert: support `limit` param in groups definition

`limit` param limits number of time series samples produced by a single rule
during execution.
On reaching the limit rule will return an err.

Signed-off-by: lihaowei <haoweili35@gmail.com>
2022-06-09 08:21:30 +02:00
Aliaksandr Valialkin
12ac255dae
lib/querytracer: make it easier to use by passing trace context message to New and NewChild
The context message can be extended by calling Donef.
If there is no need to extend the message, then just call Done.
2022-06-08 21:06:52 +03:00
Dmytro Kozlov
018d2303c4
Cardinality explorer (#2625)
* Cardinality explorer

* vmui, vmselect: updated field name, added description to spinner

* make vmui-update

* updated const name, make vmui-update

* lib/storage: changes calculation for totalSeries values

* added static files

* wip

* wip

* wip

* wip

* docs/CHANGELOG.md: document cardinality explorer feature

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2233

Co-authored-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-06-08 18:43:05 +03:00
Roman Khavronenko
63b538ecd1
vmagent: update SD duration histogram metric if SD is active (#2677)
The change updates histogram for registering SD update duration
only SD is considered as `active`. SD is active if at least
one scraper for this SD has started.

This change supposed to reduce metrics cardinality produced
by duration histogram which gets updated even if SD isn't configured.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2671

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-06-07 15:46:44 +03:00
Roman Khavronenko
1ee1e986da
lib/storage: limit max mergeConcurrency value for systems with high number of CPUs (#2673)
Workers count for merges affects the max part size during merges. Such behaviour
protects storage from running out of disk space for scenario when all workers
are merging parts with the max size.

This works very well for most cases. But for systems where high number of CPUs
is allocated for vmstorage components this could significantly impact the max
part size and result in more unmerged parts than expected.

While checking multiple production highly loaded setups it was discovered that
`max_over_time(vm_active_merges{type="storage/big}[1h]}"` rarely exceeds 2,
and `max_over_time(vm_active_merges{type="storage/small}[1h]}"` rarely exceeds 4.
The change in this commit limits the max value for concurrency accordingly.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-06-07 14:55:09 +03:00
Aliaksandr Valialkin
a5814fe16a
lib/promscrape/discovery/kubernetes: use unsupportedFieldError() function instead of errContext string
This improves code readability and maintainability a bit, since the format string
is passed as string literal into fmt.Errorf.
2022-06-07 01:22:07 +03:00
Aliaksandr Valialkin
8608dd093c
all: follow-up after 8edb390e21
- Remove unused js bloatware from /targets page. This strips down binary size by more than 100Kb
- Add /service-discovery page for API compatibility with Prometheus
- Properly load bootstrap.min.css from /prometheus/targets
- Serve static contents for /targets page from app/vminsert instead of app/vmselect, because /targets page is served from there
2022-06-07 00:57:09 +03:00
Aliaksandr Valialkin
6f0a0e3072
lib/promscrape/discovery/kubernetes: follow-up after 006b8c7534
- make more clear error logs
- simplify testing for newKubeConfig by passing only the path to kube_config file instead of SDConfig struct
2022-06-06 14:40:52 +03:00
Aliaksandr Valialkin
cfefdde042
lib/promauth: follow-up after 006b8c7534
- Take into account `ca`, `key` and `cert` values when generating string representation of TLSConfig.
  Print hashes instead of real values because of security considerations.
- Properly update Config.tlsCertDigets when `key` and `cert` values are set.
  This allows properly updating scrape targets after these values are updated in configs.
- Do not re-generate certificate from `key` and `cert` values per each call to getTLSCert,
  because these values are immutable.
- Do not set `ca` value from `ca_file` value, so it isn't exposed at `/config` page.
- Generate proper error messages on incorrect `key`, `cert` or `ca` values.
2022-06-04 01:01:16 +03:00
Aliaksandr Valialkin
0922ed2b7e
lib/promscrape: add -promscrape.cluster.name command-line flag
This flag is used for proper data de-duplication when the same target is scraped
from multiple vmagent clusters.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2679
2022-06-04 00:37:01 +03:00
Dmytro Kozlov
8edb390e21
lib/promscrape: adds service discovery visualization for /targets page(#2675)
* lib/promscrape: updated template

* lib/promscrape: fixed click on unhealthy and all btns

* app/vmselect: jquery scripts into static folder

Co-authored-by: f41gh7 <nik@victoriametrics.com>
2022-06-03 15:38:45 +02:00
Nikolay
a18914abee
lib/promscrape/discovery/kubernetes: follow-up after 0b5c874911 (#2672) 2022-06-01 20:44:45 +02:00
hadesy
006b8c7534
promscrape/discovery: support kubeconfig (#2533) 2022-06-01 20:34:00 +02:00
Aliaksandr Valialkin
ca689fec54
docs/CHANGELOG.md: follow-up after 2177089f94 2022-06-01 14:51:26 +03:00
Aliaksandr Valialkin
ea06d2fd3c
lib/storage: stop background merge when storage enters read-only mode
This should prevent from `no space left on device` errors when VictoriaMetrics
under-estimates the additional disk space needed for background merge.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2603
2022-06-01 14:36:45 +03:00
Roman Khavronenko
642eb1c534
lib/storage: make indexdb/tagFilters cache size configurable (#2667)
The default size of `indexdb/tagFilters` now can be overridden via
`storage.cacheSizeIndexDBTagFilters` flag.
Please, be careful with changing default size since it may
lead to inefficient work of the vmstorage or OOM exceptions.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2663
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Co-authored-by: Nikolay <nik@victoriametrics.com>
2022-06-01 10:07:53 +02:00
Roman Khavronenko
2177089f94
promrelabel: add support of lowercase and uppercase relabeling actions (#2665)
* promrelabel: add support of `lowercase` and `uppercase` relabeling actions

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2664
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/storage: make golangci-lint happy

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Co-authored-by: Nikolay <nik@victoriametrics.com>
2022-06-01 10:02:37 +02:00
Aliaksandr Valialkin
41958ed5dd
all: add initial support for query tracing
See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#query-tracing

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1403
2022-06-01 02:29:23 +03:00
Aliaksandr Valialkin
d2567ccdd6
lib/promscrape: use strconv.Atoi instead of strconv.ParseInt for parsing -promscrape.cluster.memberNum
In this case there is no need in converting int64 to int
2022-06-01 01:42:34 +03:00
Aliaksandr Valialkin
a1add5c2c7
lib/storage: make fmt 2022-05-31 12:54:37 +03:00
Aliaksandr Valialkin
bac75ea8a2
lib/storage: do not take into account series from the next day when match[] filter is passed to /api/v1/status/tsdb 2022-05-31 12:15:26 +03:00
Dmytro Kozlov
11f91532c5
issue-2594: use embedded for static files (#2650)
embed static js and css files from CDN into vmalert, vmagent and vmsingle binaries.

Co-authored-by: f41gh7 <nik@victoriametrics.com>

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2594
2022-05-31 01:55:28 +02:00
Dmytro Kozlov
1eb29794e6
removed redundant return (fixed linter) (#2647)
* removed redundant return

* updated lint package version
2022-05-26 16:24:01 +02:00
Aliaksandr Valialkin
796804e4b0
lib/promscrape: add -promscrape.suppressScrapeErrorsDelay command-line flag
This flag can be used for reducing the amounts of logs when scraping unreliable scrape targets.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2575

The patch is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2576 .
Thanks to @jelmd .
2022-05-25 22:59:36 +03:00
Aliaksandr Valialkin
f6d11a49aa
lib/storage: add ability to change the indexdb rotation time offset with -retentionTimezoneOffset command-line flag
This is a follow-up for 0fbf59199a

See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2574
2022-05-25 16:05:29 +03:00
阳明
0fbf59199a
lib/storage: Remove the effect of time zone on next retention period (#2568) (#2574) 2022-05-25 15:08:24 +03:00
Roman Khavronenko
d5eb6afe26
lib/promscrape/discovery/kubernetes: fixes kubernetes service discovery (#2615)
* lib/promscrape/discovery/kubernetes: properly updates discovered scrape works
previously, added or updated scrapeworks may override previuosly
discovered.
it happens because swosByKey may contain small subset of kubernetes
objects with it's labels.
It happens for objectsUpdated and objectsAdded maps, which include only changed elements

* Properly calculate vm_promscrape_discovery_kubernetes_scrape_works

Co-authored-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-05-21 01:01:37 +03:00
Boris Petersen
3df8caca15
Add ability to sign requests for all AWS services (#2604)
This adds the ability to utilize sigv4 signing for all AWS services not
just "aps". When the newly introduced property "service" is not set it
will default to "aps".

Signed-off-by: Boris Petersen <boris.petersen@idealo.de>
2022-05-18 14:58:31 +02:00
Aliaksandr Valialkin
a0727ab1b1
docs/vmagent.md: typo fix in the description for -promscrape.cluster.replicationFactor command-line flag 2022-05-12 18:50:29 +03:00
Aliaksandr Valialkin
9ea3f0c0d3
lib/awsapi: remove whitelist arg from GetFiltersQueryString(), since it may break new filters in the future
Let users decide which filters to use. If users start using disallowed filters, then AWS will return an error.
2022-05-09 15:33:22 +03:00
Aliaksandr Valialkin
123aa4c79e
lib/promscrape: properly implement ScrapeConfig.clone()
Previously ScrapeConfig.clone() was improperly copying promauth.Secret fields -
their contents was replaced with `<secret>` value.

This led to inability to use passwords and secrets in `-promscrape.config` file.
The bug has been introduced in v1.77.0 in the commit 67b10896d2

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2551
2022-05-07 00:05:40 +03:00
Aliaksandr Valialkin
1dc4cc243b
lib/promscrape: rename promscrape_stale_samples_created_total metric to vm_promscrape_stale_samples_created_total, so its name is consistent with the rest of vm_promscrape_ metrics 2022-05-06 15:33:13 +03:00
Aliaksandr Valialkin
d5b55fe22d
lib/promscrape/discovery/ec2: add ability to filter Availability Zones in ec2_sd_config via az_filters section 2022-05-06 12:43:29 +03:00
Aliaksandr Valialkin
97f9c2f667
lib/promscrape/discovery/ec2: properly pass filters to DescribeAvailabilityZones API call
Previously filters wheren't passed to this call after the commit 0e09fdb8b0

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1626
2022-05-05 11:00:23 +03:00
Aliaksandr Valialkin
d285c2fea7
lib/awsapi: pass filtersQueryString arg to GetEC2APIResponse() function, so the caller could decide whether to use the filters during the AWS API query
The filters shouldn't be passed to DescribeAvailabilityZones API call.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1626
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1287

Related commits:
0e09fdb8b0
d289ecded1
2022-05-05 10:29:34 +03:00
Dmytro Kozlov
7dd9f3b98e
{vmbackup, vmbackup/snapshot}: fixed problem with snapshot backup in another snapshot folder (#2535)
* {vmbackup, vmbackup/snapshot}: validate snapshot name

* vmbackup/snapshot: added another checks

* backup/actions: added check that we ignore backup_complete.ignore file

* vmbackup: moved snapshot to lib directory

* lib/snapshot: added functions description

* lib/snapshot: fixed typo

* vmbackup: code cleanup

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-05-04 22:12:03 +03:00
Nikolay
d289ecded1
{lib/promscrape,app/vmagent}: adds sigv4 support for vmagent remoteWrite (#2458)
* {lib/promscrape,app/vmagent}: adds sigv4 support for vmagent remoteWrite
moves aws related code into separate lib from lib/promscrape
it allows to write data from vmagent to the AWS managed prometheus (cortex)

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1287

* Apply suggestions from code review

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-05-04 20:24:19 +03:00
Nikolay
3575aabeaf
lib/promscrape: adds correct http status codes for redirect (#2530)
standard http client accepts multiple http status codes as redirect
it should fix issue with incorrect redirects
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2482
2022-05-03 13:31:31 +03:00
Aliaksandr Valialkin
0d86644d65
lib/storage: leave the last sample per each discrete interval during the deduplicaton
This aligns better with staleness logic in Prometheus - https://prometheus.io/docs/prometheus/latest/querying/basics/#staleness
2022-05-02 21:50:45 +03:00
Artem Navoiev
37cf509c3a
lib/{storage,flagutil} - Add option for snapshot autoremoval (#2487)
* lib/{storage,flagutil} - Add option for snapshot autoremoval

- add prometheus-like duration as command flag
- add option to delete stale snapshots
- update duration.go flag to re-use own code

* wip

* lib/flagutil: re-use Duration.Set() call in NewDuration

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-05-02 11:00:15 +03:00
Aliaksandr Valialkin
20bc2a2c44
lib/flagutil: re-use Duration.Set() call in NewDuration 2022-05-02 10:56:39 +03:00
Dima Lazerka
84683b8569
Fix targetstatus qtpl paths (#2517)
Ran `make quicktemplate-gen` from the root directory
2022-04-29 10:36:03 +03:00
Aliaksandr Valialkin
6be07e8c25
lib/promscrape/discovery/kubernetes: do not drop pod meta-labels even if the corresponding node objects are missing
This reflects the logic used in Prometheus.

See https://github.com/prometheus/prometheus/pull/10080
2022-04-26 15:26:01 +03:00
Aliaksandr Valialkin
9fe1bf5d53
lib/promauth: take into account tls_config and proxy_url when serializing OAuth2Config to string 2022-04-23 00:23:19 +03:00
Aliaksandr Valialkin
eb5d7ad089
lib/promauth: add support for min_version option at tls_config section in the same way as Prometheus does 2022-04-23 00:16:39 +03:00
Aliaksandr Valialkin
174431e31b
lib/promauth: add support for proxy_url option at oauth2 section in the same way as Prometheus does 2022-04-23 00:00:44 +03:00
Aliaksandr Valialkin
18b14aad8e
lib/promauth: add support for tls_config section at oauth2 config in the same way as Prometheus does 2022-04-22 23:51:07 +03:00
Aliaksandr Valialkin
6f79b2b68b
lib/promscrape/discovery/kubernetes: limit the minimum sleep time between updating dependent ScrapeWork objects
Previously the sleep time could be dropped to nanoseconds, which could result in CPU time waste
2022-04-22 23:14:17 +03:00
Aliaksandr Valialkin
15190fcdae
lib/promscrape/discovery/kubernetes: allow attaching node-level labels and annotations to discovered pod targets in the same way as Prometheus 2.35 does
See https://github.com/prometheus/prometheus/issues/9510
and https://github.com/prometheus/prometheus/pull/10080
2022-04-22 20:15:41 +03:00