lib/promscrape: add -promscrape.cluster.name command-line flag
This flag is used for proper data de-duplication when the same target is scraped from multiple vmagent clusters. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2679
This commit is contained in: parent ce8aade80e, commit 6c2fb9d8c4
5 changed files with 19 additions and 2 deletions
@@ -371,8 +371,12 @@ start a cluster of three `vmagent` instances, where each target is scraped by two
```

If each target is scraped by multiple `vmagent` instances, then data deduplication must be enabled at the remote storage pointed to by `-remoteWrite.url`.
The `-dedup.minScrapeInterval` must be set to the `scrape_interval` configured at `-promscrape.config`.
See [these docs](https://docs.victoriametrics.com/#deduplication) for details.

If multiple `vmagent` clusters scrape the same set of targets, then each cluster must have a unique value for the `-promscrape.cluster.name` command-line flag.
This is needed for proper data de-duplication. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2679) for details.

## Scraping targets via a proxy

`vmagent` supports scraping targets via http, https and socks5 proxies. The proxy address must be specified in the `proxy_url` option. For example, the following scrape config instructs
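As a rough illustration of what the deduplication requirement above means in practice, the sketch below keeps a single sample per `-dedup.minScrapeInterval` window out of the duplicates produced when two `vmagent` instances scrape the same target. This is only a minimal sketch, assuming a simplified `sample` type and a keep-the-latest-sample-per-window rule; it is not the VictoriaMetrics implementation.

```go
// dedup_sketch.go - illustrative only; not the VictoriaMetrics implementation.
package main

import "fmt"

// sample is a hypothetical raw sample: a timestamp in milliseconds and a value.
type sample struct {
	timestampMs int64
	value       float64
}

// deduplicate keeps a single sample per minScrapeIntervalMs window,
// preferring the latest sample inside each window.
// The input must be sorted by timestamp.
func deduplicate(samples []sample, minScrapeIntervalMs int64) []sample {
	var out []sample
	for _, s := range samples {
		window := s.timestampMs / minScrapeIntervalMs
		if len(out) > 0 && out[len(out)-1].timestampMs/minScrapeIntervalMs == window {
			// Same window: replace the previously kept sample with the later one.
			out[len(out)-1] = s
			continue
		}
		out = append(out, s)
	}
	return out
}

func main() {
	// Two vmagent instances scrape the same target every 30s, so the storage
	// receives two nearly identical samples per scrape interval.
	samples := []sample{
		{timestampMs: 1000, value: 1}, {timestampMs: 1500, value: 1},
		{timestampMs: 31000, value: 2}, {timestampMs: 31500, value: 2},
	}
	fmt.Println(deduplicate(samples, 30000)) // one sample per 30s window
}
```

In the real setup the deduplication happens inside the remote storage configured via `-remoteWrite.url`; the sketch only shows why `-dedup.minScrapeInterval` should match the `scrape_interval`, so that all duplicates of one scrape fall into the same window.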
@@ -25,6 +25,7 @@ The following tip changes can be tested by building VictoriaMetrics components f

* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): expose `/api/v1/status/config` endpoint in the same way as Prometheus does. See [these docs](https://prometheus.io/docs/prometheus/latest/querying/api/#config).
* FEATURE: add ability to change the `indexdb` rotation timezone offset via `-retentionTimezoneOffset` command-line flag. Previously it was performed at 4am UTC time. This could lead to performance degradation in the middle of the day when VictoriaMetrics runs in time zones located too far from UTC. Thanks to @cnych for [the pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2574).
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add `-promscrape.suppressScrapeErrorsDelay` command-line flag, which can be used for delaying and aggregating the logging of per-target scrape errors. This may reduce the amount of logs when `vmagent` scrapes many unreliable targets. See [this feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2575). Thanks to @jelmd for [the initial implementation](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2576).
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add `-promscrape.cluster.name` command-line flag, which allows proper data de-duplication when the same target is scraped from multiple [vmagent clusters](https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets). See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2679).
* BUGFIX: [vmalert](https://docs.victoriametrics.com/vmalert.html): properly apply `alert_relabel_configs` relabeling rules to `-notifier.config` according to [these docs](https://docs.victoriametrics.com/vmalert.html#notifier-configuration-file). Thanks to @spectvtor for [the bugfix](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2633).
* BUGFIX: deny [background merge](https://valyala.medium.com/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282) when the storage enters read-only mode, e.g. when free disk space becomes lower than `-storage.minFreeDiskSpaceBytes`. Background merge needs additional disk space, so it could result in `no space left on device` errors. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2603).
@@ -375,8 +375,12 @@ start a cluster of three `vmagent` instances, where each target is scraped by two
```

If each target is scraped by multiple `vmagent` instances, then data deduplication must be enabled at the remote storage pointed to by `-remoteWrite.url`.
The `-dedup.minScrapeInterval` must be set to the `scrape_interval` configured at `-promscrape.config`.
See [these docs](https://docs.victoriametrics.com/#deduplication) for details.

If multiple `vmagent` clusters scrape the same set of targets, then each cluster must have a unique value for the `-promscrape.cluster.name` command-line flag.
This is needed for proper data de-duplication. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2679) for details.

## Scraping targets via a proxy

`vmagent` supports scraping targets via http, https and socks5 proxies. The proxy address must be specified in the `proxy_url` option. For example, the following scrape config instructs
@@ -56,6 +56,9 @@ var (
        "Can be specified as pod name of Kubernetes StatefulSet - pod-name-Num, where Num is a numeric part of pod name")
    clusterReplicationFactor = flag.Int("promscrape.cluster.replicationFactor", 1, "The number of members in the cluster, which scrape the same targets. "+
        "If the replication factor is greater than 1, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication")
    clusterName = flag.String("promscrape.cluster.name", "", "Optional name of the cluster. If multiple vmagent clusters scrape the same targets, "+
        "then each cluster must have unique name in order to properly de-duplicate samples received from these clusters. "+
        "See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2679")
)

var clusterMemberID int
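The `pod-name-Num` convention mentioned in the flag description above suggests how a member ID can be derived when `vmagent` runs as a Kubernetes StatefulSet. The sketch below is a hypothetical helper, not vmagent's actual parsing code: the function name, error handling, and example pod names are assumptions made for illustration.

```go
// member_id_sketch.go - illustrative only; not vmagent's actual parsing code.
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMemberID accepts either a plain number ("2") or a StatefulSet pod name
// ("vmagent-2") and returns the numeric member ID taken from the trailing part.
func parseMemberID(s string) (int, error) {
	if n, err := strconv.Atoi(s); err == nil {
		return n, nil
	}
	i := strings.LastIndexByte(s, '-')
	if i < 0 {
		return 0, fmt.Errorf("cannot extract numeric member ID from %q", s)
	}
	return strconv.Atoi(s[i+1:])
}

func main() {
	for _, v := range []string{"2", "vmagent-0", "vmagent-12"} {
		id, err := parseMemberID(v)
		fmt.Println(v, "=>", id, err)
	}
}
```

With such a scheme each StatefulSet replica can pass its own pod name and end up with a distinct member ID, which is what the cluster scraping mode relies on.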
@@ -282,11 +282,16 @@ func (sw *scrapeWork) run(stopCh <-chan struct{}, globalStopCh <-chan struct{})
    // This also makes consistent scrape times across restarts
    // for a target with the same ScrapeURL and labels.
    //
    // Include clusterMemberNum to the key in order to guarantee that each member in vmagent cluster
    // Include clusterName to the key in order to guarantee that the same
    // scrape target is scraped at different offsets per each cluster.
    // This guarantees that the deduplication consistently leaves samples received from the same vmagent.
    // See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2679
    //
    // Include clusterMemberID to the key in order to guarantee that each member in vmagent cluster
    // scrapes replicated targets at different time offsets. This guarantees that the deduplication consistently leaves samples
    // received from the same vmagent replica.
    // See https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets
    key := fmt.Sprintf("ClusterMemberNum=%d, ScrapeURL=%s, Labels=%s", clusterMemberID, sw.Config.ScrapeURL, sw.Config.LabelsString())
    key := fmt.Sprintf("clusterName=%s, clusterMemberID=%d, ScrapeURL=%s, Labels=%s", *clusterName, clusterMemberID, sw.Config.ScrapeURL, sw.Config.LabelsString())
    h := xxhash.Sum64(bytesutil.ToUnsafeBytes(key))
    randSleep = uint64(float64(scrapeInterval) * (float64(h) / (1 << 64)))
    sleepOffset := uint64(time.Now().UnixNano()) % uint64(scrapeInterval)
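To see the effect of the new `key` (the second `fmt.Sprintf` line above, which supersedes the old `ClusterMemberNum=...` key), the standalone sketch below reproduces the offset formula from the hunk: the same target receives a stable but different start offset in every named cluster, which is what lets deduplication consistently keep samples from one cluster. The target URL, labels, and cluster names are made up for illustration, and the sketch hashes a plain `[]byte` instead of using vmagent's `bytesutil.ToUnsafeBytes`.

```go
// offset_sketch.go - standalone illustration of the offset formula shown above;
// the target URL, labels, and cluster names are invented for the example.
package main

import (
	"fmt"
	"time"

	"github.com/cespare/xxhash/v2"
)

// scrapeOffset maps the scrape key onto [0, scrapeInterval) using the same
// formula as the randSleep computation in the diff above.
func scrapeOffset(clusterName string, clusterMemberID int, scrapeURL, labels string, scrapeInterval time.Duration) time.Duration {
	key := fmt.Sprintf("clusterName=%s, clusterMemberID=%d, ScrapeURL=%s, Labels=%s", clusterName, clusterMemberID, scrapeURL, labels)
	h := xxhash.Sum64([]byte(key))
	return time.Duration(float64(scrapeInterval) * (float64(h) / (1 << 64)))
}

func main() {
	const interval = 30 * time.Second
	url := "http://node-exporter:9100/metrics"
	labels := `{job="node", instance="node-exporter:9100"}`
	for _, cluster := range []string{"", "dc-east", "dc-west"} {
		fmt.Printf("cluster %-10q member 0 scrapes at offset %v\n", cluster, scrapeOffset(cluster, 0, url, labels, interval))
	}
}
```

Running it with an empty cluster name and with two distinct names typically prints three different offsets within the same 30s scrape interval, since distinct keys hash to distinct values with overwhelming probability.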