From 076a1f84e1893f28ce3dc605bc0738a149359cf1 Mon Sep 17 00:00:00 2001 From: f41gh7 Date: Tue, 1 Oct 2024 16:49:46 +0200 Subject: [PATCH] vmselect: add support of multi-tenant queries Added ability to query data across multiple tenants. See: VictoriaMetrics/VictoriaMetrics#1434 Currently, the following endpoints work with multi-tenancy: - /prometheus/api/v1/query - /prometheus/api/v1/query_range - /prometheus/api/v1/series - /prometheus/api/v1/labels - /prometheus/api/v1/label//values - /prometheus/api/v1/status/active_queries - /prometheus/api/v1/status/top_queries - /prometheus/api/v1/status/tsdb - /prometheus/api/v1/export - /prometheus/api/v1/export/csv - /vmui A note regarding VMUI: endpoints such as `active_queries` and `top_queries` have been updated to indicate whether query was a single-tenant or multi-tenant, but UI needs to be updated to display this info. cc: @Loori-R --------- Signed-off-by: Zakhar Bessarab Signed-off-by: f41gh7 Co-authored-by: f41gh7 --- docs/Cluster-VictoriaMetrics.md | 58 ++++++++++++++++++++++++++++++++- docs/changelog/CHANGELOG.md | 1 + 2 files changed, 58 insertions(+), 1 deletion(-) diff --git a/docs/Cluster-VictoriaMetrics.md b/docs/Cluster-VictoriaMetrics.md index e843d051f..2c882faa6 100644 --- a/docs/Cluster-VictoriaMetrics.md +++ b/docs/Cluster-VictoriaMetrics.md @@ -113,8 +113,55 @@ such as [Graphite](https://docs.victoriametrics.com/#how-to-send-data-from-graph [InfluxDB line protocol via TCP and UDP](https://docs.victoriametrics.com/#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf) and [OpenTSDB telnet put protocol](https://docs.victoriametrics.com/#sending-data-via-telnet-put-protocol). + +`vmselect` can execute queries over multiple [tenants](#multitenancy) via special `multitenant` endpoints `http://vmselect:8481/select/multitenant/`. +Currently supported endpoints for `` are: +- `/prometheus/api/v1/query` +- `/prometheus/api/v1/query_range` +- `/prometheus/api/v1/series` +- `/prometheus/api/v1/labels` +- `/prometheus/api/v1/label//values` +- `/prometheus/api/v1/status/active_queries` +- `/prometheus/api/v1/status/top_queries` +- `/prometheus/api/v1/status/tsdb` +- `/prometheus/api/v1/export` +- `/prometheus/api/v1/export/csv` +- `/vmui` + +It is possible to explicitly specify `accountID` and `projectID` for querying multiple tenants via `vm_account_id` and `vm_project_id` labels in the query. +Alternatively, it is possible to use [`extra_filters[]` and `extra_label`](https://docs.victoriametrics.com/#prometheus-querying-api-enhancements) +query args to apply additional filters for the query. + +For example, the following query fetches the total number of time series for the tenants `accountID=42` and `accountID=7, projectID=9`: +``` +up{vm_account_id="7", vm_project_id="9" or vm_account_id="42"} +``` + +In order to achieve the same via `extra_filters[]` and `extra_label` query args, the following query must be used: +``` +curl 'http://vmselect:8481/select/multitenant/prometheus/api/v1/query' \ + -d 'query=up' \ + -d 'extra_filters[]={vm_account_id="7",vm_project_id="9"}' \ + -d 'extra_filters[]={vm_account_id="42"}' +``` + +The precedence for applying filters for tenants follows this order: + +1. filters tenants from `extra_label` and `extra_filters` query arguments label selectors. + These filters have the highest priority and are applied first when provided through the query arguments. + +2. filters tenants from labels selectors defined at metricsQL query expression. + + + +Note that `vm_account_id` and `vm_project_id` labels support all operators for label matching. For example: +``` +up{vm_account_id!="42"} # selects all the time series except those belonging to accountID=42 +up{vm_account_id=~"4.*"} # selects all the time series belonging to accountIDs starting with 4 +``` + **Security considerations:** it is recommended restricting access to `multitenant` endpoints only to trusted sources, -since untrusted source may break per-tenant data by writing unwanted samples to arbitrary tenants. +since untrusted source may break per-tenant data by writing unwanted samples or get access to data of arbitrary tenants. ## Binaries @@ -706,6 +753,13 @@ Some workloads may need fine-grained resource usage limits. In these cases the f Queries to this endpoint may take big amounts of CPU time and memory at `vmstorage` and `vmselect` when the database contains big number of unique time series because of [high churn rate](https://docs.victoriametrics.com/faq/#what-is-high-churn-rate). In this case it might be useful to set the `-search.maxSeries` to quite low value in order limit CPU and memory usage. +- `-search.maxDeleteSeries` at `vmselect` limits the number of unique time + series that can be deleted by a single + [/api/v1/admin/tsdb/delete_series](https://docs.victoriametrics.com/url-examples/#apiv1admintsdbdelete_series) + call. Deleting too many time series may require big amount of CPU and memory + at `vmstorage` and this limit guards against unplanned resource usage spikes. + Also see [How to delete time series](#how-to-delete-time-series) section to + learn about different ways of deleting series. - `-search.maxTagKeys` at `vmstorage` limits the number of items, which may be returned from [/api/v1/labels](https://docs.victoriametrics.com/url-examples/#apiv1labels). This endpoint is used mostly by Grafana for auto-completion of label names. Queries to this endpoint may take big amounts of CPU time and memory at `vmstorage` and `vmselect` @@ -1589,6 +1643,8 @@ Below is the output for `/path/to/vmselect -help`: -search.inmemoryBufSizeBytes size Size for in-memory data blocks used during processing search requests. By default, the size is automatically calculated based on available memory. Adjust this flag value if you observe that vm_tmp_blocks_max_inmemory_file_size_bytes metric constantly shows much higher values than vm_tmp_blocks_inmemory_file_size_bytes. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6851 Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0) + -search.tenantCacheExpireDuration duration + The expiry duration for list of tenants for multi-tenant queries. (default 5m0s) -search.treatDotsAsIsInRegexps Whether to treat dots as is in regexp label filters used in queries. For example, foo{bar=~"a.b.c"} will be automatically converted to foo{bar=~"a\\.b\\.c"}, i.e. all the dots in regexp filters will be automatically escaped in order to match only dot char instead of matching any char. Dots in ".+", ".*" and ".{n}" regexps aren't escaped. This option is DEPRECATED in favor of {__graphite__="a.*.c"} syntax for selecting metrics matching the given Graphite metrics filter -selectNode array diff --git a/docs/changelog/CHANGELOG.md b/docs/changelog/CHANGELOG.md index 953c14894..17c03b068 100644 --- a/docs/changelog/CHANGELOG.md +++ b/docs/changelog/CHANGELOG.md @@ -37,6 +37,7 @@ See also [LTS releases](https://docs.victoriametrics.com/lts-releases/). * FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add service discovery support for [OVH Cloud VPS](https://www.ovhcloud.com/en/vps/) and [OVH Cloud dedicated server](https://ovhcloud.com/en/bare-metal/). See [these docs](https://docs.victoriametrics.com/sd_configs/#ovhcloud_sd_configs) and [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6071). * FEATURE: [vmalert](https://docs.victoriametrics.com/vmalert): bump default values for sending data to `remoteWrite.url`: `remoteWrite.maxQueueSize` from `100_000` to `1_000_000`, `remoteWrite.maxBatchSize` from `1_000` to `10_000`, `remoteWrite.concurrency` from `1` to `4`. The new settings should improve remote write performance of vmalert with default settings. * FEATURE: [vmbackupmanager](https://docs.victoriametrics.com/vmbackupmanager/): add API for creating/scheduling backups. See [documentation](https://docs.victoriametrics.com/vmbackupmanager/#api-methods) +* FEATURE: `vmselect` in [VictoriaMetrics cluster](https://docs.victoriametrics.com/cluster-victoriametrics/): add support of multi-tenant queries. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1434) for details. * BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent/) fix service discovery of Azure Virtual Machines for response contains `nextLink` in `Host:Port` format. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6912). * BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent/): properly consume messages [from kafka](https://docs.victoriametrics.com/vmagent/#kafka-integration). Previously vmagent could skip some messages during start-up.