vmselect: add support of multi-tenant queries

Added  ability to query data across multiple tenants. See:
VictoriaMetrics/VictoriaMetrics#1434

Currently, the following endpoints work with multi-tenancy:
- /prometheus/api/v1/query
- /prometheus/api/v1/query_range
- /prometheus/api/v1/series
- /prometheus/api/v1/labels
- /prometheus/api/v1/label/<label_name>/values
- /prometheus/api/v1/status/active_queries
- /prometheus/api/v1/status/top_queries
- /prometheus/api/v1/status/tsdb
- /prometheus/api/v1/export
- /prometheus/api/v1/export/csv
- /vmui

A note regarding VMUI: endpoints such as `active_queries` and
`top_queries` have been updated to indicate whether query was a
single-tenant or multi-tenant, but UI needs to be updated to display
this info.
cc: @Loori-R

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Signed-off-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: f41gh7 <nik@victoriametrics.com>
This commit is contained in:
f41gh7 2024-10-01 16:49:46 +02:00
parent 496015aa0e
commit 076a1f84e1
No known key found for this signature in database
GPG key ID: 4558311CF775EC72
2 changed files with 58 additions and 1 deletions

View file

@ -113,8 +113,55 @@ such as [Graphite](https://docs.victoriametrics.com/#how-to-send-data-from-graph
[InfluxDB line protocol via TCP and UDP](https://docs.victoriametrics.com/#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf) and
[OpenTSDB telnet put protocol](https://docs.victoriametrics.com/#sending-data-via-telnet-put-protocol).
`vmselect` can execute queries over multiple [tenants](#multitenancy) via special `multitenant` endpoints `http://vmselect:8481/select/multitenant/<suffix>`.
Currently supported endpoints for `<suffix>` are:
- `/prometheus/api/v1/query`
- `/prometheus/api/v1/query_range`
- `/prometheus/api/v1/series`
- `/prometheus/api/v1/labels`
- `/prometheus/api/v1/label/<label_name>/values`
- `/prometheus/api/v1/status/active_queries`
- `/prometheus/api/v1/status/top_queries`
- `/prometheus/api/v1/status/tsdb`
- `/prometheus/api/v1/export`
- `/prometheus/api/v1/export/csv`
- `/vmui`
It is possible to explicitly specify `accountID` and `projectID` for querying multiple tenants via `vm_account_id` and `vm_project_id` labels in the query.
Alternatively, it is possible to use [`extra_filters[]` and `extra_label`](https://docs.victoriametrics.com/#prometheus-querying-api-enhancements)
query args to apply additional filters for the query.
For example, the following query fetches the total number of time series for the tenants `accountID=42` and `accountID=7, projectID=9`:
```
up{vm_account_id="7", vm_project_id="9" or vm_account_id="42"}
```
In order to achieve the same via `extra_filters[]` and `extra_label` query args, the following query must be used:
```
curl 'http://vmselect:8481/select/multitenant/prometheus/api/v1/query' \
-d 'query=up' \
-d 'extra_filters[]={vm_account_id="7",vm_project_id="9"}' \
-d 'extra_filters[]={vm_account_id="42"}'
```
The precedence for applying filters for tenants follows this order:
1. filters tenants from `extra_label` and `extra_filters` query arguments label selectors.
These filters have the highest priority and are applied first when provided through the query arguments.
2. filters tenants from labels selectors defined at metricsQL query expression.
Note that `vm_account_id` and `vm_project_id` labels support all operators for label matching. For example:
```
up{vm_account_id!="42"} # selects all the time series except those belonging to accountID=42
up{vm_account_id=~"4.*"} # selects all the time series belonging to accountIDs starting with 4
```
**Security considerations:** it is recommended restricting access to `multitenant` endpoints only to trusted sources,
since untrusted source may break per-tenant data by writing unwanted samples to arbitrary tenants.
since untrusted source may break per-tenant data by writing unwanted samples or get access to data of arbitrary tenants.
## Binaries
@ -706,6 +753,13 @@ Some workloads may need fine-grained resource usage limits. In these cases the f
Queries to this endpoint may take big amounts of CPU time and memory at `vmstorage` and `vmselect` when the database contains
big number of unique time series because of [high churn rate](https://docs.victoriametrics.com/faq/#what-is-high-churn-rate).
In this case it might be useful to set the `-search.maxSeries` to quite low value in order limit CPU and memory usage.
- `-search.maxDeleteSeries` at `vmselect` limits the number of unique time
series that can be deleted by a single
[/api/v1/admin/tsdb/delete_series](https://docs.victoriametrics.com/url-examples/#apiv1admintsdbdelete_series)
call. Deleting too many time series may require big amount of CPU and memory
at `vmstorage` and this limit guards against unplanned resource usage spikes.
Also see [How to delete time series](#how-to-delete-time-series) section to
learn about different ways of deleting series.
- `-search.maxTagKeys` at `vmstorage` limits the number of items, which may be returned from
[/api/v1/labels](https://docs.victoriametrics.com/url-examples/#apiv1labels). This endpoint is used mostly by Grafana
for auto-completion of label names. Queries to this endpoint may take big amounts of CPU time and memory at `vmstorage` and `vmselect`
@ -1589,6 +1643,8 @@ Below is the output for `/path/to/vmselect -help`:
-search.inmemoryBufSizeBytes size
Size for in-memory data blocks used during processing search requests. By default, the size is automatically calculated based on available memory. Adjust this flag value if you observe that vm_tmp_blocks_max_inmemory_file_size_bytes metric constantly shows much higher values than vm_tmp_blocks_inmemory_file_size_bytes. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6851
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-search.tenantCacheExpireDuration duration
The expiry duration for list of tenants for multi-tenant queries. (default 5m0s)
-search.treatDotsAsIsInRegexps
Whether to treat dots as is in regexp label filters used in queries. For example, foo{bar=~"a.b.c"} will be automatically converted to foo{bar=~"a\\.b\\.c"}, i.e. all the dots in regexp filters will be automatically escaped in order to match only dot char instead of matching any char. Dots in ".+", ".*" and ".{n}" regexps aren't escaped. This option is DEPRECATED in favor of {__graphite__="a.*.c"} syntax for selecting metrics matching the given Graphite metrics filter
-selectNode array

View file

@ -37,6 +37,7 @@ See also [LTS releases](https://docs.victoriametrics.com/lts-releases/).
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add service discovery support for [OVH Cloud VPS](https://www.ovhcloud.com/en/vps/) and [OVH Cloud dedicated server](https://ovhcloud.com/en/bare-metal/). See [these docs](https://docs.victoriametrics.com/sd_configs/#ovhcloud_sd_configs) and [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6071).
* FEATURE: [vmalert](https://docs.victoriametrics.com/vmalert): bump default values for sending data to `remoteWrite.url`: `remoteWrite.maxQueueSize` from `100_000` to `1_000_000`, `remoteWrite.maxBatchSize` from `1_000` to `10_000`, `remoteWrite.concurrency` from `1` to `4`. The new settings should improve remote write performance of vmalert with default settings.
* FEATURE: [vmbackupmanager](https://docs.victoriametrics.com/vmbackupmanager/): add API for creating/scheduling backups. See [documentation](https://docs.victoriametrics.com/vmbackupmanager/#api-methods)
* FEATURE: `vmselect` in [VictoriaMetrics cluster](https://docs.victoriametrics.com/cluster-victoriametrics/): add support of multi-tenant queries. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1434) for details.
* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent/) fix service discovery of Azure Virtual Machines for response contains `nextLink` in `Host:Port` format. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6912).
* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent/): properly consume messages [from kafka](https://docs.victoriametrics.com/vmagent/#kafka-integration). Previously vmagent could skip some messages during start-up.