Merge branch 'public-single-node' into pmm-6401-read-prometheus-data-files

Aliaksandr Valialkin 2023-07-26 14:59:49 -07:00
commit 9f1e9c54c8
No known key found for this signature in database
GPG key ID: A72BEC6CD3D0DED1
976 changed files with 72419 additions and 14329 deletions

View file

@ -6,16 +6,12 @@ updates:
interval: "daily"
- package-ecosystem: "gomod"
directory: "/"
schedule:
interval: "weekly"
open-pull-requests-limit: 0
- package-ecosystem: "bundler"
directory: "/docs"
schedule:
interval: "daily"
open-pull-requests-limit: 0
- package-ecosystem: "gomod"
directory: "/app/vmui/packages/vmui/web"
schedule:
interval: "weekly"
open-pull-requests-limit: 0
- package-ecosystem: "docker"
directory: "/"
@ -23,6 +19,4 @@ updates:
interval: "daily"
- package-ecosystem: "npm"
directory: "/app/vmui/packages/vmui"
schedule:
interval: "weekly"
open-pull-requests-limit: 0

View file

@ -17,7 +17,7 @@ jobs:
- name: Setup Go
uses: actions/setup-go@main
with:
go-version: 1.20.5
go-version: 1.20.6
id: go
- name: Code checkout
uses: actions/checkout@master

View file

@ -57,7 +57,7 @@ jobs:
- name: Set up Go
uses: actions/setup-go@v4
with:
go-version: 1.20.5
go-version: 1.20.6
check-latest: true
cache: true
if: ${{ matrix.language == 'go' }}

View file

@ -32,7 +32,7 @@ jobs:
- name: Setup Go
uses: actions/setup-go@v4
with:
go-version: 1.20.5
go-version: 1.20.6
check-latest: true
cache: true
@ -56,7 +56,7 @@ jobs:
- name: Setup Go
uses: actions/setup-go@v4
with:
go-version: 1.20.5
go-version: 1.20.6
check-latest: true
cache: true
@ -81,7 +81,7 @@ jobs:
id: go
uses: actions/setup-go@v4
with:
go-version: 1.20.5
go-version: 1.20.6
check-latest: true
cache: true

53
.github/workflows/sync-docs.yml vendored Normal file
View file

@ -0,0 +1,53 @@
name: publish-docs
on:
push:
branches:
- 'master'
paths:
- 'docs/**'
workflow_dispatch: {}
permissions:
contents: read # This is required for actions/checkout and to commit back image update
deployments: write
jobs:
build:
name: Build
runs-on: ubuntu-latest
steps:
- name: Code checkout
uses: actions/checkout@v3
with:
path: main
- name: Checkout private code
uses: actions/checkout@v3
with:
repository: VictoriaMetrics/vmdocs
token: ${{ secrets.VM_BOT_GH_TOKEN }}
path: docs
- name: Import GPG key
uses: crazy-max/ghaction-import-gpg@v5
with:
gpg_private_key: ${{ secrets.VM_BOT_GPG_PRIVATE_KEY }}
passphrase: ${{ secrets.VM_BOT_PASSPHRASE }}
git_user_signingkey: true
git_commit_gpgsign: true
workdir: docs
- name: Set short git commit SHA
id: vars
run: |
calculatedSha=$(git rev-parse --short ${{ github.sha }})
echo "short_sha=$calculatedSha" >> $GITHUB_OUTPUT
working-directory: main
- name: update code and commit
run: |
rm -rf content
cp -r ../main/docs content
make clean-after-copy
git config --global user.name "${{ steps.import-gpg.outputs.email }}"
git config --global user.email "${{ steps.import-gpg.outputs.email }}"
git add .
git commit -S -m "sync docs with VictoriaMetrics/VictoriaMetrics commit: ${{ steps.vars.outputs.short_sha }}"
git push
working-directory: docs

View file

@ -8,11 +8,14 @@ jobs:
deploy-sandbox:
runs-on: ubuntu-latest
steps:
- name: check inputs
if: github.event.release.tag_name == ''
run: exit 1
- name: Check out code
uses: actions/checkout@v3
with:
repository: VictoriaMetrics/ops
ref: master
token: ${{ secrets.VM_BOT_GH_TOKEN }}
- name: Import GPG key
@ -74,4 +77,4 @@ jobs:
Release [${{ github.event.release.tag_name }}](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/${{ github.event.release.tag_name }}) to sandbox
> Auto-generated by `Github Actions Bot`

3
.gitignore vendored
View file

@ -8,6 +8,7 @@
*.test
*.swp
/gocache-for-docker
/victoria-logs-data
/victoria-metrics-data
/vmagent-remotewrite-data
/vmstorage-data
@ -20,4 +21,4 @@
Gemfile.lock
/_site
_site
*.tmp
*.tmp

View file

@ -316,6 +316,7 @@ The UI allows exploring query results via graphs and tables. It also provides th
- [Metrics explorer](#metrics-explorer) - automatically builds graphs for selected metrics;
- [Cardinality explorer](#cardinality-explorer) - stats about existing metrics in TSDB;
- [Top queries](#top-queries) - shows most frequently executed queries;
- [Active queries](#active-queries) - shows currently executing queries;
- Tools:
- [Trace analyzer](#query-tracing) - playground for loading query traces in JSON format;
- [WITH expressions playground](https://play.victoriametrics.com/select/accounting/1/6a716b0f-38bc-4856-90ce-448fd713e3fe/prometheus/graph/#/expand-with-exprs) - test how WITH expressions work;
@ -360,15 +361,24 @@ See the [example VMUI at VictoriaMetrics playground](https://play.victoriametric
* queries with the biggest average execution duration;
* queries that took the most summary time for execution.
## Active queries
[VMUI](#vmui) provides the `active queries` tab, which shows currently executing queries.
It provides the following information for each query:
- The query itself, together with the time range and step args passed to [/api/v1/query_range](https://docs.victoriametrics.com/keyConcepts.html#range-query).
- The duration of the query execution.
- The address of the client that initiated the query execution.
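The same data can also be fetched programmatically. Below is a minimal sketch, assuming the tab is backed by a `/api/v1/status/active_queries` endpoint on the default `:8428` address; both the path and the address are assumptions and may need adjusting for your deployment.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Assumed endpoint backing the `active queries` tab; adjust for your deployment.
	resp, err := http.Get("http://localhost:8428/api/v1/status/active_queries")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	// JSON describing each currently executing query: the query text,
	// time range, step, execution duration and client address.
	fmt.Println(string(body))
}
```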
## Metrics explorer
[VMUI](#vmui) provides an ability to explore metrics exported by a particular `job` / `instance` in the following way:
1. Open the `vmui` at `http://victoriametrics:8428/vmui/`.
2. Click the `Explore metrics` tab.
3. Select the `job` you want to explore.
4. Optionally select the `instance` for the selected job to explore.
5. Select metrics you want to explore and compare.
1. Click the `Explore metrics` tab.
1. Select the `job` you want to explore.
1. Optionally select the `instance` for the selected job to explore.
1. Select metrics you want to explore and compare.
It is possible to change the selected time range for the graphs in the top right corner.
@ -391,6 +401,11 @@ matching the specified [series selector](https://prometheus.io/docs/prometheus/l
Cardinality explorer is built on top of [/api/v1/status/tsdb](#tsdb-stats).
In [cluster version of VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html) each vmstorage tracks the stored time series individually.
vmselect requests stats via [/api/v1/status/tsdb](#tsdb-stats) API from each vmstorage node and merges the results by summing per-series stats.
This may lead to inflated values when samples for the same time series are spread across multiple vmstorage nodes
due to [replication](#replication) or [rerouting](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html?highlight=re-routes#cluster-availability).
See [cardinality explorer playground](https://play.victoriametrics.com/select/accounting/1/6a716b0f-38bc-4856-90ce-448fd713e3fe/prometheus/graph/#/cardinality).
See the example of using the cardinality explorer [here](https://victoriametrics.com/blog/cardinality-explorer/).
@ -406,7 +421,7 @@ Prometheus doesn't drop data during VictoriaMetrics restart. See [this article](
## How to scrape Prometheus exporters such as [node-exporter](https://github.com/prometheus/node_exporter)
VictoriaMetrics can be used as drop-in replacement for Prometheus for scraping targets configured in `prometheus.yml` config file according to [the specification](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#configuration-file). Just set `-promscrape.config` command-line flag to the path to `prometheus.yml` config - and VictoriaMetrics should start scraping the configured targets.
VictoriaMetrics can be used as drop-in replacement for Prometheus for scraping targets configured in `prometheus.yml` config file according to [the specification](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#configuration-file). Just set `-promscrape.config` command-line flag to the path to `prometheus.yml` config - and VictoriaMetrics should start scraping the configured targets. If the provided configuration file contains [unsupported options](https://docs.victoriametrics.com/vmagent.html#unsupported-prometheus-config-sections), then either delete them from the file or just pass `-promscrape.config.strictParse=false` command-line flag to VictoriaMetrics, so it will ignore unsupported options.
The file pointed by `-promscrape.config` may contain `%{ENV_VAR}` placeholders, which are substituted by the corresponding `ENV_VAR` environment variable values.
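As an illustration of that substitution (this is not VictoriaMetrics' actual implementation, just a regexp-based sketch with a hypothetical `NODE_EXPORTER_ADDR` variable):

```go
package main

import (
	"fmt"
	"os"
	"regexp"
)

// envVarRe matches %{ENV_VAR} placeholders as described above.
var envVarRe = regexp.MustCompile(`%\{([^}]+)\}`)

// expandEnvPlaceholders replaces each %{NAME} placeholder with the value of the NAME environment variable.
func expandEnvPlaceholders(config string) string {
	return envVarRe.ReplaceAllStringFunc(config, func(m string) string {
		name := envVarRe.FindStringSubmatch(m)[1]
		return os.Getenv(name)
	})
}

func main() {
	os.Setenv("NODE_EXPORTER_ADDR", "node-exporter:9100") // hypothetical variable for the example
	cfg := "static_configs:\n  - targets: ['%{NODE_EXPORTER_ADDR}']\n"
	fmt.Print(expandEnvPlaceholders(cfg))
}
```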
@ -443,7 +458,7 @@ DD_DD_URL=http://victoriametrics:8428/datadog
_Choose correct URL for VictoriaMetrics [here](https://docs.victoriametrics.com/url-examples.html#datadog)._
To configure DataDog agent via [configuration file](https://docs.datadoghq.com/agent/guide/agent-configuration-files)
To configure DataDog agent via [configuration file](https://github.com/DataDog/datadog-agent/blob/878600ef7a55c5ef0efb41ed0915f020cf7e3bd0/pkg/config/config_template.yaml#L33)
add the following line:
<div class="with-copy" markdown="1">
@ -471,7 +486,8 @@ Run DataDog using the following ENV variable with VictoriaMetrics as additional
<div class="with-copy" markdown="1">
```
DD_ADDITIONAL_ENDPOINTS='{\"http://victoriametrics:8428/datadog\"}'
DD_ADDITIONAL_ENDPOINTS='{\"http://victoriametrics:8428/datadog\": [\"apikey\"]}'
```
</div>
@ -485,7 +501,9 @@ add the following line:
<div class="with-copy" markdown="1">
```
additional_endpoints: http://victoriametrics:8428/datadog
additional_endpoints:
"http://victoriametrics:8428/datadog":
- apikey
```
</div>
@ -895,13 +913,13 @@ to your needs or when testing bugfixes.
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.
2. Run `make victoria-metrics` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
1. Run `make victoria-metrics` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `victoria-metrics` binary and puts it into the `bin` folder.
### Production build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make victoria-metrics-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
1. Run `make victoria-metrics-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `victoria-metrics-prod` binary and puts it into the `bin` folder.
### ARM build
@ -911,13 +929,13 @@ ARM build may run on Raspberry Pi or on [energy-efficient ARM servers](https://b
### Development ARM build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.
2. Run `make victoria-metrics-linux-arm` or `make victoria-metrics-linux-arm64` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
1. Run `make victoria-metrics-linux-arm` or `make victoria-metrics-linux-arm64` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `victoria-metrics-linux-arm` or `victoria-metrics-linux-arm64` binary respectively and puts it into the `bin` folder.
### Production ARM build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make victoria-metrics-linux-arm-prod` or `make victoria-metrics-linux-arm64-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
1. Run `make victoria-metrics-linux-arm-prod` or `make victoria-metrics-linux-arm64-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `victoria-metrics-linux-arm-prod` or `victoria-metrics-linux-arm64-prod` binary respectively and puts it into the `bin` folder.
### Pure Go build (CGO_ENABLED=0)
@ -925,7 +943,7 @@ ARM build may run on Raspberry Pi or on [energy-efficient ARM servers](https://b
`Pure Go` mode builds only Go code without [cgo](https://golang.org/cmd/cgo/) dependencies.
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.
2. Run `make victoria-metrics-pure` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
1. Run `make victoria-metrics-pure` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `victoria-metrics-pure` binary and puts it into the `bin` folder.
### Building docker images
@ -978,9 +996,9 @@ Navigate to `http://<victoriametrics-addr>:8428/snapshot/delete_all` in order to
Steps for restoring from a snapshot:
1. Stop VictoriaMetrics with `kill -INT`.
2. Restore snapshot contents from backup with [vmrestore](https://docs.victoriametrics.com/vmrestore.html)
1. Restore snapshot contents from backup with [vmrestore](https://docs.victoriametrics.com/vmrestore.html)
to the directory pointed by `-storageDataPath`.
3. Start VictoriaMetrics.
1. Start VictoriaMetrics.
## How to delete time series
@ -1658,7 +1676,8 @@ Retention filters can be evaluated for free by downloading and using enterprise
* `-downsampling.period=30d:5m,180d:1h` instructs VictoriaMetrics to deduplicate samples older than 30 days with 5 minutes interval and to deduplicate samples older than 180 days with 1 hour interval.
Downsampling is applied independently per each time series. It can reduce disk space usage and improve query performance if it is applied to time series with big number of samples per each series. The downsampling doesn't improve query performance if the database contains big number of time series with small number of samples per each series (aka [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate)), since downsampling doesn't reduce the number of time series. So the majority of time is spent on searching for the matching time series. It is possible to use recording rules in [vmalert](https://docs.victoriametrics.com/vmalert.html) in order to reduce the number of time series. See [these docs](https://docs.victoriametrics.com/vmalert.html#downsampling-and-aggregation-via-vmalert).
Downsampling is applied independently per each time series. It can reduce disk space usage and improve query performance if it is applied to time series with big number of samples per each series. The downsampling doesn't improve query performance if the database contains big number of time series with small number of samples per each series (aka [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate)), since downsampling doesn't reduce the number of time series. So the majority of time is spent on searching for the matching time series.
It is possible to use [stream aggregation](https://docs.victoriametrics.com/stream-aggregation.html) in vmagent or recording rules in [vmalert](https://docs.victoriametrics.com/vmalert.html) in order to [reduce the number of time series](https://docs.victoriametrics.com/vmalert.html#downsampling-and-aggregation-via-vmalert).
Downsampling happens during [background merges](https://docs.victoriametrics.com/#storage)
and can't be performed if there is not enough of free disk space or if vmstorage
@ -1773,6 +1792,11 @@ VictoriaMetrics returns TSDB stats at `/api/v1/status/tsdb` page in the way simi
* `match[]=SELECTOR` where `SELECTOR` is an arbitrary [time series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors) for series to take into account during stats calculation. By default all the series are taken into account.
* `extra_label=LABEL=VALUE`. See [these docs](#prometheus-querying-api-enhancements) for more details.
In [cluster version of VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html) each vmstorage tracks the stored time series individually.
vmselect requests stats via [/api/v1/status/tsdb](#tsdb-stats) API from each vmstorage node and merges the results by summing per-series stats.
This may lead to inflated values when samples for the same time series are spread across multiple vmstorage nodes
due to [replication](#replication) or [rerouting](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html?highlight=re-routes#cluster-availability).
VictoriaMetrics provides an UI on top of `/api/v1/status/tsdb` - see [cardinality explorer docs](#cardinality-explorer).
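For reference, a minimal sketch of querying this endpoint with the `match[]` arg described above (the address and the series selector are placeholders):

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
)

func main() {
	params := url.Values{}
	// Restrict the stats calculation to series matching this selector (placeholder value).
	params.Set("match[]", `{job="node-exporter"}`)

	resp, err := http.Get("http://localhost:8428/api/v1/status/tsdb?" + params.Encode())
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	// The response contains per-label and per-series cardinality stats in JSON form.
	fmt.Println(string(body))
}
```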
## Query tracing
@ -2026,16 +2050,16 @@ The simplest way to migrate data from one single-node (source) to another (desti
to another do the following:
1. Stop the VictoriaMetrics (source) with `kill -INT`;
2. Copy (via [rsync](https://en.wikipedia.org/wiki/Rsync) or any other tool) the entire folder specified
1. Copy (via [rsync](https://en.wikipedia.org/wiki/Rsync) or any other tool) the entire folder specified
via `-storageDataPath` from the source node to the empty folder at the destination node.
3. Once copy is done, stop the VictoriaMetrics (destination) with `kill -INT` and verify that
1. Once copy is done, stop the VictoriaMetrics (destination) with `kill -INT` and verify that
its `-storageDataPath` points to the copied folder from p.2;
4. Start the VictoriaMetrics (destination). The copied data should be now available.
1. Start the VictoriaMetrics (destination). The copied data should be now available.
Things to consider when copying data:
1. Data formats between single-node and vmstorage node aren't compatible and can't be copied.
2. Copying data folder means complete replacement of the previous data on destination VictoriaMetrics.
1. Copying data folder means complete replacement of the previous data on destination VictoriaMetrics.
For more complex scenarios like single-to-cluster, cluster-to-single, re-sharding or migrating only a fraction
of data - see [vmctl. Migrating data from VictoriaMetrics](https://docs.victoriametrics.com/vmctl.html#migrating-data-from-victoriametrics).
@ -2253,9 +2277,9 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-dryRun
Whether to check config files without running VictoriaMetrics. The following config files are checked: -promscrape.config, -relabelConfig and -streamAggr.config. Unknown config entries aren't allowed in -promscrape.config by default. This can be changed with -promscrape.config.strictParse=false command-line flag
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default, only IPv4 TCP and UDP is used
Whether to enable IPv6 for listening and dialing. By default, only IPv4 TCP and UDP are used
-envflag.enable
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
Whether to enable reading flags from environment variables in addition to the command line. Command line flag values have priority over values from environment vars. Flags are read only from the command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
@ -2289,9 +2313,9 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
Password for HTTP server's Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
Username for HTTP server's Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
TCP address to listen for http connections. See also -httpListenAddr.useProxyProtocol (default ":8428")
-httpListenAddr.useProxyProtocol
@ -2320,7 +2344,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-influxTrimTimestamp duration
Trim timestamps for InfluxDB line protocol data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms)
-inmemoryDataFlushInterval duration
The interval for guaranteed saving of in-memory data to disk. The saved data survives unclean shutdown such as OOM crash, hardware reset, SIGKILL, etc. Bigger intervals may help increasing lifetime of flash storage with limited write cycles (e.g. Raspberry PI). Smaller intervals increase disk IO load. Minimum supported value is 1s (default 5s)
The interval for guaranteed saving of in-memory data to disk. The saved data survives unclean shutdowns such as OOM crash, hardware reset, SIGKILL, etc. Bigger intervals may help increase the lifetime of flash storage with limited write cycles (e.g. Raspberry PI). Smaller intervals increase disk IO load. Minimum supported value is 1s (default 5s)
-insert.maxQueueDuration duration
The maximum duration to wait in the queue when -maxConcurrentInserts concurrent insert requests are executed (default 1m0s)
-internStringCacheExpireDuration duration
@ -2328,7 +2352,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-internStringDisableCache
Whether to disable caches for interned strings. This may reduce memory usage at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringCacheExpireDuration and -internStringMaxLen
-internStringMaxLen int
The maximum length for strings to intern. Lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringDisableCache and -internStringCacheExpireDuration (default 500)
The maximum length for strings to intern. A lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringDisableCache and -internStringCacheExpireDuration (default 500)
-logNewSeries
Whether to log new series. This option is for debug purposes only. It can lead to performance issues when big number of new series are ingested into VictoriaMetrics
-loggerDisableTimestamps
@ -2348,7 +2372,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-loggerWarnsPerSecondLimit int
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
-maxConcurrentInserts int
The maximum number of concurrent insert requests. Default value should work for most cases, since it minimizes the memory usage. The default value can be increased when clients send data over slow networks. See also -insert.maxQueueDuration (default 8)
The maximum number of concurrent insert requests. The default value should work for most cases, since it minimizes memory usage. The default value can be increased when clients send data over slow networks. See also -insert.maxQueueDuration (default 8)
-maxInsertRequestSize size
The maximum size in bytes of a single Prometheus remote_write API request
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 33554432)
@ -2357,10 +2381,10 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-maxLabelsPerTimeseries int
The maximum number of labels accepted per time series. Superfluous labels are dropped. In this case the vm_metrics_with_dropped_labels_total metric at /metrics page is incremented (default 30)
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-opentsdbHTTPListenAddr string
@ -2615,11 +2639,13 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-storageDataPath string
Path to storage data (default "victoria-metrics-data")
-streamAggr.config string
Optional path to file with stream aggregation config. See https://docs.victoriametrics.com/stream-aggregation.html . See also -remoteWrite.streamAggr.keepInput and -streamAggr.dedupInterval
Optional path to file with stream aggregation config. See https://docs.victoriametrics.com/stream-aggregation.html . See also -streamAggr.keepInput, -streamAggr.dropInput and -streamAggr.dedupInterval
-streamAggr.dedupInterval duration
Input samples are de-duplicated with this interval before being aggregated. Only the last sample per each time series per each interval is aggregated if the interval is greater than zero
-streamAggr.dropInput
Whether to drop all the input samples after the aggregation with -streamAggr.config. By default, only aggregated samples are dropped, while the remaining samples are stored in the database. See also -streamAggr.keepInput and https://docs.victoriametrics.com/stream-aggregation.html
-streamAggr.keepInput
Whether to keep input samples after the aggregation with -streamAggr.config. By default, the input is dropped after the aggregation, so only the aggregate data is stored. See https://docs.victoriametrics.com/stream-aggregation.html
Whether to keep all the input samples after the aggregation with -streamAggr.config. By default, only aggregated samples are dropped, while the remaining samples are stored in the database. See also -streamAggr.dropInput and https://docs.victoriametrics.com/stream-aggregation.html
-tls
Whether to enable TLS for incoming HTTP requests at -httpListenAddr (aka https). -tlsCertFile and -tlsKeyFile must be set if -tls is set
-tlsCertFile string

View file

@ -36,8 +36,8 @@ var (
"-promscrape.config, -relabelConfig and -streamAggr.config. Unknown config entries aren't allowed in -promscrape.config by default. "+
"This can be changed with -promscrape.config.strictParse=false command-line flag")
inmemoryDataFlushInterval = flag.Duration("inmemoryDataFlushInterval", 5*time.Second, "The interval for guaranteed saving of in-memory data to disk. "+
"The saved data survives unclean shutdown such as OOM crash, hardware reset, SIGKILL, etc. "+
"Bigger intervals may help increasing lifetime of flash storage with limited write cycles (e.g. Raspberry PI). "+
"The saved data survives unclean shutdowns such as OOM crash, hardware reset, SIGKILL, etc. "+
"Bigger intervals may help increase the lifetime of flash storage with limited write cycles (e.g. Raspberry PI). "+
"Smaller intervals increase disk IO load. Minimum supported value is 1s")
downsamplingPeriods = flagutil.NewArrayString("downsampling.period", "Comma-separated downsampling periods in the format 'offset:period'. For example, '30d:10m' instructs "+
"to leave a single sample per 10 minutes for samples older than 30 days. See https://docs.victoriametrics.com/#downsampling for details")

View file

@ -24,30 +24,30 @@ import (
"github.com/VictoriaMetrics/metrics"
)
// RequestHandler processes ElasticSearch insert requests
// RequestHandler processes Elasticsearch insert requests
func RequestHandler(path string, w http.ResponseWriter, r *http.Request) bool {
w.Header().Add("Content-Type", "application/json")
// This header is needed for Logstash
w.Header().Set("X-Elastic-Product", "Elasticsearch")
if strings.HasPrefix(path, "/_ilm/policy") {
// Return fake response for ElasticSearch ilm request.
// Return fake response for Elasticsearch ilm request.
fmt.Fprintf(w, `{}`)
return true
}
if strings.HasPrefix(path, "/_index_template") {
// Return fake response for ElasticSearch index template request.
// Return fake response for Elasticsearch index template request.
fmt.Fprintf(w, `{}`)
return true
}
if strings.HasPrefix(path, "/_ingest") {
// Return fake response for ElasticSearch ingest pipeline request.
// Return fake response for Elasticsearch ingest pipeline request.
// See: https://www.elastic.co/guide/en/elasticsearch/reference/8.8/put-pipeline-api.html
fmt.Fprintf(w, `{}`)
return true
}
if strings.HasPrefix(path, "/_nodes") {
// Return fake response for ElasticSearch nodes discovery request.
// Return fake response for Elasticsearch nodes discovery request.
// See: https://www.elastic.co/guide/en/elasticsearch/reference/8.8/cluster.html
fmt.Fprintf(w, `{}`)
return true
@ -56,8 +56,8 @@ func RequestHandler(path string, w http.ResponseWriter, r *http.Request) bool {
case "/":
switch r.Method {
case http.MethodGet:
// Return fake response for ElasticSearch ping request.
// See the latest available version for ElasticSearch at https://github.com/elastic/elasticsearch/releases
// Return fake response for Elasticsearch ping request.
// See the latest available version for Elasticsearch at https://github.com/elastic/elasticsearch/releases
fmt.Fprintf(w, `{
"version": {
"number": "8.8.0"
@ -69,7 +69,7 @@ func RequestHandler(path string, w http.ResponseWriter, r *http.Request) bool {
return true
case "/_license":
// Return fake response for ElasticSearch license request.
// Return fake response for Elasticsearch license request.
fmt.Fprintf(w, `{
"license": {
"uid": "cbff45e7-c553-41f7-ae4f-9205eabd80xx",
@ -199,12 +199,15 @@ func readBulkLine(sc *bufio.Scanner, timeField, msgField string,
return false, fmt.Errorf("cannot parse json-encoded log entry: %w", err)
}
timestamp, err := extractTimestampFromFields(timeField, p.Fields)
ts, err := extractTimestampFromFields(timeField, p.Fields)
if err != nil {
return false, fmt.Errorf("cannot parse timestamp: %w", err)
}
if ts == 0 {
ts = time.Now().UnixNano()
}
p.RenameField(msgField, "_msg")
processLogMessage(timestamp, p.Fields)
processLogMessage(ts, p.Fields)
logjson.PutParser(p)
return true, nil
}
@ -222,10 +225,15 @@ func extractTimestampFromFields(timeField string, fields []logstorage.Field) (in
f.Value = ""
return timestamp, nil
}
return time.Now().UnixNano(), nil
return 0, nil
}
func parseElasticsearchTimestamp(s string) (int64, error) {
if s == "0" || s == "" {
// Special case - zero or empty timestamp must be substituted
// with the current time by the caller.
return 0, nil
}
if len(s) < len("YYYY-MM-DD") || s[len("YYYY")] != '-' {
// Try parsing timestamp in milliseconds
n, err := strconv.ParseInt(s, 10, 64)

View file

@ -99,12 +99,15 @@ func readLine(sc *bufio.Scanner, timeField, msgField string, processLogMessage f
if err := p.ParseLogMessage(line); err != nil {
return false, fmt.Errorf("cannot parse json-encoded log entry: %w", err)
}
timestamp, err := extractTimestampFromFields(timeField, p.Fields)
ts, err := extractTimestampFromFields(timeField, p.Fields)
if err != nil {
return false, fmt.Errorf("cannot parse timestamp: %w", err)
}
if ts == 0 {
ts = time.Now().UnixNano()
}
p.RenameField(msgField, "_msg")
processLogMessage(timestamp, p.Fields)
processLogMessage(ts, p.Fields)
logjson.PutParser(p)
return true, nil
}
@ -122,10 +125,15 @@ func extractTimestampFromFields(timeField string, fields []logstorage.Field) (in
f.Value = ""
return timestamp, nil
}
return time.Now().UnixNano(), nil
return 0, nil
}
func parseISO8601Timestamp(s string) (int64, error) {
if s == "0" || s == "" {
// Special case - zero or empty timestamp must be substituted
// with the current time by the caller.
return 0, nil
}
t, err := time.Parse(time.RFC3339, s)
if err != nil {
return 0, fmt.Errorf("cannot parse timestamp %q: %w", s, err)

56
app/vlinsert/loki/loki.go Normal file
View file

@ -0,0 +1,56 @@
package loki
import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vlinsert/insertutils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
"github.com/VictoriaMetrics/metrics"
)
var (
lokiRequestsJSONTotal = metrics.NewCounter(`vl_http_requests_total{path="/insert/loki/api/v1/push",format="json"}`)
lokiRequestsProtobufTotal = metrics.NewCounter(`vl_http_requests_total{path="/insert/loki/api/v1/push",format="protobuf"}`)
)
// RequestHandler processes Loki insert requests
//
// See https://grafana.com/docs/loki/latest/api/#push-log-entries-to-loki
func RequestHandler(path string, w http.ResponseWriter, r *http.Request) bool {
if path != "/api/v1/push" {
return false
}
contentType := r.Header.Get("Content-Type")
switch contentType {
case "application/json":
lokiRequestsJSONTotal.Inc()
return handleJSON(r, w)
default:
// Protobuf request body should be handled by default according to https://grafana.com/docs/loki/latest/api/#push-log-entries-to-loki
lokiRequestsProtobufTotal.Inc()
return handleProtobuf(r, w)
}
}
func getCommonParams(r *http.Request) (*insertutils.CommonParams, error) {
cp, err := insertutils.GetCommonParams(r)
if err != nil {
return nil, err
}
// If the parsed tenant is (0,0), it is likely the default tenant.
// Try parsing the tenant from the Loki X-Scope-OrgID header instead.
if cp.TenantID.AccountID == 0 && cp.TenantID.ProjectID == 0 {
org := r.Header.Get("X-Scope-OrgID")
if org != "" {
tenantID, err := logstorage.GetTenantIDFromString(org)
if err != nil {
return nil, err
}
cp.TenantID = tenantID
}
}
return cp, nil
}
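For context, here is a minimal client-side sketch for exercising this handler. The `/insert/loki/api/v1/push` path comes from the counters above; the `:9428` address, the example stream labels and the `X-Scope-OrgID` value are assumptions.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"time"
)

func main() {
	ts := time.Now().UnixNano()
	// One stream with a single log line in the Loki JSON push format parsed by handleJSON.
	payload := fmt.Sprintf(`{"streams":[{"stream":{"job":"demo"},"values":[["%d","hello from a Loki client"]]}]}`, ts)

	req, err := http.NewRequest("POST", "http://localhost:9428/insert/loki/api/v1/push", bytes.NewBufferString(payload))
	if err != nil {
		panic(err)
	}
	req.Header.Set("Content-Type", "application/json")
	// Optional multitenancy header parsed by getCommonParams above (assumed "accountID:projectID" format).
	req.Header.Set("X-Scope-OrgID", "0:0")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```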

View file

@ -0,0 +1,190 @@
package loki
import (
"fmt"
"io"
"math"
"net/http"
"strconv"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vlstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
"github.com/valyala/fastjson"
)
var (
rowsIngestedJSONTotal = metrics.NewCounter(`vl_rows_ingested_total{type="loki",format="json"}`)
parserPool fastjson.ParserPool
)
func handleJSON(r *http.Request, w http.ResponseWriter) bool {
reader := r.Body
if r.Header.Get("Content-Encoding") == "gzip" {
zr, err := common.GetGzipReader(reader)
if err != nil {
httpserver.Errorf(w, r, "cannot initialize gzip reader: %s", err)
return true
}
defer common.PutGzipReader(zr)
reader = zr
}
wcr := writeconcurrencylimiter.GetReader(reader)
data, err := io.ReadAll(wcr)
writeconcurrencylimiter.PutReader(wcr)
if err != nil {
httpserver.Errorf(w, r, "cannot read request body: %s", err)
return true
}
cp, err := getCommonParams(r)
if err != nil {
httpserver.Errorf(w, r, "cannot parse common params from request: %s", err)
return true
}
lr := logstorage.GetLogRows(cp.StreamFields, cp.IgnoreFields)
processLogMessage := cp.GetProcessLogMessageFunc(lr)
n, err := parseJSONRequest(data, processLogMessage)
vlstorage.MustAddRows(lr)
logstorage.PutLogRows(lr)
if err != nil {
httpserver.Errorf(w, r, "cannot parse Loki request: %s", err)
return true
}
rowsIngestedJSONTotal.Add(n)
return true
}
func parseJSONRequest(data []byte, processLogMessage func(timestamp int64, fields []logstorage.Field)) (int, error) {
p := parserPool.Get()
defer parserPool.Put(p)
v, err := p.ParseBytes(data)
if err != nil {
return 0, fmt.Errorf("cannot parse JSON request body: %w", err)
}
streamsV := v.Get("streams")
if streamsV == nil {
return 0, fmt.Errorf("missing `streams` item in the parsed JSON: %q", v)
}
streams, err := streamsV.Array()
if err != nil {
return 0, fmt.Errorf("`streams` item in the parsed JSON must contain an array; got %q", streamsV)
}
currentTimestamp := time.Now().UnixNano()
var commonFields []logstorage.Field
rowsIngested := 0
for _, stream := range streams {
// populate common labels from `stream` dict
commonFields = commonFields[:0]
labelsV := stream.Get("stream")
var labels *fastjson.Object
if labelsV != nil {
o, err := labelsV.Object()
if err != nil {
return rowsIngested, fmt.Errorf("`stream` item in the parsed JSON must contain an object; got %q", labelsV)
}
labels = o
}
labels.Visit(func(k []byte, v *fastjson.Value) {
if err != nil {
return
}
vStr, errLocal := v.StringBytes()
if errLocal != nil {
err = fmt.Errorf("unexpected label value type for %q:%q; want string", k, v)
return
}
commonFields = append(commonFields, logstorage.Field{
Name: bytesutil.ToUnsafeString(k),
Value: bytesutil.ToUnsafeString(vStr),
})
})
if err != nil {
return rowsIngested, fmt.Errorf("error when parsing `stream` object: %w", err)
}
// populate messages from `values` array
linesV := stream.Get("values")
if linesV == nil {
return rowsIngested, fmt.Errorf("missing `values` item in the parsed JSON %q", stream)
}
lines, err := linesV.Array()
if err != nil {
return rowsIngested, fmt.Errorf("`values` item in the parsed JSON must contain an array; got %q", linesV)
}
fields := commonFields
for _, line := range lines {
lineA, err := line.Array()
if err != nil {
return rowsIngested, fmt.Errorf("unexpected contents of `values` item; want array; got %q", line)
}
if len(lineA) != 2 {
return rowsIngested, fmt.Errorf("unexpected number of values in `values` item array %q; got %d want 2", line, len(lineA))
}
// parse timestamp
timestamp, err := lineA[0].StringBytes()
if err != nil {
return rowsIngested, fmt.Errorf("unexpected log timestamp type for %q; want string", lineA[0])
}
ts, err := parseLokiTimestamp(bytesutil.ToUnsafeString(timestamp))
if err != nil {
return rowsIngested, fmt.Errorf("cannot parse log timestamp %q: %w", timestamp, err)
}
if ts == 0 {
ts = currentTimestamp
}
// parse log message
msg, err := lineA[1].StringBytes()
if err != nil {
return rowsIngested, fmt.Errorf("unexpected log message type for %q; want string", lineA[1])
}
fields = append(fields[:len(commonFields)], logstorage.Field{
Name: "_msg",
Value: bytesutil.ToUnsafeString(msg),
})
processLogMessage(ts, fields)
}
rowsIngested += len(lines)
}
return rowsIngested, nil
}
func parseLokiTimestamp(s string) (int64, error) {
if s == "" {
// Special case - an empty timestamp must be substituted with the current time by the caller.
return 0, nil
}
n, err := strconv.ParseInt(s, 10, 64)
if err != nil {
// Fall back to parsing floating-point value
f, err := strconv.ParseFloat(s, 64)
if err != nil {
return 0, err
}
if f > math.MaxInt64 {
return 0, fmt.Errorf("too big timestamp in nanoseconds: %v; mustn't exceed %v", f, int64(math.MaxInt64))
}
if f < math.MinInt64 {
return 0, fmt.Errorf("too small timestamp in nanoseconds: %v; must be bigger or equal to %v", f, int64(math.MinInt64))
}
n = int64(f)
}
if n < 0 {
return 0, fmt.Errorf("too small timestamp in nanoseconds: %d; must be bigger than 0", n)
}
return n, nil
}

View file

@ -0,0 +1,130 @@
package loki
import (
"fmt"
"strings"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
)
func TestParseJSONRequestFailure(t *testing.T) {
f := func(s string) {
t.Helper()
n, err := parseJSONRequest([]byte(s), func(timestamp int64, fields []logstorage.Field) {
t.Fatalf("unexpected call to parseJSONRequest callback!")
})
if err == nil {
t.Fatalf("expecting non-nil error")
}
if n != 0 {
t.Fatalf("unexpected number of parsed lines: %d; want 0", n)
}
}
f(``)
// Invalid json
f(`{}`)
f(`[]`)
f(`"foo"`)
f(`123`)
// invalid type for `streams` item
f(`{"streams":123}`)
// Missing `values` item
f(`{"streams":[{}]}`)
// Invalid type for `values` item
f(`{"streams":[{"values":"foobar"}]}`)
// Invalid type for `stream` item
f(`{"streams":[{"stream":[],"values":[]}]}`)
// Invalid type for `values` individual item
f(`{"streams":[{"values":[123]}]}`)
// Invalid length of `values` individual item
f(`{"streams":[{"values":[[]]}]}`)
f(`{"streams":[{"values":[["123"]]}]}`)
f(`{"streams":[{"values":[["123","456","789"]]}]}`)
// Invalid type for timestamp inside `values` individual item
f(`{"streams":[{"values":[[123,"456"]}]}`)
// Invalid type for log message
f(`{"streams":[{"values":[["123",1234]]}]}`)
}
func TestParseJSONRequestSuccess(t *testing.T) {
f := func(s string, resultExpected string) {
t.Helper()
var lines []string
n, err := parseJSONRequest([]byte(s), func(timestamp int64, fields []logstorage.Field) {
var a []string
for _, f := range fields {
a = append(a, f.String())
}
line := fmt.Sprintf("_time:%d %s", timestamp, strings.Join(a, " "))
lines = append(lines, line)
})
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
if n != len(lines) {
t.Fatalf("unexpected number of lines parsed; got %d; want %d", n, len(lines))
}
result := strings.Join(lines, "\n")
if result != resultExpected {
t.Fatalf("unexpected result;\ngot\n%s\nwant\n%s", result, resultExpected)
}
}
// Empty streams
f(`{"streams":[]}`, ``)
f(`{"streams":[{"values":[]}]}`, ``)
f(`{"streams":[{"stream":{},"values":[]}]}`, ``)
f(`{"streams":[{"stream":{"foo":"bar"},"values":[]}]}`, ``)
// Empty stream labels
f(`{"streams":[{"values":[["1577836800000000001", "foo bar"]]}]}`, `_time:1577836800000000001 "_msg":"foo bar"`)
f(`{"streams":[{"stream":{},"values":[["1577836800000000001", "foo bar"]]}]}`, `_time:1577836800000000001 "_msg":"foo bar"`)
// Non-empty stream labels
f(`{"streams":[{"stream":{
"label1": "value1",
"label2": "value2"
},"values":[
["1577836800000000001", "foo bar"],
["1477836900005000002", "abc"],
["147.78369e9", "foobar"]
]}]}`, `_time:1577836800000000001 "label1":"value1" "label2":"value2" "_msg":"foo bar"
_time:1477836900005000002 "label1":"value1" "label2":"value2" "_msg":"abc"
_time:147783690000 "label1":"value1" "label2":"value2" "_msg":"foobar"`)
// Multiple streams
f(`{
"streams": [
{
"stream": {
"foo": "bar",
"a": "b"
},
"values": [
["1577836800000000001", "foo bar"],
["1577836900005000002", "abc"]
]
},
{
"stream": {
"x": "y"
},
"values": [
["1877836900005000002", "yx"]
]
}
]
}`, `_time:1577836800000000001 "foo":"bar" "a":"b" "_msg":"foo bar"
_time:1577836900005000002 "foo":"bar" "a":"b" "_msg":"abc"
_time:1877836900005000002 "x":"y" "_msg":"yx"`)
}

View file

@ -0,0 +1,78 @@
package loki
import (
"fmt"
"strconv"
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
)
func BenchmarkParseJSONRequest(b *testing.B) {
for _, streams := range []int{5, 10} {
for _, rows := range []int{100, 1000} {
for _, labels := range []int{10, 50} {
b.Run(fmt.Sprintf("streams_%d/rows_%d/labels_%d", streams, rows, labels), func(b *testing.B) {
benchmarkParseJSONRequest(b, streams, rows, labels)
})
}
}
}
}
func benchmarkParseJSONRequest(b *testing.B, streams, rows, labels int) {
b.ReportAllocs()
b.SetBytes(int64(streams * rows))
b.RunParallel(func(pb *testing.PB) {
data := getJSONBody(streams, rows, labels)
for pb.Next() {
_, err := parseJSONRequest(data, func(timestamp int64, fields []logstorage.Field) {})
if err != nil {
panic(fmt.Errorf("unexpected error: %s", err))
}
}
})
}
func getJSONBody(streams, rows, labels int) []byte {
body := append([]byte{}, `{"streams":[`...)
now := time.Now().UnixNano()
valuePrefix := fmt.Sprintf(`["%d","value_`, now)
for i := 0; i < streams; i++ {
body = append(body, `{"stream":{`...)
for j := 0; j < labels; j++ {
body = append(body, `"label_`...)
body = strconv.AppendInt(body, int64(j), 10)
body = append(body, `":"value_`...)
body = strconv.AppendInt(body, int64(j), 10)
body = append(body, '"')
if j < labels-1 {
body = append(body, ',')
}
}
body = append(body, `}, "values":[`...)
for j := 0; j < rows; j++ {
body = append(body, valuePrefix...)
body = strconv.AppendInt(body, int64(j), 10)
body = append(body, `"]`...)
if j < rows-1 {
body = append(body, ',')
}
}
body = append(body, `]}`...)
if i < streams-1 {
body = append(body, ',')
}
}
body = append(body, `]}`...)
return body
}

View file

@ -0,0 +1,171 @@
package loki
import (
"fmt"
"io"
"net/http"
"strconv"
"strings"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vlstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
"github.com/golang/snappy"
)
var (
rowsIngestedProtobufTotal = metrics.NewCounter(`vl_rows_ingested_total{type="loki",format="protobuf"}`)
bytesBufPool bytesutil.ByteBufferPool
pushReqsPool sync.Pool
)
func handleProtobuf(r *http.Request, w http.ResponseWriter) bool {
wcr := writeconcurrencylimiter.GetReader(r.Body)
data, err := io.ReadAll(wcr)
writeconcurrencylimiter.PutReader(wcr)
if err != nil {
httpserver.Errorf(w, r, "cannot read request body: %s", err)
return true
}
cp, err := getCommonParams(r)
if err != nil {
httpserver.Errorf(w, r, "cannot parse common params from request: %s", err)
return true
}
lr := logstorage.GetLogRows(cp.StreamFields, cp.IgnoreFields)
processLogMessage := cp.GetProcessLogMessageFunc(lr)
n, err := parseProtobufRequest(data, processLogMessage)
vlstorage.MustAddRows(lr)
logstorage.PutLogRows(lr)
if err != nil {
httpserver.Errorf(w, r, "cannot parse loki request: %s", err)
return true
}
rowsIngestedProtobufTotal.Add(n)
return true
}
func parseProtobufRequest(data []byte, processLogMessage func(timestamp int64, fields []logstorage.Field)) (int, error) {
bb := bytesBufPool.Get()
defer bytesBufPool.Put(bb)
buf, err := snappy.Decode(bb.B[:cap(bb.B)], data)
if err != nil {
return 0, fmt.Errorf("cannot decode snappy-encoded request body: %w", err)
}
bb.B = buf
req := getPushRequest()
defer putPushRequest(req)
err = req.Unmarshal(bb.B)
if err != nil {
return 0, fmt.Errorf("cannot parse request body: %s", err)
}
var commonFields []logstorage.Field
rowsIngested := 0
streams := req.Streams
currentTimestamp := time.Now().UnixNano()
for i := range streams {
stream := &streams[i]
// stream.Labels contains labels for the stream.
// Labels are the same for all entries in the stream.
commonFields, err = parsePromLabels(commonFields[:0], stream.Labels)
if err != nil {
return rowsIngested, fmt.Errorf("cannot parse stream labels %q: %s", stream.Labels, err)
}
fields := commonFields
entries := stream.Entries
for j := range entries {
entry := &entries[j]
fields = append(fields[:len(commonFields)], logstorage.Field{
Name: "_msg",
Value: entry.Line,
})
ts := entry.Timestamp.UnixNano()
if ts == 0 {
ts = currentTimestamp
}
processLogMessage(ts, fields)
}
rowsIngested += len(stream.Entries)
}
return rowsIngested, nil
}
// parsePromLabels parses log fields in Prometheus text exposition format from s, appends them to dst and returns the result.
//
// See test data of promtail for examples: https://github.com/grafana/loki/blob/a24ef7b206e0ca63ee74ca6ecb0a09b745cd2258/pkg/push/types_test.go
func parsePromLabels(dst []logstorage.Field, s string) ([]logstorage.Field, error) {
// Make sure s is wrapped into `{...}`
s = strings.TrimSpace(s)
if len(s) < 2 {
return nil, fmt.Errorf("too short string to parse: %q", s)
}
if s[0] != '{' {
return nil, fmt.Errorf("missing `{` at the beginning of %q", s)
}
if s[len(s)-1] != '}' {
return nil, fmt.Errorf("missing `}` at the end of %q", s)
}
s = s[1 : len(s)-1]
for len(s) > 0 {
// Parse label name
n := strings.IndexByte(s, '=')
if n < 0 {
return nil, fmt.Errorf("cannot find `=` char for label value at %s", s)
}
name := s[:n]
s = s[n+1:]
// Parse label value
qs, err := strconv.QuotedPrefix(s)
if err != nil {
return nil, fmt.Errorf("cannot parse value for label %q at %s: %w", name, s, err)
}
s = s[len(qs):]
value, err := strconv.Unquote(qs)
if err != nil {
return nil, fmt.Errorf("cannot unquote value %q for label %q: %w", qs, name, err)
}
// Append the found field to dst.
dst = append(dst, logstorage.Field{
Name: name,
Value: value,
})
// Check whether there are other labels remaining
if len(s) == 0 {
break
}
if !strings.HasPrefix(s, ",") {
return nil, fmt.Errorf("missing `,` char at %s", s)
}
s = s[1:]
s = strings.TrimPrefix(s, " ")
}
return dst, nil
}
func getPushRequest() *PushRequest {
v := pushReqsPool.Get()
if v == nil {
return &PushRequest{}
}
return v.(*PushRequest)
}
func putPushRequest(req *PushRequest) {
req.Reset()
pushReqsPool.Put(req)
}

View file

@ -0,0 +1,171 @@
package loki
import (
"fmt"
"strings"
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
"github.com/golang/snappy"
)
func TestParseProtobufRequestSuccess(t *testing.T) {
f := func(s string, resultExpected string) {
t.Helper()
var pr PushRequest
n, err := parseJSONRequest([]byte(s), func(timestamp int64, fields []logstorage.Field) {
msg := ""
for _, f := range fields {
if f.Name == "_msg" {
msg = f.Value
}
}
var a []string
for _, f := range fields {
if f.Name == "_msg" {
continue
}
item := fmt.Sprintf("%s=%q", f.Name, f.Value)
a = append(a, item)
}
labels := "{" + strings.Join(a, ", ") + "}"
pr.Streams = append(pr.Streams, Stream{
Labels: labels,
Entries: []Entry{
{
Timestamp: time.Unix(0, timestamp),
Line: msg,
},
},
})
})
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
if n != len(pr.Streams) {
t.Fatalf("unexpected number of streams; got %d; want %d", len(pr.Streams), n)
}
data, err := pr.Marshal()
if err != nil {
t.Fatalf("unexpected error when marshaling PushRequest: %s", err)
}
encodedData := snappy.Encode(nil, data)
var lines []string
n, err = parseProtobufRequest(encodedData, func(timestamp int64, fields []logstorage.Field) {
var a []string
for _, f := range fields {
a = append(a, f.String())
}
line := fmt.Sprintf("_time:%d %s", timestamp, strings.Join(a, " "))
lines = append(lines, line)
})
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
if n != len(lines) {
t.Fatalf("unexpected number of lines parsed; got %d; want %d", n, len(lines))
}
result := strings.Join(lines, "\n")
if result != resultExpected {
t.Fatalf("unexpected result;\ngot\n%s\nwant\n%s", result, resultExpected)
}
}
// Empty streams
f(`{"streams":[]}`, ``)
f(`{"streams":[{"values":[]}]}`, ``)
f(`{"streams":[{"stream":{},"values":[]}]}`, ``)
f(`{"streams":[{"stream":{"foo":"bar"},"values":[]}]}`, ``)
// Empty stream labels
f(`{"streams":[{"values":[["1577836800000000001", "foo bar"]]}]}`, `_time:1577836800000000001 "_msg":"foo bar"`)
f(`{"streams":[{"stream":{},"values":[["1577836800000000001", "foo bar"]]}]}`, `_time:1577836800000000001 "_msg":"foo bar"`)
// Non-empty stream labels
f(`{"streams":[{"stream":{
"label1": "value1",
"label2": "value2"
},"values":[
["1577836800000000001", "foo bar"],
["1477836900005000002", "abc"],
["147.78369e9", "foobar"]
]}]}`, `_time:1577836800000000001 "label1":"value1" "label2":"value2" "_msg":"foo bar"
_time:1477836900005000002 "label1":"value1" "label2":"value2" "_msg":"abc"
_time:147783690000 "label1":"value1" "label2":"value2" "_msg":"foobar"`)
// Multiple streams
f(`{
"streams": [
{
"stream": {
"foo": "bar",
"a": "b"
},
"values": [
["1577836800000000001", "foo bar"],
["1577836900005000002", "abc"]
]
},
{
"stream": {
"x": "y"
},
"values": [
["1877836900005000002", "yx"]
]
}
]
}`, `_time:1577836800000000001 "foo":"bar" "a":"b" "_msg":"foo bar"
_time:1577836900005000002 "foo":"bar" "a":"b" "_msg":"abc"
_time:1877836900005000002 "x":"y" "_msg":"yx"`)
}
func TestParsePromLabelsSuccess(t *testing.T) {
f := func(s string) {
t.Helper()
fields, err := parsePromLabels(nil, s)
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
var a []string
for _, f := range fields {
a = append(a, fmt.Sprintf("%s=%q", f.Name, f.Value))
}
result := "{" + strings.Join(a, ", ") + "}"
if result != s {
t.Fatalf("unexpected result;\ngot\n%s\nwant\n%s", result, s)
}
}
f("{}")
f(`{foo="bar"}`)
f(`{foo="bar", baz="x", y="z"}`)
f(`{foo="ba\"r\\z\n", a="", b="\"\\"}`)
}
func TestParsePromLabelsFailure(t *testing.T) {
f := func(s string) {
t.Helper()
fields, err := parsePromLabels(nil, s)
if err == nil {
t.Fatalf("expecting non-nil error")
}
if len(fields) > 0 {
t.Fatalf("unexpected non-empty fields: %s", fields)
}
}
f("")
f("{")
f(`{foo}`)
f(`{foo=bar}`)
f(`{foo="bar}`)
f(`{foo="ba\",r}`)
f(`{foo="bar" baz="aa"}`)
f(`foobar`)
f(`foo{bar="baz"}`)
}

View file

@ -0,0 +1,65 @@
package loki
import (
"fmt"
"strconv"
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
"github.com/golang/snappy"
)
func BenchmarkParseProtobufRequest(b *testing.B) {
for _, streams := range []int{5, 10} {
for _, rows := range []int{100, 1000} {
for _, labels := range []int{10, 50} {
b.Run(fmt.Sprintf("streams_%d/rows_%d/labels_%d", streams, rows, labels), func(b *testing.B) {
benchmarkParseProtobufRequest(b, streams, rows, labels)
})
}
}
}
}
func benchmarkParseProtobufRequest(b *testing.B, streams, rows, labels int) {
b.ReportAllocs()
b.SetBytes(int64(streams * rows))
b.RunParallel(func(pb *testing.PB) {
body := getProtobufBody(streams, rows, labels)
for pb.Next() {
_, err := parseProtobufRequest(body, func(timestamp int64, fields []logstorage.Field) {})
if err != nil {
panic(fmt.Errorf("unexpected error: %s", err))
}
}
})
}
func getProtobufBody(streams, rows, labels int) []byte {
var pr PushRequest
for i := 0; i < streams; i++ {
var st Stream
st.Labels = `{`
for j := 0; j < labels; j++ {
st.Labels += `label_` + strconv.Itoa(j) + `="value_` + strconv.Itoa(j) + `"`
if j < labels-1 {
st.Labels += `,`
}
}
st.Labels += `}`
for j := 0; j < rows; j++ {
st.Entries = append(st.Entries, Entry{Timestamp: time.Now(), Line: "value_" + strconv.Itoa(j)})
}
pr.Streams = append(pr.Streams, st)
}
body, _ := pr.Marshal()
encodedBody := snappy.Encode(nil, body)
return encodedBody
}

File diff suppressed because it is too large

View file

@ -0,0 +1,38 @@
syntax = "proto3";
// source: https://raw.githubusercontent.com/grafana/loki/main/pkg/push/push.proto
// Licensed under the Apache License, Version 2.0 (the "License");
// https://github.com/grafana/loki/blob/main/pkg/push/LICENSE
package logproto;
import "gogoproto/gogo.proto";
import "google/protobuf/timestamp.proto";
option go_package = "github.com/VictoriaMetrics/VictoriaMetrics/app/vlinsert/loki";
message PushRequest {
repeated StreamAdapter streams = 1 [
(gogoproto.jsontag) = "streams",
(gogoproto.customtype) = "Stream"
];
}
message StreamAdapter {
string labels = 1 [(gogoproto.jsontag) = "labels"];
repeated EntryAdapter entries = 2 [
(gogoproto.nullable) = false,
(gogoproto.jsontag) = "entries"
];
// hash contains the original hash of the stream.
uint64 hash = 3 [(gogoproto.jsontag) = "-"];
}
message EntryAdapter {
google.protobuf.Timestamp timestamp = 1 [
(gogoproto.stdtime) = true,
(gogoproto.nullable) = false,
(gogoproto.jsontag) = "ts"
];
string line = 2 [(gogoproto.jsontag) = "line"];
}

View file

@ -0,0 +1,110 @@
package loki
// source: https://raw.githubusercontent.com/grafana/loki/main/pkg/push/timestamp.go
// Licensed under the Apache License, Version 2.0 (the "License");
// https://github.com/grafana/loki/blob/main/pkg/push/LICENSE
import (
"errors"
"strconv"
"time"
"github.com/gogo/protobuf/types"
)
const (
// Seconds field of the earliest valid Timestamp.
// This is time.Date(1, 1, 1, 0, 0, 0, 0, time.UTC).Unix().
minValidSeconds = -62135596800
// Seconds field just after the latest valid Timestamp.
// This is time.Date(10000, 1, 1, 0, 0, 0, 0, time.UTC).Unix().
maxValidSeconds = 253402300800
)
// validateTimestamp determines whether a Timestamp is valid.
// A valid timestamp represents a time in the range
// [0001-01-01, 10000-01-01) and has a Nanos field
// in the range [0, 1e9).
//
// If the Timestamp is valid, validateTimestamp returns nil.
// Otherwise, it returns an error that describes
// the problem.
//
// Every valid Timestamp can be represented by a time.Time, but the converse is not true.
func validateTimestamp(ts *types.Timestamp) error {
if ts == nil {
return errors.New("timestamp: nil Timestamp")
}
if ts.Seconds < minValidSeconds {
return errors.New("timestamp: " + formatTimestamp(ts) + " before 0001-01-01")
}
if ts.Seconds >= maxValidSeconds {
return errors.New("timestamp: " + formatTimestamp(ts) + " after 10000-01-01")
}
if ts.Nanos < 0 || ts.Nanos >= 1e9 {
return errors.New("timestamp: " + formatTimestamp(ts) + ": nanos not in range [0, 1e9)")
}
return nil
}
// formatTimestamp is equivalent to fmt.Sprintf("%#v", ts)
// but avoids the escape incurred by using fmt.Sprintf, eliminating
// unnecessary heap allocations.
func formatTimestamp(ts *types.Timestamp) string {
if ts == nil {
return "nil"
}
seconds := strconv.FormatInt(ts.Seconds, 10)
nanos := strconv.FormatInt(int64(ts.Nanos), 10)
return "&types.Timestamp{Seconds: " + seconds + ",\nNanos: " + nanos + ",\n}"
}
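// sizeOfStdTime returns the size in bytes of t marshaled as a protobuf google.protobuf.Timestamp message.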
func sizeOfStdTime(t time.Time) int {
ts, err := timestampProto(t)
if err != nil {
return 0
}
return ts.Size()
}
func stdTimeMarshalTo(t time.Time, data []byte) (int, error) {
ts, err := timestampProto(t)
if err != nil {
return 0, err
}
return ts.MarshalTo(data)
}
func stdTimeUnmarshal(t *time.Time, data []byte) error {
ts := &types.Timestamp{}
if err := ts.Unmarshal(data); err != nil {
return err
}
tt, err := timestampFromProto(ts)
if err != nil {
return err
}
*t = tt
return nil
}
func timestampFromProto(ts *types.Timestamp) (time.Time, error) {
	// Don't return the zero value on error, because it corresponds to a valid
	// timestamp. Instead return whatever time.Unix gives us.
var t time.Time
if ts == nil {
t = time.Unix(0, 0).UTC() // treat nil like the empty Timestamp
} else {
t = time.Unix(ts.Seconds, int64(ts.Nanos)).UTC()
}
return t, validateTimestamp(ts)
}
func timestampProto(t time.Time) (types.Timestamp, error) {
ts := types.Timestamp{
Seconds: t.Unix(),
Nanos: int32(t.Nanosecond()),
}
return ts, validateTimestamp(&ts)
}

app/vlinsert/loki/types.go Normal file
View file

@ -0,0 +1,481 @@
package loki
// source: https://raw.githubusercontent.com/grafana/loki/main/pkg/push/types.go
// Licensed under the Apache License, Version 2.0 (the "License");
// https://github.com/grafana/loki/blob/main/pkg/push/LICENSE
import (
"fmt"
"io"
"time"
)
// Stream contains a unique set of labels (as a string) and a set of entries for it.
// We are not using the proto-generated version but this custom one, so that we
// can improve serialization; see the benchmark.
type Stream struct {
Labels string `protobuf:"bytes,1,opt,name=labels,proto3" json:"labels"`
Entries []Entry `protobuf:"bytes,2,rep,name=entries,proto3,customtype=EntryAdapter" json:"entries"`
Hash uint64 `protobuf:"varint,3,opt,name=hash,proto3" json:"-"`
}
// Entry is a log entry with a timestamp.
type Entry struct {
Timestamp time.Time `protobuf:"bytes,1,opt,name=timestamp,proto3,stdtime" json:"ts"`
Line string `protobuf:"bytes,2,opt,name=line,proto3" json:"line"`
}
// Marshal implements the proto.Marshaler interface.
func (m *Stream) Marshal() (dAtA []byte, err error) {
size := m.Size()
dAtA = make([]byte, size)
n, err := m.MarshalToSizedBuffer(dAtA[:size])
if err != nil {
return nil, err
}
return dAtA[:n], nil
}
// MarshalTo marshals m to dst.
func (m *Stream) MarshalTo(dAtA []byte) (int, error) {
size := m.Size()
return m.MarshalToSizedBuffer(dAtA[:size])
}
// MarshalToSizedBuffer marshals m to the sized buffer.
func (m *Stream) MarshalToSizedBuffer(dAtA []byte) (int, error) {
i := len(dAtA)
_ = i
var l int
_ = l
if m.Hash != 0 {
i = encodeVarintPush(dAtA, i, m.Hash)
i--
dAtA[i] = 0x18
}
if len(m.Entries) > 0 {
for iNdEx := len(m.Entries) - 1; iNdEx >= 0; iNdEx-- {
{
size, err := m.Entries[iNdEx].MarshalToSizedBuffer(dAtA[:i])
if err != nil {
return 0, err
}
i -= size
i = encodeVarintPush(dAtA, i, uint64(size))
}
i--
dAtA[i] = 0x12
}
}
if len(m.Labels) > 0 {
i -= len(m.Labels)
copy(dAtA[i:], m.Labels)
i = encodeVarintPush(dAtA, i, uint64(len(m.Labels)))
i--
dAtA[i] = 0xa
}
return len(dAtA) - i, nil
}
// Marshal implements the proto.Marshaler interface.
func (m *Entry) Marshal() (dAtA []byte, err error) {
size := m.Size()
dAtA = make([]byte, size)
n, err := m.MarshalToSizedBuffer(dAtA[:size])
if err != nil {
return nil, err
}
return dAtA[:n], nil
}
// MarshalTo marshals m to dst.
func (m *Entry) MarshalTo(dAtA []byte) (int, error) {
size := m.Size()
return m.MarshalToSizedBuffer(dAtA[:size])
}
// MarshalToSizedBuffer marshals m to the sized buffer.
func (m *Entry) MarshalToSizedBuffer(dAtA []byte) (int, error) {
i := len(dAtA)
_ = i
var l int
_ = l
if len(m.Line) > 0 {
i -= len(m.Line)
copy(dAtA[i:], m.Line)
i = encodeVarintPush(dAtA, i, uint64(len(m.Line)))
i--
dAtA[i] = 0x12
}
n7, err7 := stdTimeMarshalTo(m.Timestamp, dAtA[i-sizeOfStdTime(m.Timestamp):])
if err7 != nil {
return 0, err7
}
i -= n7
i = encodeVarintPush(dAtA, i, uint64(n7))
i--
dAtA[i] = 0xa
return len(dAtA) - i, nil
}
// Unmarshal unmarshals the given data into m.
func (m *Stream) Unmarshal(dAtA []byte) error {
l := len(dAtA)
iNdEx := 0
for iNdEx < l {
preIndex := iNdEx
var wire uint64
for shift := uint(0); ; shift += 7 {
if shift >= 64 {
return ErrIntOverflowPush
}
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := dAtA[iNdEx]
iNdEx++
wire |= uint64(b&0x7F) << shift
if b < 0x80 {
break
}
}
fieldNum := int32(wire >> 3)
wireType := int(wire & 0x7)
if wireType == 4 {
return fmt.Errorf("proto: StreamAdapter: wiretype end group for non-group")
}
if fieldNum <= 0 {
return fmt.Errorf("proto: StreamAdapter: illegal tag %d (wire type %d)", fieldNum, wire)
}
switch fieldNum {
case 1:
if wireType != 2 {
return fmt.Errorf("proto: wrong wireType = %d for field Labels", wireType)
}
var stringLen uint64
for shift := uint(0); ; shift += 7 {
if shift >= 64 {
return ErrIntOverflowPush
}
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := dAtA[iNdEx]
iNdEx++
stringLen |= uint64(b&0x7F) << shift
if b < 0x80 {
break
}
}
intStringLen := int(stringLen)
if intStringLen < 0 {
return ErrInvalidLengthPush
}
postIndex := iNdEx + intStringLen
if postIndex < 0 {
return ErrInvalidLengthPush
}
if postIndex > l {
return io.ErrUnexpectedEOF
}
m.Labels = string(dAtA[iNdEx:postIndex])
iNdEx = postIndex
case 2:
if wireType != 2 {
return fmt.Errorf("proto: wrong wireType = %d for field Entries", wireType)
}
var msglen int
for shift := uint(0); ; shift += 7 {
if shift >= 64 {
return ErrIntOverflowPush
}
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := dAtA[iNdEx]
iNdEx++
msglen |= int(b&0x7F) << shift
if b < 0x80 {
break
}
}
if msglen < 0 {
return ErrInvalidLengthPush
}
postIndex := iNdEx + msglen
if postIndex < 0 {
return ErrInvalidLengthPush
}
if postIndex > l {
return io.ErrUnexpectedEOF
}
m.Entries = append(m.Entries, Entry{})
if err := m.Entries[len(m.Entries)-1].Unmarshal(dAtA[iNdEx:postIndex]); err != nil {
return err
}
iNdEx = postIndex
case 3:
if wireType != 0 {
return fmt.Errorf("proto: wrong wireType = %d for field Hash", wireType)
}
m.Hash = 0
for shift := uint(0); ; shift += 7 {
if shift >= 64 {
return ErrIntOverflowPush
}
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := dAtA[iNdEx]
iNdEx++
m.Hash |= uint64(b&0x7F) << shift
if b < 0x80 {
break
}
}
default:
iNdEx = preIndex
skippy, err := skipPush(dAtA[iNdEx:])
if err != nil {
return err
}
if skippy < 0 {
return ErrInvalidLengthPush
}
if (iNdEx + skippy) < 0 {
return ErrInvalidLengthPush
}
if (iNdEx + skippy) > l {
return io.ErrUnexpectedEOF
}
iNdEx += skippy
}
}
if iNdEx > l {
return io.ErrUnexpectedEOF
}
return nil
}
// Unmarshal unmarshals the given data into m.
func (m *Entry) Unmarshal(dAtA []byte) error {
l := len(dAtA)
iNdEx := 0
for iNdEx < l {
preIndex := iNdEx
var wire uint64
for shift := uint(0); ; shift += 7 {
if shift >= 64 {
return ErrIntOverflowPush
}
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := dAtA[iNdEx]
iNdEx++
wire |= uint64(b&0x7F) << shift
if b < 0x80 {
break
}
}
fieldNum := int32(wire >> 3)
wireType := int(wire & 0x7)
if wireType == 4 {
return fmt.Errorf("proto: EntryAdapter: wiretype end group for non-group")
}
if fieldNum <= 0 {
return fmt.Errorf("proto: EntryAdapter: illegal tag %d (wire type %d)", fieldNum, wire)
}
switch fieldNum {
case 1:
if wireType != 2 {
return fmt.Errorf("proto: wrong wireType = %d for field Timestamp", wireType)
}
var msglen int
for shift := uint(0); ; shift += 7 {
if shift >= 64 {
return ErrIntOverflowPush
}
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := dAtA[iNdEx]
iNdEx++
msglen |= int(b&0x7F) << shift
if b < 0x80 {
break
}
}
if msglen < 0 {
return ErrInvalidLengthPush
}
postIndex := iNdEx + msglen
if postIndex < 0 {
return ErrInvalidLengthPush
}
if postIndex > l {
return io.ErrUnexpectedEOF
}
if err := stdTimeUnmarshal(&m.Timestamp, dAtA[iNdEx:postIndex]); err != nil {
return err
}
iNdEx = postIndex
case 2:
if wireType != 2 {
return fmt.Errorf("proto: wrong wireType = %d for field Line", wireType)
}
var stringLen uint64
for shift := uint(0); ; shift += 7 {
if shift >= 64 {
return ErrIntOverflowPush
}
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := dAtA[iNdEx]
iNdEx++
stringLen |= uint64(b&0x7F) << shift
if b < 0x80 {
break
}
}
intStringLen := int(stringLen)
if intStringLen < 0 {
return ErrInvalidLengthPush
}
postIndex := iNdEx + intStringLen
if postIndex < 0 {
return ErrInvalidLengthPush
}
if postIndex > l {
return io.ErrUnexpectedEOF
}
m.Line = string(dAtA[iNdEx:postIndex])
iNdEx = postIndex
default:
iNdEx = preIndex
skippy, err := skipPush(dAtA[iNdEx:])
if err != nil {
return err
}
if skippy < 0 {
return ErrInvalidLengthPush
}
if (iNdEx + skippy) < 0 {
return ErrInvalidLengthPush
}
if (iNdEx + skippy) > l {
return io.ErrUnexpectedEOF
}
iNdEx += skippy
}
}
if iNdEx > l {
return io.ErrUnexpectedEOF
}
return nil
}
// Size returns the size of the serialized Stream.
func (m *Stream) Size() (n int) {
if m == nil {
return 0
}
var l int
_ = l
l = len(m.Labels)
if l > 0 {
n += 1 + l + sovPush(uint64(l))
}
if len(m.Entries) > 0 {
for _, e := range m.Entries {
l = e.Size()
n += 1 + l + sovPush(uint64(l))
}
}
if m.Hash != 0 {
n += 1 + sovPush(m.Hash)
}
return n
}
// Size returns the size of the serialized Entry
func (m *Entry) Size() (n int) {
if m == nil {
return 0
}
var l int
_ = l
l = sizeOfStdTime(m.Timestamp)
n += 1 + l + sovPush(uint64(l))
l = len(m.Line)
if l > 0 {
n += 1 + l + sovPush(uint64(l))
}
return n
}
// Equal returns true if the two Streams are equal.
func (m *Stream) Equal(that interface{}) bool {
if that == nil {
return m == nil
}
that1, ok := that.(*Stream)
if !ok {
that2, ok := that.(Stream)
if ok {
that1 = &that2
} else {
return false
}
}
if that1 == nil {
return m == nil
} else if m == nil {
return false
}
if m.Labels != that1.Labels {
return false
}
if len(m.Entries) != len(that1.Entries) {
return false
}
for i := range m.Entries {
if !m.Entries[i].Equal(that1.Entries[i]) {
return false
}
}
return m.Hash == that1.Hash
}
// Equal returns true if the two Entries are equal.
func (m *Entry) Equal(that interface{}) bool {
if that == nil {
return m == nil
}
that1, ok := that.(*Entry)
if !ok {
that2, ok := that.(Entry)
if ok {
that1 = &that2
} else {
return false
}
}
if that1 == nil {
return m == nil
} else if m == nil {
return false
}
if !m.Timestamp.Equal(that1.Timestamp) {
return false
}
if m.Line != that1.Line {
return false
}
return true
}

View file

@ -6,6 +6,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vlinsert/elasticsearch"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vlinsert/jsonline"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vlinsert/loki"
)
// Init initializes vlinsert
@ -33,6 +34,9 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
case strings.HasPrefix(path, "/elasticsearch/"):
path = strings.TrimPrefix(path, "/elasticsearch")
return elasticsearch.RequestHandler(path, w, r)
case strings.HasPrefix(path, "/loki/"):
path = strings.TrimPrefix(path, "/loki")
return loki.RequestHandler(path, w, r)
default:
return false
}

View file

@ -12,8 +12,8 @@ import (
var (
maxSortBufferSize = flagutil.NewBytes("select.maxSortBufferSize", 1024*1024, "Query results from /select/logsql/query are automatically sorted by _time "+
"if their summary size doesn't exceed this value; otherwise query results are streamed in the response without sorting; "+
"too big value for this flag may result in high memory usage, since the sorting is performed in memory")
"if their summary size doesn't exceed this value; otherwise, query results are streamed in the response without sorting; "+
"too big value for this flag may result in high memory usage since the sorting is performed in memory")
)
// ProcessQueryRequest handles /select/logsql/query request

View file

@ -1,14 +1,14 @@
{
"files": {
"main.css": "./static/css/main.f5cb3747.css",
"main.js": "./static/js/main.46d11611.js",
"static/js/27.c1ccfd29.chunk.js": "./static/js/27.c1ccfd29.chunk.js",
"main.css": "./static/css/main.5f461a27.css",
"main.js": "./static/js/main.7566144a.js",
"static/js/522.b5ae4365.chunk.js": "./static/js/522.b5ae4365.chunk.js",
"static/media/Lato-Regular.ttf": "./static/media/Lato-Regular.d714fec1633b69a9c2e9.ttf",
"static/media/Lato-Bold.ttf": "./static/media/Lato-Bold.32360ba4b57802daa4d6.ttf",
"index.html": "./index.html"
},
"entrypoints": [
"static/css/main.f5cb3747.css",
"static/js/main.46d11611.js"
"static/css/main.5f461a27.css",
"static/js/main.7566144a.js"
]
}

View file

@ -1 +1 @@
<!doctype html><html lang="en"><head><meta charset="utf-8"/><link rel="icon" href="./favicon.ico"/><meta name="viewport" content="width=device-width,initial-scale=1,maximum-scale=1,user-scalable=no"/><meta name="theme-color" content="#000000"/><meta name="description" content="UI for VictoriaMetrics"/><link rel="apple-touch-icon" href="./apple-touch-icon.png"/><link rel="icon" type="image/png" sizes="32x32" href="./favicon-32x32.png"><link rel="manifest" href="./manifest.json"/><title>VM UI</title><script src="./dashboards/index.js" type="module"></script><meta name="twitter:card" content="summary_large_image"><meta name="twitter:image" content="./preview.jpg"><meta name="twitter:title" content="UI for VictoriaMetrics"><meta name="twitter:description" content="Explore and troubleshoot your VictoriaMetrics data"><meta name="twitter:site" content="@VictoriaMetrics"><meta property="og:title" content="Metric explorer for VictoriaMetrics"><meta property="og:description" content="Explore and troubleshoot your VictoriaMetrics data"><meta property="og:image" content="./preview.jpg"><meta property="og:type" content="website"><script defer="defer" src="./static/js/main.46d11611.js"></script><link href="./static/css/main.f5cb3747.css" rel="stylesheet"></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="root"></div></body></html>
<!doctype html><html lang="en"><head><meta charset="utf-8"/><link rel="icon" href="./favicon.ico"/><meta name="viewport" content="width=device-width,initial-scale=1,maximum-scale=1,user-scalable=no"/><meta name="theme-color" content="#000000"/><meta name="description" content="UI for VictoriaMetrics"/><link rel="apple-touch-icon" href="./apple-touch-icon.png"/><link rel="icon" type="image/png" sizes="32x32" href="./favicon-32x32.png"><link rel="manifest" href="./manifest.json"/><title>VM UI</title><script src="./dashboards/index.js" type="module"></script><meta name="twitter:card" content="summary_large_image"><meta name="twitter:image" content="./preview.jpg"><meta name="twitter:title" content="UI for VictoriaMetrics"><meta name="twitter:description" content="Explore and troubleshoot your VictoriaMetrics data"><meta name="twitter:site" content="@VictoriaMetrics"><meta property="og:title" content="Metric explorer for VictoriaMetrics"><meta property="og:description" content="Explore and troubleshoot your VictoriaMetrics data"><meta property="og:image" content="./preview.jpg"><meta property="og:type" content="website"><script defer="defer" src="./static/js/main.7566144a.js"></script><link href="./static/css/main.5f461a27.css" rel="stylesheet"></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="root"></div></body></html>

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View file

@ -1 +0,0 @@
"use strict";(self.webpackChunkvmui=self.webpackChunkvmui||[]).push([[27],{27:function(e,t,n){n.r(t),n.d(t,{getCLS:function(){return y},getFCP:function(){return g},getFID:function(){return C},getLCP:function(){return P},getTTFB:function(){return D}});var i,r,a,o,u=function(e,t){return{name:e,value:void 0===t?-1:t,delta:0,entries:[],id:"v2-".concat(Date.now(),"-").concat(Math.floor(8999999999999*Math.random())+1e12)}},c=function(e,t){try{if(PerformanceObserver.supportedEntryTypes.includes(e)){if("first-input"===e&&!("PerformanceEventTiming"in self))return;var n=new PerformanceObserver((function(e){return e.getEntries().map(t)}));return n.observe({type:e,buffered:!0}),n}}catch(e){}},f=function(e,t){var n=function n(i){"pagehide"!==i.type&&"hidden"!==document.visibilityState||(e(i),t&&(removeEventListener("visibilitychange",n,!0),removeEventListener("pagehide",n,!0)))};addEventListener("visibilitychange",n,!0),addEventListener("pagehide",n,!0)},s=function(e){addEventListener("pageshow",(function(t){t.persisted&&e(t)}),!0)},m=function(e,t,n){var i;return function(r){t.value>=0&&(r||n)&&(t.delta=t.value-(i||0),(t.delta||void 0===i)&&(i=t.value,e(t)))}},v=-1,p=function(){return"hidden"===document.visibilityState?0:1/0},d=function(){f((function(e){var t=e.timeStamp;v=t}),!0)},l=function(){return v<0&&(v=p(),d(),s((function(){setTimeout((function(){v=p(),d()}),0)}))),{get firstHiddenTime(){return v}}},g=function(e,t){var n,i=l(),r=u("FCP"),a=function(e){"first-contentful-paint"===e.name&&(f&&f.disconnect(),e.startTime<i.firstHiddenTime&&(r.value=e.startTime,r.entries.push(e),n(!0)))},o=window.performance&&performance.getEntriesByName&&performance.getEntriesByName("first-contentful-paint")[0],f=o?null:c("paint",a);(o||f)&&(n=m(e,r,t),o&&a(o),s((function(i){r=u("FCP"),n=m(e,r,t),requestAnimationFrame((function(){requestAnimationFrame((function(){r.value=performance.now()-i.timeStamp,n(!0)}))}))})))},h=!1,T=-1,y=function(e,t){h||(g((function(e){T=e.value})),h=!0);var n,i=function(t){T>-1&&e(t)},r=u("CLS",0),a=0,o=[],v=function(e){if(!e.hadRecentInput){var t=o[0],i=o[o.length-1];a&&e.startTime-i.startTime<1e3&&e.startTime-t.startTime<5e3?(a+=e.value,o.push(e)):(a=e.value,o=[e]),a>r.value&&(r.value=a,r.entries=o,n())}},p=c("layout-shift",v);p&&(n=m(i,r,t),f((function(){p.takeRecords().map(v),n(!0)})),s((function(){a=0,T=-1,r=u("CLS",0),n=m(i,r,t)})))},E={passive:!0,capture:!0},w=new Date,L=function(e,t){i||(i=t,r=e,a=new Date,F(removeEventListener),S())},S=function(){if(r>=0&&r<a-w){var e={entryType:"first-input",name:i.type,target:i.target,cancelable:i.cancelable,startTime:i.timeStamp,processingStart:i.timeStamp+r};o.forEach((function(t){t(e)})),o=[]}},b=function(e){if(e.cancelable){var t=(e.timeStamp>1e12?new Date:performance.now())-e.timeStamp;"pointerdown"==e.type?function(e,t){var n=function(){L(e,t),r()},i=function(){r()},r=function(){removeEventListener("pointerup",n,E),removeEventListener("pointercancel",i,E)};addEventListener("pointerup",n,E),addEventListener("pointercancel",i,E)}(t,e):L(t,e)}},F=function(e){["mousedown","keydown","touchstart","pointerdown"].forEach((function(t){return e(t,b,E)}))},C=function(e,t){var n,a=l(),v=u("FID"),p=function(e){e.startTime<a.firstHiddenTime&&(v.value=e.processingStart-e.startTime,v.entries.push(e),n(!0))},d=c("first-input",p);n=m(e,v,t),d&&f((function(){d.takeRecords().map(p),d.disconnect()}),!0),d&&s((function(){var a;v=u("FID"),n=m(e,v,t),o=[],r=-1,i=null,F(addEventListener),a=p,o.push(a),S()}))},k={},P=function(e,t){var 
n,i=l(),r=u("LCP"),a=function(e){var t=e.startTime;t<i.firstHiddenTime&&(r.value=t,r.entries.push(e),n())},o=c("largest-contentful-paint",a);if(o){n=m(e,r,t);var v=function(){k[r.id]||(o.takeRecords().map(a),o.disconnect(),k[r.id]=!0,n(!0))};["keydown","click"].forEach((function(e){addEventListener(e,v,{once:!0,capture:!0})})),f(v,!0),s((function(i){r=u("LCP"),n=m(e,r,t),requestAnimationFrame((function(){requestAnimationFrame((function(){r.value=performance.now()-i.timeStamp,k[r.id]=!0,n(!0)}))}))}))}},D=function(e){var t,n=u("TTFB");t=function(){try{var t=performance.getEntriesByType("navigation")[0]||function(){var e=performance.timing,t={entryType:"navigation",startTime:0};for(var n in e)"navigationStart"!==n&&"toJSON"!==n&&(t[n]=Math.max(e[n]-e.navigationStart,0));return t}();if(n.value=n.delta=t.responseStart,n.value<0||n.value>performance.now())return;n.entries=[t],e(n)}catch(e){}},"complete"===document.readyState?setTimeout(t,0):addEventListener("load",(function(){return setTimeout(t,0)}))}}}]);

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View file

@ -7,7 +7,7 @@
/*! regenerator-runtime -- Copyright (c) 2014-present, Facebook, Inc. -- license (MIT): https://github.com/facebook/regenerator/blob/main/LICENSE */
/**
* @remix-run/router v1.5.0
* @remix-run/router v1.7.2
*
* Copyright (c) Remix Software Inc.
*
@ -18,7 +18,7 @@
*/
/**
* React Router DOM v6.10.0
* React Router DOM v6.14.2
*
* Copyright (c) Remix Software Inc.
*
@ -29,7 +29,7 @@
*/
/**
* React Router v6.10.0
* React Router v6.14.2
*
* Copyright (c) Remix Software Inc.
*

View file

@ -22,8 +22,8 @@ var (
storageDataPath = flag.String("storageDataPath", "victoria-logs-data", "Path to directory with the VictoriaLogs data; "+
"see https://docs.victoriametrics.com/VictoriaLogs/#storage")
inmemoryDataFlushInterval = flag.Duration("inmemoryDataFlushInterval", 5*time.Second, "The interval for guaranteed saving of in-memory data to disk. "+
"The saved data survives unclean shutdown such as OOM crash, hardware reset, SIGKILL, etc. "+
"Bigger intervals may help increasing lifetime of flash storage with limited write cycles (e.g. Raspberry PI). "+
"The saved data survives unclean shutdowns such as OOM crash, hardware reset, SIGKILL, etc. "+
"Bigger intervals may help increase the lifetime of flash storage with limited write cycles (e.g. Raspberry PI). "+
"Smaller intervals increase disk IO load. Minimum supported value is 1s")
logNewStreams = flag.Bool("logNewStreams", false, "Whether to log creation of new streams; this can be useful for debugging of high cardinality issues with log streams; "+
"see https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#stream-fields ; see also -logIngestedRows")

View file

@ -54,7 +54,7 @@ and sending the data to the Prometheus-compatible remote storage:
The path can point either to local file or to http url. `vmagent` doesn't support some sections of Prometheus config file,
so you may need either to delete these sections or to run `vmagent` with `-promscrape.config.strictParse=false` command-line flag.
In this case `vmagent` ignores unsupported sections. See [the list of unsupported sections](#unsupported-prometheus-config-sections).
* `-remoteWrite.url` with Prometheus-compatible remote storage endpoint such as VictoriaMetrics.
* `-remoteWrite.url` with the Prometheus-compatible remote storage endpoint, such as VictoriaMetrics, to which the collected data is sent.
Example command for writing the data received via [supported push-based protocols](#how-to-push-data-to-vmagent)
to [single-node VictoriaMetrics](https://docs.victoriametrics.com/) located at `victoria-metrics-host:8428`:
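The concrete example command continues past this hunk; a minimal sketch of it (paths and host names are placeholders):

```
/path/to/vmagent -remoteWrite.url=https://victoria-metrics-host:8428/api/v1/write
```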
@ -158,6 +158,20 @@ and then it sends the buffered data to the remote storage in order to prevent da
so there is no need in specifying multiple `-remoteWrite.url` flags when writing data to the same cluster.
See [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#replication-and-data-safety).
### Sharding among remote storages
By default, `vmagent` replicates data among the remote storage systems enumerated via the `-remoteWrite.url` command-line flag.
If the `-remoteWrite.shardByURL` command-line flag is set, then `vmagent` instead evenly spreads
the outgoing [time series](https://docs.victoriametrics.com/keyConcepts.html#time-series)
among all the remote storage systems enumerated via `-remoteWrite.url`. Note that samples for the same
time series are routed to the same remote storage system if `-remoteWrite.shardByURL` flag is specified.
This allows building scalable data processing pipelines when a single remote storage cannot keep up with the data ingestion workload.
For example, this allows building horizontally scalable [stream aggregation](https://docs.victoriametrics.com/stream-aggregation.html)
by routing outgoing samples for the same time series of [counter](https://docs.victoriametrics.com/keyConcepts.html#counter)
and [histogram](https://docs.victoriametrics.com/keyConcepts.html#histogram) types from top-level `vmagent` instances
to the same second-level `vmagent` instance, so they are aggregated properly.
See also [how to scrape big number of targets](#scraping-big-number-of-targets).
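As an illustration, a minimal sketch of a `vmagent` invocation which shards the collected data between two remote storage systems (host names and paths are placeholders):

```
/path/to/vmagent -promscrape.config=/path/to/prometheus.yml \
  -remoteWrite.shardByURL \
  -remoteWrite.url=https://victoria-metrics-1:8428/api/v1/write \
  -remoteWrite.url=https://victoria-metrics-2:8428/api/v1/write
```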
### Relabeling and filtering
@ -465,6 +479,7 @@ with [additional enhancements](#relabeling-enhancements). The relabeling can be
This relabeling can be debugged by clicking the `debug` link at the corresponding target on the `http://vmagent:8429/targets` page
or on the `http://vmagent:8429/service-discovery` page. See [these docs](#relabel-debug) for details.
The link is unavailable if `vmagent` runs with `-promscrape.dropOriginalLabels` command-line flag.
* At the `scrape_config -> metric_relabel_configs` section in `-promscrape.config` file.
This relabeling is used for modifying labels in scraped metrics and for dropping unneeded metrics.
@ -512,8 +527,9 @@ The following articles contain useful information about Prometheus relabeling:
{% endraw %}
* An optional `if` filter can be used for conditional relabeling. The `if` filter may contain
arbitrary [time series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors).
For example, the following relabeling rule drops metrics, which don't match `foo{bar="baz"}` series selector, while leaving the rest of metrics:
arbitrary [time series selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
The `action` is performed only for [samples](https://docs.victoriametrics.com/keyConcepts.html#raw-samples), which match the provided `if` filter.
For example, the following relabeling rule keeps metrics matching `foo{bar="baz"}` series selector, while dropping the rest of metrics:
```yaml
- if: 'foo{bar="baz"}'
@ -528,6 +544,18 @@ The following articles contain useful information about Prometheus relabeling:
regex: 'foo;baz'
```
The `if` option may contain more than one filter. In this case the `action` is performed if at least a single filter
matches the given [sample](https://docs.victoriametrics.com/keyConcepts.html#raw-samples).
For example, the following relabeling rule adds `foo="bar"` label to samples with `job="foo"` or `instance="bar"` labels:
```yaml
- target_label: foo
replacement: bar
if:
- '{job="foo"}'
- '{instance="bar"}'
```
* The `regex` value can be split into multiple lines for improved readability and maintainability.
These lines are automatically joined with `|` char when parsed. For example, the following configs are equivalent:
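The concrete equivalent configs are outside this hunk; a sketch of the idea (metric names are placeholders):

```yaml
- action: keep_metrics
  regex: "metric_a|metric_b|foo_.+"
```

```yaml
- action: keep_metrics
  regex:
  - "metric_a"
  - "metric_b"
  - "foo_.+"
```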
@ -645,18 +673,20 @@ provide the following tools for debugging target-level and metric-level relabeli
- Target-level debugging (e.g. `relabel_configs` section at [scrape_configs](https://docs.victoriametrics.com/sd_configs.html#scrape_configs))
can be performed by navigating to `http://vmagent:8429/targets` page (`http://victoriametrics:8428/targets` page for single-node VictoriaMetrics)
and clicking the `debug target relabeling` link at the target, which must be debugged.
The link is unavailable if `vmagent` runs with `-promscrape.dropOriginalLabels` command-line flag.
The opened page shows step-by-step results for the actual target relabeling rules applied to the discovered target labels.
The page shows also the target URL generated after applying all the relabeling rules.
The `http://vmagent:8429/targets` page shows only active targets. If you need to understand why some target
is dropped during the relabeling, then navigate to `http://vmagent:8428/service-discovery` page
(`http://victoriametrics:8428/service-discovery` for single-node VictoriaMetrics), find the dropped target
and click the `debug` link there. The opened page shows step-by-step results for the actual relabeling rules,
which result to target drop.
and click the `debug` link there. The link is unavailable if `vmagent` runs with `-promscrape.dropOriginalLabels` command-line flag.
The opened page shows step-by-step results for the actual relabeling rules, which result in the target being dropped.
- Metric-level debugging (e.g. `metric_relabel_configs` section at [scrape_configs](https://docs.victoriametrics.com/sd_configs.html#scrape_configs)
can be performed by navigating to `http://vmagent:8429/targets` page (`http://victoriametrics:8428/targets` page for single-node VictoriaMetrics)
and clicking the `debug metrics relabeling` link at the target, which must be debugged.
The link is unavailable if `vmagent` runs with `-promscrape.dropOriginalLabels` command-line flag.
The opened page shows step-by-step results for the actual metric relabeling rules applied to the given target labels.
## Prometheus staleness markers
@ -750,6 +780,9 @@ If each target is scraped by multiple `vmagent` instances, then data deduplicati
The `-dedup.minScrapeInterval` must be set to the `scrape_interval` configured at `-promscrape.config`.
See [these docs](https://docs.victoriametrics.com/#deduplication) for details.
See also [how to shard data among multiple remote storage systems](#sharding-among-remote-storages).
## High availability
It is possible to run multiple **identically configured** `vmagent` instances or `vmagent`
@ -1097,13 +1130,13 @@ It may be needed to build `vmagent` from source code when developing or testing
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.
2. Run `make vmagent` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
1. Run `make vmagent` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds the `vmagent` binary and puts it into the `bin` folder.
### Production build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make vmagent-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
1. Run `make vmagent-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmagent-prod` binary and puts it into the `bin` folder.
### Building docker images
@ -1126,13 +1159,13 @@ ARM build may run on Raspberry Pi or on [energy-efficient ARM servers](https://b
### Development ARM build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.
2. Run `make vmagent-linux-arm` or `make vmagent-linux-arm64` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics)
1. Run `make vmagent-linux-arm` or `make vmagent-linux-arm64` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics)
It builds `vmagent-linux-arm` or `vmagent-linux-arm64` binary respectively and puts it into the `bin` folder.
### Production ARM build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make vmagent-linux-arm-prod` or `make vmagent-linux-arm64-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
1. Run `make vmagent-linux-arm-prod` or `make vmagent-linux-arm64-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmagent-linux-arm-prod` or `vmagent-linux-arm64-prod` binary respectively and puts it into the `bin` folder.
## Profiling
@ -1196,9 +1229,9 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
-dryRun
Whether to check config files without running vmagent. The following files are checked: -promscrape.config, -remoteWrite.relabelConfig, -remoteWrite.urlRelabelConfig, -remoteWrite.streamAggr.config . Unknown config entries aren't allowed in -promscrape.config by default. This can be changed by passing -promscrape.config.strictParse=false command-line flag
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default, only IPv4 TCP and UDP is used
Whether to enable IPv6 for listening and dialing. By default, only IPv4 TCP and UDP are used
-envflag.enable
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
Whether to enable reading flags from environment variables in addition to the command line. Command line flag values have priority over values from environment vars. Flags are read only from the command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
@ -1226,9 +1259,9 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
Password for HTTP server's Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
Username for HTTP server's Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
TCP address to listen for http connections. Set this flag to empty value in order to disable listening on any port. This mode may be useful for running multiple vmagent instances on the same server. Note that /targets and /metrics pages aren't available if -httpListenAddr=''. See also -httpListenAddr.useProxyProtocol (default ":8429")
-httpListenAddr.useProxyProtocol
@ -1263,7 +1296,7 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
-internStringDisableCache
Whether to disable caches for interned strings. This may reduce memory usage at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringCacheExpireDuration and -internStringMaxLen
-internStringMaxLen int
The maximum length for strings to intern. Lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringDisableCache and -internStringCacheExpireDuration (default 500)
The maximum length for strings to intern. A lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringDisableCache and -internStringCacheExpireDuration (default 500)
-kafka.consumer.topic array
Kafka topic names for data consumption. This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
Supports an array of values separated by comma or specified via multiple flags.
@ -1310,15 +1343,15 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
-loggerWarnsPerSecondLimit int
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
-maxConcurrentInserts int
The maximum number of concurrent insert requests. Default value should work for most cases, since it minimizes the memory usage. The default value can be increased when clients send data over slow networks. See also -insert.maxQueueDuration (default 8)
The maximum number of concurrent insert requests. The default value should work for most cases, since it minimizes memory usage. The default value can be increased when clients send data over slow networks. See also -insert.maxQueueDuration (default 8)
-maxInsertRequestSize size
The maximum size in bytes of a single Prometheus remote_write API request
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 33554432)
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-opentsdbHTTPListenAddr string
@ -1539,19 +1572,24 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
-remoteWrite.sendTimeout array
Timeout for sending a single block of data to the corresponding -remoteWrite.url
Supports array of values separated by comma or specified via multiple flags.
-remoteWrite.shardByURL
Whether to shard outgoing series across all the remote storage systems enumerated via -remoteWrite.url . By default the data is replicated across all the -remoteWrite.url . See https://docs.victoriametrics.com/vmagent.html#sharding-among-remote-storages
-remoteWrite.showURL
Whether to show -remoteWrite.url in the exported metrics. It is hidden by default, since it can contain sensitive info such as auth key
-remoteWrite.significantFigures array
The number of significant figures to leave in metric values before writing them to remote storage. See https://en.wikipedia.org/wiki/Significant_figures . Zero value saves all the significant figures. This option may be used for improving data compression for the stored metrics. See also -remoteWrite.roundDigits
Supports array of values separated by comma or specified via multiple flags.
-remoteWrite.streamAggr.config array
Optional path to file with stream aggregation config. See https://docs.victoriametrics.com/stream-aggregation.html . See also -remoteWrite.streamAggr.keepInput and -remoteWrite.streamAggr.dedupInterval
Optional path to file with stream aggregation config. See https://docs.victoriametrics.com/stream-aggregation.html . See also -remoteWrite.streamAggr.keepInput, -remoteWrite.streamAggr.dropInput and -remoteWrite.streamAggr.dedupInterval
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.streamAggr.dedupInterval array
Input samples are de-duplicated with this interval before being aggregated. Only the last sample per each time series per each interval is aggregated if the interval is greater than zero
Supports array of values separated by comma or specified via multiple flags.
-remoteWrite.streamAggr.dropInput array
Whether to drop all the input samples after the aggregation with -remoteWrite.streamAggr.config. By default, only aggregated samples are dropped, while the remaining samples are written to the corresponding -remoteWrite.url . See also -remoteWrite.streamAggr.keepInput and https://docs.victoriametrics.com/stream-aggregation.html
Supports array of values separated by comma or specified via multiple flags.
-remoteWrite.streamAggr.keepInput array
Whether to keep input samples after the aggregation with -remoteWrite.streamAggr.config. By default, the input is dropped after the aggregation, so only the aggregate data is sent to the -remoteWrite.url. See https://docs.victoriametrics.com/stream-aggregation.html
Whether to keep all the input samples after the aggregation with -remoteWrite.streamAggr.config. By default, only aggregated samples are dropped, while the remaining samples are written to the corresponding -remoteWrite.url . See also -remoteWrite.streamAggr.dropInput and https://docs.victoriametrics.com/stream-aggregation.html
Supports array of values separated by comma or specified via multiple flags.
-remoteWrite.tlsCAFile array
Optional path to TLS CA file to use for verifying connections to the corresponding -remoteWrite.url. By default, system CA is used
@ -1571,7 +1609,7 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
-remoteWrite.tmpDataPath string
Path to directory where temporary data for remote write component is stored. See also -remoteWrite.maxDiskUsagePerURL (default "vmagent-remotewrite-data")
-remoteWrite.url array
Remote storage URL to write data to. It must support either VictoriaMetrics remote write protocol or Prometheus remote_write protocol. Example url: http://<victoriametrics-host>:8428/api/v1/write . Pass multiple -remoteWrite.url options in order to replicate the collected data to multiple remote storage systems. See also -remoteWrite.multitenantURL
Remote storage URL to write data to. It must support either VictoriaMetrics remote write protocol or Prometheus remote_write protocol. Example url: http://<victoriametrics-host>:8428/api/v1/write . Pass multiple -remoteWrite.url options in order to replicate the collected data to multiple remote storage systems. The data can be sharded among the configured remote storage systems if -remoteWrite.shardByURL flag is set. See also -remoteWrite.multitenantURL
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.urlRelabelConfig array
Optional path to relabel configs for the corresponding -remoteWrite.url. See also -remoteWrite.relabelConfig. The path can point either to local file or to http url. See https://docs.victoriametrics.com/vmagent.html#relabeling

View file

@ -33,10 +33,14 @@ import (
var (
remoteWriteURLs = flagutil.NewArrayString("remoteWrite.url", "Remote storage URL to write data to. It must support either VictoriaMetrics remote write protocol "+
"or Prometheus remote_write protocol. Example url: http://<victoriametrics-host>:8428/api/v1/write . "+
"Pass multiple -remoteWrite.url options in order to replicate the collected data to multiple remote storage systems. See also -remoteWrite.multitenantURL")
"Pass multiple -remoteWrite.url options in order to replicate the collected data to multiple remote storage systems. "+
"The data can be sharded among the configured remote storage systems if -remoteWrite.shardByURL flag is set. "+
"See also -remoteWrite.multitenantURL")
remoteWriteMultitenantURLs = flagutil.NewArrayString("remoteWrite.multitenantURL", "Base path for multitenant remote storage URL to write data to. "+
"See https://docs.victoriametrics.com/vmagent.html#multitenancy for details. Example url: http://<vminsert>:8480 . "+
"Pass multiple -remoteWrite.multitenantURL flags in order to replicate data to multiple remote storage systems. See also -remoteWrite.url")
shardByURL = flag.Bool("remoteWrite.shardByURL", false, "Whether to shard outgoing series across all the remote storage systems enumerated via -remoteWrite.url . "+
"By default the data is replicated across all the -remoteWrite.url . See https://docs.victoriametrics.com/vmagent.html#sharding-among-remote-storages")
tmpDataPath = flag.String("remoteWrite.tmpDataPath", "vmagent-remotewrite-data", "Path to directory where temporary data for remote write component is stored. "+
"See also -remoteWrite.maxDiskUsagePerURL")
keepDanglingQueues = flag.Bool("remoteWrite.keepDanglingQueues", false, "Keep persistent queues contents at -remoteWrite.tmpDataPath in case there are no matching -remoteWrite.url. "+
@ -67,10 +71,13 @@ var (
streamAggrConfig = flagutil.NewArrayString("remoteWrite.streamAggr.config", "Optional path to file with stream aggregation config. "+
"See https://docs.victoriametrics.com/stream-aggregation.html . "+
"See also -remoteWrite.streamAggr.keepInput and -remoteWrite.streamAggr.dedupInterval")
streamAggrKeepInput = flagutil.NewArrayBool("remoteWrite.streamAggr.keepInput", "Whether to keep input samples after the aggregation with -remoteWrite.streamAggr.config. "+
"By default, the input is dropped after the aggregation, so only the aggregate data is sent to the -remoteWrite.url. "+
"See https://docs.victoriametrics.com/stream-aggregation.html")
"See also -remoteWrite.streamAggr.keepInput, -remoteWrite.streamAggr.dropInput and -remoteWrite.streamAggr.dedupInterval")
streamAggrKeepInput = flagutil.NewArrayBool("remoteWrite.streamAggr.keepInput", "Whether to keep all the input samples after the aggregation "+
"with -remoteWrite.streamAggr.config. By default, only aggregates samples are dropped, while the remaining samples "+
"are written to the corresponding -remoteWrite.url . See also -remoteWrite.streamAggr.dropInput and https://docs.victoriametrics.com/stream-aggregation.html")
streamAggrDropInput = flagutil.NewArrayBool("remoteWrite.streamAggr.dropInput", "Whether to drop all the input samples after the aggregation "+
"with -remoteWrite.streamAggr.config. By default, only aggregates samples are dropped, while the remaining samples "+
"are written to the corresponding -remoteWrite.url . See also -remoteWrite.streamAggr.keepInput and https://docs.victoriametrics.com/stream-aggregation.html")
streamAggrDedupInterval = flagutil.NewArrayDuration("remoteWrite.streamAggr.dedupInterval", "Input samples are de-duplicated with this interval before being aggregated. "+
"Only the last sample per each time series per each interval is aggregated if the interval is greater than zero")
)
@ -93,7 +100,7 @@ func MultitenancyEnabled() bool {
}
// Contains the current relabelConfigs.
var allRelabelConfigs atomic.Value
var allRelabelConfigs atomic.Pointer[relabelConfigs]
// maxQueues limits the maximum value for `-remoteWrite.queues`. There is no sense in setting too high value,
// since it may lead to high memory usage due to big number of buffers.
@ -319,7 +326,7 @@ func Stop() {
// If at is nil, then the data is pushed to the configured `-remoteWrite.url`.
// If at isn't nil, the data is pushed to the configured `-remoteWrite.multitenantURL`.
//
// Note that wr may be modified by Push due to relabeling and rounding.
// Note that wr may be modified by Push because of relabeling and rounding.
func Push(at *auth.Token, wr *prompbmarshal.WriteRequest) {
if at == nil && len(*remoteWriteMultitenantURLs) > 0 {
// Write data to default tenant if at isn't set while -remoteWrite.multitenantURL is set.
@ -346,7 +353,7 @@ func Push(at *auth.Token, wr *prompbmarshal.WriteRequest) {
}
var rctx *relabelCtx
rcs := allRelabelConfigs.Load().(*relabelConfigs)
rcs := allRelabelConfigs.Load()
pcsGlobal := rcs.global
if pcsGlobal.Len() > 0 || len(labelsGlobal) > 0 {
rctx = getRelabelCtx()
@ -400,10 +407,46 @@ func pushBlockToRemoteStorages(rwctxs []*remoteWriteCtx, tssBlock []prompbmarsha
// Nothing to push
return
}
// Push block to remote storages in parallel in order to reduce the time needed for sending the data to multiple remote storage systems.
if len(rwctxs) == 1 {
// Fast path - just push data to the configured single remote storage
rwctxs[0].Push(tssBlock)
return
}
// We need to push tssBlock to multiple remote storages.
// This is either sharding or replication depending on -remoteWrite.shardByURL command-line flag value.
if *shardByURL {
// Shard the data among rwctxs
tssByURL := make([][]prompbmarshal.TimeSeries, len(rwctxs))
for _, ts := range tssBlock {
h := getLabelsHash(ts.Labels)
idx := h % uint64(len(tssByURL))
tssByURL[idx] = append(tssByURL[idx], ts)
}
// Push sharded data to remote storages in parallel in order to reduce
// the time needed for sending the data to multiple remote storage systems.
		var wg sync.WaitGroup
		for i, rwctx := range rwctxs {
			tssShard := tssByURL[i]
			if len(tssShard) == 0 {
				continue
			}
			// Count only the shards which actually start a goroutine,
			// so wg.Wait() below doesn't block forever when some shards are empty.
			wg.Add(1)
go func(rwctx *remoteWriteCtx, tss []prompbmarshal.TimeSeries) {
defer wg.Done()
rwctx.Push(tss)
}(rwctx, tssShard)
}
wg.Wait()
return
}
// Replicate data among rwctxs.
// Push block to remote storages in parallel in order to reduce
// the time needed for sending the data to multiple remote storage systems.
var wg sync.WaitGroup
wg.Add(len(rwctxs))
for _, rwctx := range rwctxs {
wg.Add(1)
go func(rwctx *remoteWriteCtx) {
defer wg.Done()
rwctx.Push(tssBlock)
@ -507,6 +550,7 @@ type remoteWriteCtx struct {
sas atomic.Pointer[streamaggr.Aggregators]
streamAggrKeepInput bool
streamAggrDropInput bool
pss []*pendingSeries
pssNextIdx uint64
@ -579,6 +623,7 @@ func newRemoteWriteCtx(argIdx int, at *auth.Token, remoteWriteURL *url.URL, maxI
}
rwctx.sas.Store(sas)
rwctx.streamAggrKeepInput = streamAggrKeepInput.GetOptionalArg(argIdx)
rwctx.streamAggrDropInput = streamAggrDropInput.GetOptionalArg(argIdx)
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_streamaggr_config_reload_successful{path=%q}`, sasFile)).Set(1)
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_streamaggr_config_reload_success_timestamp_seconds{path=%q}`, sasFile)).Set(fasttime.UnixTimestamp())
}
@ -612,7 +657,7 @@ func (rwctx *remoteWriteCtx) Push(tss []prompbmarshal.TimeSeries) {
// Apply relabeling
var rctx *relabelCtx
var v *[]prompbmarshal.TimeSeries
rcs := allRelabelConfigs.Load().(*relabelConfigs)
rcs := allRelabelConfigs.Load()
pcs := rcs.perURL[rwctx.idx]
if pcs.Len() > 0 {
rctx = getRelabelCtx()
@ -620,7 +665,7 @@ func (rwctx *remoteWriteCtx) Push(tss []prompbmarshal.TimeSeries) {
// from affecting time series for other remoteWrite.url configs.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/467
// and https://github.com/VictoriaMetrics/VictoriaMetrics/issues/599
v = tssRelabelPool.Get().(*[]prompbmarshal.TimeSeries)
v = tssPool.Get().(*[]prompbmarshal.TimeSeries)
tss = append(*v, tss...)
rowsCountBeforeRelabel := getRowsCount(tss)
tss = rctx.applyRelabeling(tss, nil, pcs)
@ -632,20 +677,45 @@ func (rwctx *remoteWriteCtx) Push(tss []prompbmarshal.TimeSeries) {
// Apply stream aggregation if any
sas := rwctx.sas.Load()
sas.Push(tss)
if sas == nil || rwctx.streamAggrKeepInput {
// Push samples to the remote storage
rwctx.pushInternal(tss)
if sas != nil {
matchIdxs := matchIdxsPool.Get()
matchIdxs.B = sas.Push(tss, matchIdxs.B)
if !rwctx.streamAggrKeepInput {
if rctx == nil {
rctx = getRelabelCtx()
// Make a copy of tss before dropping aggregated series
v = tssPool.Get().(*[]prompbmarshal.TimeSeries)
tss = append(*v, tss...)
}
tss = dropAggregatedSeries(tss, matchIdxs.B, rwctx.streamAggrDropInput)
}
matchIdxsPool.Put(matchIdxs)
}
rwctx.pushInternal(tss)
// Return back relabeling contexts to the pool
if rctx != nil {
*v = prompbmarshal.ResetTimeSeries(tss)
tssRelabelPool.Put(v)
tssPool.Put(v)
putRelabelCtx(rctx)
}
}
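// matchIdxsPool is a pool of byte buffers re-used for the per-series match flags
// filled by the stream aggregation Push call above.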
var matchIdxsPool bytesutil.ByteBufferPool
// dropAggregatedSeries drops from src the series which were aggregated (matchIdxs[i] != 0).
// If dropInput is set, then all the input series are dropped.
func dropAggregatedSeries(src []prompbmarshal.TimeSeries, matchIdxs []byte, dropInput bool) []prompbmarshal.TimeSeries {
	dst := src[:0]
	if !dropInput {
		for i, match := range matchIdxs {
			if match != 0 {
				// Drop the series which participated in the aggregation.
				continue
			}
			dst = append(dst, src[i])
		}
	}
	tail := src[len(dst):]
	_ = prompbmarshal.ResetTimeSeries(tail)
	return dst
}
func (rwctx *remoteWriteCtx) pushInternal(tss []prompbmarshal.TimeSeries) {
pss := rwctx.pss
idx := atomic.AddUint64(&rwctx.pssNextIdx, 1) % uint64(len(pss))
@ -682,7 +752,7 @@ func (rwctx *remoteWriteCtx) reinitStreamAggr() {
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_streamaggr_config_reload_success_timestamp_seconds{path=%q}`, sasFile)).Set(fasttime.UnixTimestamp())
}
var tssRelabelPool = &sync.Pool{
var tssPool = &sync.Pool{
New: func() interface{} {
a := []prompbmarshal.TimeSeries{}
return &a

View file

@ -74,12 +74,12 @@ test-vmalert:
go test -v -race -cover ./app/vmalert/config
go test -v -race -cover ./app/vmalert/remotewrite
go test -v -race -cover ./app/vmalert/utils
go test -v -race -cover ./app/vmalert/unittest
run-vmalert: vmalert
./bin/vmalert -rule=app/vmalert/config/testdata/rules/rules2-good.rules \
-datasource.url=http://localhost:8428 \
-notifier.url=http://localhost:9093 \
-notifier.url=http://127.0.0.1:9093 \
-datasource.url=http://demo.robustperception.io:9090 \
-notifier.blackhole \
-remoteWrite.url=http://localhost:8428 \
-remoteRead.url=http://localhost:8428 \
-external.label=cluster=east-1 \
@ -103,6 +103,10 @@ replay-vmalert: vmalert
-replay.timeFrom=2021-05-11T07:21:43Z \
-replay.timeTo=2021-05-29T18:40:43Z
unittest-vmalert: vmalert
./bin/vmalert -unittestFile=app/vmalert/unittest/testdata/test1.yaml \
-unittestFile=app/vmalert/unittest/testdata/test2.yaml
vmalert-linux-amd64:
APP_NAME=vmalert CGO_ENABLED=1 GOOS=linux GOARCH=amd64 $(MAKE) app-local-goos-goarch

View file

@ -732,6 +732,249 @@ See full description for these flags in `./vmalert -help`.
* `query` template function is disabled for performance reasons (might be changed in future);
* `limit` group's param has no effect during replay (might be changed in future);
## Unit Testing for Rules
> Unit testing is available from v1.92.0.
> Unit tests do not respect `-clusterMode` for now.
You can use `vmalert` to run unit tests for alerting and recording rules.
In unit test mode vmalert performs the following actions:
* sets up an isolated VictoriaMetrics instance;
* simulates the periodic ingestion of time series;
* queries the ingested data for recording and alerting rules evaluation;
* tests whether the firing alerts or resulting recording rules match the expected results.
See how to run vmalert in unit test mode below:
```
# Run vmalert with one or multiple test files via -unittestFile cmd-line flag
./vmalert -unittestFile=test1.yaml -unittestFile=test2.yaml
```
vmalert is compatible with [Prometheus config format for tests](https://prometheus.io/docs/prometheus/latest/configuration/unit_testing_rules/#test-file-format)
except `promql_expr_test` field. Use `metricsql_expr_test` field name instead. The name is different because vmalert
validates and executes [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) expressions,
which aren't always backward compatible with [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/).
### Test file format
The configuration format for files specified in `-unittestFile` cmd-line flag is the following:
```
# Path to the files or http url containing [rule groups](https://docs.victoriametrics.com/vmalert.html#groups) configuration.
# Enterprise version of vmalert supports S3 and GCS paths to rules.
rule_files:
[ - <string> ]
# The evaluation interval for rules specified in `rule_files`
[ evaluation_interval: <duration> | default = 1m ]
# Groups listed below will be evaluated in order.
# Not all the groups need to be listed here. Groups that aren't listed are evaluated in the order they are defined in rule_files.
group_eval_order:
[ - <string> ]
# The list of unit test files to be checked during evaluation.
tests:
[ - <test_group> ]
```
#### `<test_group>`
```
# Interval between samples for input series
interval: <duration>
# Time series to persist into the database according to configured <interval> before running tests.
input_series:
[ - <series> ]
# Name of the test group, optional
[ name: <string> ]
# Unit tests for alerting rules
alert_rule_test:
[ - <alert_test_case> ]
# Unit tests for MetricsQL expressions.
metricsql_expr_test:
[ - <metricsql_expr_test> ]
# External labels accessible for templating.
external_labels:
[ <labelname>: <string> ... ]
```
#### `<series>`
```
# series in the following format '<metric name>{<label name>=<label value>, ...}'
# Examples:
# series_name{label1="value1", label2="value2"}
# go_goroutines{job="prometheus", instance="localhost:9090"}
series: <string>
# values support several special equations:
# 'a+bxc' becomes 'a a+b a+(2*b) a+(3*b) … a+(c*b)'
# Read this as series starts at a, then c further samples incrementing by b.
# 'a-bxc' becomes 'a a-b a-(2*b) a-(3*b) … a-(c*b)'
# Read this as series starts at a, then c further samples decrementing by b (or incrementing by negative b).
# '_' represents a missing sample from scrape
# 'stale' indicates a stale sample
# Examples:
# 1. '-2+4x3' becomes '-2 2 6 10' - series starts at -2, then 3 further samples incrementing by 4.
# 2. ' 1-2x4' becomes '1 -1 -3 -5 -7' - series starts at 1, then 4 further samples decrementing by 2.
# 3. ' 1x4' becomes '1 1 1 1 1' - shorthand for '1+0x4', series starts at 1, then 4 further samples incrementing by 0.
# 4. ' 1 _x3 stale' becomes '1 _ _ _ stale' - the missing sample cannot increment, so 3 missing samples are produced by the '_x3' expression.
values: <string>
```
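To make the notation above concrete, here is a minimal Go sketch (not vmalert's actual parser; it ignores the `_` and `stale` markers) that expands the `a+bxc` / `a-bxc` shorthand into plain sample values:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// expandSeries is a hypothetical helper which expands the simplified
// "a+bxc" / "a-bxc" / "axc" notation into sample values.
// It does not handle the '_' (missing sample) and 'stale' markers.
func expandSeries(expr string) ([]float64, error) {
	expr = strings.TrimSpace(expr)
	xIdx := strings.LastIndex(expr, "x")
	if xIdx < 0 { // a single static value, e.g. "5"
		v, err := strconv.ParseFloat(expr, 64)
		if err != nil {
			return nil, err
		}
		return []float64{v}, nil
	}
	count, err := strconv.Atoi(expr[xIdx+1:])
	if err != nil {
		return nil, err
	}
	head := expr[:xIdx] // e.g. "-2+4", "1-2" or "1"
	// the delta sign is the last '+' or '-' which isn't the leading sign of the start value
	sepIdx := strings.LastIndexAny(head, "+-")
	start, delta := head, "0"
	if sepIdx > 0 {
		start, delta = head[:sepIdx], head[sepIdx:]
	}
	a, err := strconv.ParseFloat(start, 64)
	if err != nil {
		return nil, err
	}
	b, err := strconv.ParseFloat(delta, 64)
	if err != nil {
		return nil, err
	}
	values := make([]float64, 0, count+1)
	for i := 0; i <= count; i++ {
		values = append(values, a+float64(i)*b)
	}
	return values, nil
}

func main() {
	for _, expr := range []string{"-2+4x3", "1-2x4", "1x4"} {
		values, err := expandSeries(expr)
		fmt.Println(expr, "=>", values, err) // e.g. "-2+4x3 => [-2 2 6 10] <nil>"
	}
}
```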
#### `<alert_test_case>`
vmalert by default adds the `alertgroup` and `alertname` labels to the generated alerts and time series.
So you need to specify both `groupname` and `alertname` under a single `<alert_test_case>`,
but there is no need to add them under `exp_alerts`.
You can also pass `-disableAlertgroupLabel` to prevent vmalert from adding the `alertgroup` label.
```
# The time elapsed from time=0s when this alerting rule should be checked.
# Means this rule should be firing at this point, or shouldn't be firing if 'exp_alerts' is empty.
eval_time: <duration>
# Name of the group to be tested.
groupname: <string>
# Name of the alert to be tested.
alertname: <string>
# List of the expected alerts that are firing under the given alertname at
# the given evaluation time. If you want to test if an alerting rule should
# not be firing, then you can mention only the fields above and leave 'exp_alerts' empty.
exp_alerts:
[ - <alert> ]
```
#### `<alert>`
```
# These are the expanded labels and annotations of the expected alert.
# Note: labels also include the labels of the sample associated with the alert
exp_labels:
[ <labelname>: <string> ]
exp_annotations:
[ <labelname>: <string> ]
```
#### `<metricsql_expr_test>`
```
# Expression to evaluate
expr: <string>
# The time elapsed from time=0s when this expression should be evaluated.
eval_time: <duration>
# Expected samples at the given evaluation time.
exp_samples:
[ - <sample> ]
```
#### `<sample>`
```
# Labels of the sample in usual series notation '<metric name>{<label name>=<label value>, ...}'
# Examples:
# series_name{label1="value1", label2="value2"}
# go_goroutines{job="prometheus", instance="localhost:9090"}
labels: <string>
# The expected value of the MetricsQL expression.
value: <number>
```
### Example
This is an example input file for unit testing which will pass.
`test.yaml` is the test file which follows the syntax above and `rules.yaml` contains the alerting and recording rules.
With `rules.yaml` in the same directory, run `./vmalert -unittestFile=./unittest/testdata/test.yaml`.
#### `test.yaml`
```
rule_files:
- rules.yaml
evaluation_interval: 1m
tests:
- interval: 1m
input_series:
- series: 'up{job="prometheus", instance="localhost:9090"}'
values: "0+0x1440"
metricsql_expr_test:
- expr: suquery_interval_test
eval_time: 4m
exp_samples:
- labels: '{__name__="suquery_interval_test", datacenter="dc-123", instance="localhost:9090", job="prometheus"}'
value: 1
alert_rule_test:
- eval_time: 2h
groupname: group1
alertname: InstanceDown
exp_alerts:
- exp_labels:
job: prometheus
severity: page
instance: localhost:9090
datacenter: dc-123
exp_annotations:
summary: "Instance localhost:9090 down"
description: "localhost:9090 of job prometheus has been down for more than 5 minutes."
- eval_time: 0
groupname: group1
alertname: AlwaysFiring
exp_alerts:
- exp_labels:
datacenter: dc-123
- eval_time: 0
groupname: group1
alertname: InstanceDown
exp_alerts: []
external_labels:
datacenter: dc-123
```
#### `rules.yaml`
```
# This is the rules file.
groups:
- name: group1
rules:
- alert: InstanceDown
expr: up == 0
for: 5m
labels:
severity: page
annotations:
summary: "Instance {{ $labels.instance }} down"
description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes."
- alert: AlwaysFiring
expr: 1
- name: group2
rules:
- record: job:test:count_over_time1m
expr: sum without(instance) (count_over_time(test[1m]))
- record: suquery_interval_test
expr: count_over_time(up[5m:])
```
## Monitoring
`vmalert` exports various metrics in Prometheus exposition format at `http://vmalert-host:8880/metrics` page.
@ -901,7 +1144,7 @@ command-line flags with their descriptions.
The shortlist of configuration flags is the following:
{% raw %}
```
```console
-clusterMode
If clusterMode is enabled, then vmalert automatically adds the tenant specified in config groups to -datasource.url, -remoteWrite.url and -remoteRead.url. See https://docs.victoriametrics.com/vmalert.html#multitenancy . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
-configCheckInterval duration
@ -967,9 +1210,9 @@ The shortlist of configuration flags is the following:
-dryRun
Whether to check only config files without running vmalert. The rule files are validated. The -rule flag must be specified.
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default, only IPv4 TCP and UDP is used
Whether to enable IPv6 for listening and dialing. By default, only IPv4 TCP and UDP are used
-envflag.enable
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
Whether to enable reading flags from environment variables in addition to the command line. Command line flag values have priority over values from environment vars. Flags are read only from the command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
@ -1000,9 +1243,9 @@ The shortlist of configuration flags is the following:
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
Password for HTTP server's Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
Username for HTTP server's Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
Address to listen for http connections. See also -httpListenAddr.useProxyProtocol (default ":8880")
-httpListenAddr.useProxyProtocol
@ -1012,7 +1255,7 @@ The shortlist of configuration flags is the following:
-internStringDisableCache
Whether to disable caches for interned strings. This may reduce memory usage at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringCacheExpireDuration and -internStringMaxLen
-internStringMaxLen int
The maximum length for strings to intern. Lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringDisableCache and -internStringCacheExpireDuration (default 500)
The maximum length for strings to intern. A lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringDisableCache and -internStringCacheExpireDuration (default 500)
-loggerDisableTimestamps
Whether to disable writing timestamps in logs
-loggerErrorsPerSecondLimit int
@ -1030,10 +1273,10 @@ The shortlist of configuration flags is the following:
-loggerWarnsPerSecondLimit int
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-notifier.basicAuth.password array
@ -1088,6 +1331,8 @@ The shortlist of configuration flags is the following:
-notifier.url array
Prometheus Alertmanager URL, e.g. http://127.0.0.1:9093. List all Alertmanager URLs if it runs in the cluster mode to ensure high availability.
Supports an array of values separated by comma or specified via multiple flags.
-notifier.blackhole bool
Whether to blackhole alerting notifications. Enable this flag if you want vmalert to evaluate alerting rules without sending any notifications to external receivers (e.g. Alertmanager). `-notifier.url`, `-notifier.config` and `-notifier.blackhole` are mutually exclusive.
-pprofAuthKey string
Auth key for /debug/pprof/* endpoints. It must be passed via authKey query arg. It overrides httpAuth.* settings
-promscrape.consul.waitTime duration
@ -1280,6 +1525,11 @@ The shortlist of configuration flags is the following:
Path to file with TLS key if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated
-tlsMinVersion string
Optional minimum TLS version to use for incoming requests over HTTPS if -tls is set. Supported values: TLS10, TLS11, TLS12, TLS13
-unittestFile array
Path to the unit test files. When set, vmalert starts in unit test mode and performs only tests on configured files.
Examples:
-unittestFile="./unittest/testdata/test1.yaml,./unittest/testdata/test2.yaml".
See more information here https://docs.victoriametrics.com/vmalert.html#unit-testing-for-rules.
-version
Show VictoriaMetrics version
```
@ -1329,7 +1579,7 @@ The configuration file allows to configure static notifiers, discover notifiers
and [DNS](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dns_sd_config):
For example:
```
```yaml
static_configs:
- targets:
- localhost:9093
@ -1354,7 +1604,7 @@ to ensure [high availability](https://github.com/prometheus/alertmanager#high-av
The configuration file [specification](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/notifier/config.go)
is the following:
```
```yaml
# Per-target Notifier timeout when pushing alerts.
[ timeout: <duration> | default = 10s ]
@ -1473,6 +1723,7 @@ docker push my-repo:my-version-name
```
To run the built image in `victoria-metrics-k8s-stack` or `VMAlert` CR object apply the following config change:
```yaml
kind: VMAlert
spec:
@ -1484,13 +1735,13 @@ spec:
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.
2. Run `make vmalert` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
1. Run `make vmalert` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmalert` binary and puts it into the `bin` folder.
### Production build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make vmalert-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
1. Run `make vmalert-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmalert-prod` binary and puts it into the `bin` folder.
### ARM build
@ -1500,11 +1751,11 @@ ARM build may run on Raspberry Pi or on [energy-efficient ARM servers](https://b
### Development ARM build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.
2. Run `make vmalert-linux-arm` or `make vmalert-linux-arm64` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
1. Run `make vmalert-linux-arm` or `make vmalert-linux-arm64` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmalert-linux-arm` or `vmalert-linux-arm64` binary respectively and puts it into the `bin` folder.
### Production ARM build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make vmalert-linux-arm-prod` or `make vmalert-linux-arm64-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
1. Run `make vmalert-linux-arm-prod` or `make vmalert-linux-arm64-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmalert-linux-arm-prod` or `vmalert-linux-arm64-prod` binary respectively and puts it into the `bin` folder.

View file

@ -866,8 +866,8 @@ func TestAlertingRule_Template(t *testing.T) {
for hash, expAlert := range tc.expAlerts {
gotAlert := tc.rule.alerts[hash]
if gotAlert == nil {
t.Fatalf("alert %d is missing; labels: %v; annotations: %v",
hash, expAlert.Labels, expAlert.Annotations)
t.Fatalf("alert %d is missing; labels: %v; annotations: %v", hash, expAlert.Labels, expAlert.Annotations)
break
}
if !reflect.DeepEqual(expAlert.Annotations, gotAlert.Annotations) {
t.Fatalf("expected to have annotations %#v; got %#v", expAlert.Annotations, gotAlert.Annotations)

View file

@ -1,9 +1,12 @@
package datasource
import (
"bytes"
"context"
"net/http"
"net/url"
"sort"
"strconv"
"time"
)
@ -108,3 +111,69 @@ type Label struct {
Name string
Value string
}
// Labels is a collection of Label entries
type Labels []Label
func (ls Labels) Len() int { return len(ls) }
func (ls Labels) Swap(i, j int) { ls[i], ls[j] = ls[j], ls[i] }
func (ls Labels) Less(i, j int) bool { return ls[i].Name < ls[j].Name }
func (ls Labels) String() string {
var b bytes.Buffer
b.WriteByte('{')
for i, l := range ls {
if i > 0 {
b.WriteByte(',')
b.WriteByte(' ')
}
b.WriteString(l.Name)
b.WriteByte('=')
b.WriteString(strconv.Quote(l.Value))
}
b.WriteByte('}')
return b.String()
}
// LabelCompare returns a negative value if a is less than b, 0 if they are equal,
// and a positive value otherwise. For example:
// a=[]Label{{Name: "a", Value: "1"}},b=[]Label{{Name: "b", Value: "1"}}, return -1
// a=[]Label{{Name: "a", Value: "2"}},b=[]Label{{Name: "a", Value: "1"}}, return 1
// a=[]Label{{Name: "a", Value: "1"}},b=[]Label{{Name: "a", Value: "1"}}, return 0
func LabelCompare(a, b Labels) int {
l := len(a)
if len(b) < l {
l = len(b)
}
for i := 0; i < l; i++ {
if a[i].Name != b[i].Name {
if a[i].Name < b[i].Name {
return -1
}
return 1
}
if a[i].Value != b[i].Value {
if a[i].Value < b[i].Value {
return -1
}
return 1
}
}
// if all labels so far were in common, the set with fewer labels comes first.
return len(a) - len(b)
}
// ConvertToLabels converts a map to Labels
func ConvertToLabels(m map[string]string) (labelset Labels) {
for k, v := range m {
labelset = append(labelset, Label{
Name: k,
Value: v,
})
}
// sort label
sort.Slice(labelset, func(i, j int) bool { return labelset[i].Name < labelset[j].Name })
return
}
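A minimal usage sketch for the helpers above (assuming the `app/vmalert/datasource` import path of this repository):

```go
package main

import (
	"fmt"

	"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
)

func main() {
	a := datasource.ConvertToLabels(map[string]string{
		"job":      "prometheus",
		"instance": "localhost:9090",
	})
	b := datasource.ConvertToLabels(map[string]string{"job": "prometheus"})

	// prints {instance="localhost:9090", job="prometheus"} - labels are sorted by name
	fmt.Println(a.String())
	// prints a negative value: "instance" sorts before "job" at the first position
	fmt.Println(datasource.LabelCompare(a, b))
}
```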

View file

@ -54,6 +54,16 @@ func (s *VMStorage) setGraphiteReqParams(r *http.Request, query string, timestam
}
r.URL.Path += graphitePath
q := r.URL.Query()
from := "-5min"
if s.lookBack > 0 {
lookBack := timestamp.Add(-s.lookBack)
from = strconv.FormatInt(lookBack.Unix(), 10)
}
q.Set("from", from)
q.Set("format", "json")
q.Set("target", query)
q.Set("until", "now")
for k, vs := range s.extraParams {
if q.Has(k) { // extraParams are prior to params in URL
q.Del(k)
@ -62,14 +72,6 @@ func (s *VMStorage) setGraphiteReqParams(r *http.Request, query string, timestam
q.Add(k, v)
}
}
q.Set("format", "json")
q.Set("target", query)
from := "-5min"
if s.lookBack > 0 {
lookBack := timestamp.Add(-s.lookBack)
from = strconv.FormatInt(lookBack.Unix(), 10)
}
q.Set("from", from)
q.Set("until", "now")
r.URL.RawQuery = q.Encode()
}

View file

@ -168,7 +168,7 @@ func (s *VMStorage) setPrometheusInstantReqParams(r *http.Request, query string,
// see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1232
timestamp = timestamp.Truncate(s.evaluationInterval)
}
q.Set("time", fmt.Sprintf("%d", timestamp.Unix()))
q.Set("time", timestamp.Format(time.RFC3339))
if !*disableStepParam && s.evaluationInterval > 0 { // set step as evaluationInterval by default
// always convert to seconds to keep compatibility with older
// Prometheus versions. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1943
@ -191,8 +191,8 @@ func (s *VMStorage) setPrometheusRangeReqParams(r *http.Request, query string, s
r.URL.Path += "/api/v1/query_range"
}
q := r.URL.Query()
q.Add("start", fmt.Sprintf("%d", start.Unix()))
q.Add("end", fmt.Sprintf("%d", end.Unix()))
q.Add("start", start.Format(time.RFC3339))
q.Add("end", end.Format(time.RFC3339))
if s.evaluationInterval > 0 { // set step as evaluationInterval by default
// always convert to seconds to keep compatibility with older
// Prometheus versions. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1943
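The change above switches the `time`, `start` and `end` query params from Unix seconds to RFC3339. The Prometheus querying API accepts both formats; the sketch below only illustrates the difference between the two representations:

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	ts := time.Date(2023, 7, 26, 14, 59, 49, 0, time.UTC)
	fmt.Println(ts.Unix())               // seconds since the Unix epoch (old param format)
	fmt.Println(ts.Format(time.RFC3339)) // "2023-07-26T14:59:49Z" (new param format)
}
```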

View file

@ -8,7 +8,6 @@ import (
"net/url"
"reflect"
"sort"
"strconv"
"strings"
"testing"
"time"
@ -50,8 +49,8 @@ func TestVMInstantQuery(t *testing.T) {
if timeParam == "" {
t.Errorf("expected 'time' in query param, got nil instead")
}
if _, err := strconv.ParseInt(timeParam, 10, 64); err != nil {
t.Errorf("failed to parse 'time' query param: %s", err)
if _, err := time.Parse(time.RFC3339, timeParam); err != nil {
t.Errorf("failed to parse 'time' query param %q: %s", timeParam, err)
}
switch c {
case 0:
@ -193,7 +192,6 @@ func TestVMInstantQuery(t *testing.T) {
},
}
metricsEqual(t, res.Data, exp)
}
func TestVMInstantQueryWithRetry(t *testing.T) {
@ -309,14 +307,14 @@ func TestVMRangeQuery(t *testing.T) {
if startTS == "" {
t.Errorf("expected 'start' in query param, got nil instead")
}
if _, err := strconv.ParseInt(startTS, 10, 64); err != nil {
if _, err := time.Parse(time.RFC3339, startTS); err != nil {
t.Errorf("failed to parse 'start' query param: %s", err)
}
endTS := r.URL.Query().Get("end")
if endTS == "" {
t.Errorf("expected 'end' in query param, got nil instead")
}
if _, err := strconv.ParseInt(endTS, 10, 64); err != nil {
if _, err := time.Parse(time.RFC3339, endTS); err != nil {
t.Errorf("failed to parse 'end' query param: %s", err)
}
step := r.URL.Query().Get("step")
@ -455,8 +453,8 @@ func TestRequestParams(t *testing.T) {
false,
&VMStorage{},
func(t *testing.T, r *http.Request) {
exp := fmt.Sprintf("query=%s&time=%d", query, timestamp.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
exp := url.Values{"query": {query}, "time": {timestamp.Format(time.RFC3339)}}
checkEqualString(t, exp.Encode(), r.URL.RawQuery)
},
},
{
@ -464,8 +462,9 @@ func TestRequestParams(t *testing.T) {
true,
&VMStorage{},
func(t *testing.T, r *http.Request) {
exp := fmt.Sprintf("end=%d&query=%s&start=%d", timestamp.Unix(), query, timestamp.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
ts := timestamp.Format(time.RFC3339)
exp := url.Values{"query": {query}, "start": {ts}, "end": {ts}}
checkEqualString(t, exp.Encode(), r.URL.RawQuery)
},
},
{
@ -495,8 +494,8 @@ func TestRequestParams(t *testing.T) {
lookBack: time.Minute,
},
func(t *testing.T, r *http.Request) {
exp := fmt.Sprintf("query=%s&time=%d", query, timestamp.Add(-time.Minute).Unix())
checkEqualString(t, exp, r.URL.RawQuery)
exp := url.Values{"query": {query}, "time": {timestamp.Add(-time.Minute).Format(time.RFC3339)}}
checkEqualString(t, exp.Encode(), r.URL.RawQuery)
},
},
{
@ -508,8 +507,8 @@ func TestRequestParams(t *testing.T) {
func(t *testing.T, r *http.Request) {
evalInterval := 15 * time.Second
tt := timestamp.Truncate(evalInterval)
exp := fmt.Sprintf("query=%s&step=%v&time=%d", query, evalInterval, tt.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
exp := url.Values{"query": {query}, "step": {evalInterval.String()}, "time": {tt.Format(time.RFC3339)}}
checkEqualString(t, exp.Encode(), r.URL.RawQuery)
},
},
{
@ -523,8 +522,8 @@ func TestRequestParams(t *testing.T) {
evalInterval := 15 * time.Second
tt := timestamp.Add(-time.Minute)
tt = tt.Truncate(evalInterval)
exp := fmt.Sprintf("query=%s&step=%v&time=%d", query, evalInterval, tt.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
exp := url.Values{"query": {query}, "step": {evalInterval.String()}, "time": {tt.Format(time.RFC3339)}}
checkEqualString(t, exp.Encode(), r.URL.RawQuery)
},
},
{
@ -534,8 +533,12 @@ func TestRequestParams(t *testing.T) {
queryStep: time.Minute,
},
func(t *testing.T, r *http.Request) {
exp := fmt.Sprintf("query=%s&step=%ds&time=%d", query, int(time.Minute.Seconds()), timestamp.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
exp := url.Values{
"query": {query},
"step": {fmt.Sprintf("%ds", int(time.Minute.Seconds()))},
"time": {timestamp.Format(time.RFC3339)},
}
checkEqualString(t, exp.Encode(), r.URL.RawQuery)
},
},
{
@ -547,8 +550,8 @@ func TestRequestParams(t *testing.T) {
func(t *testing.T, r *http.Request) {
evalInterval := 3 * time.Hour
tt := timestamp.Truncate(evalInterval)
exp := fmt.Sprintf("query=%s&step=%ds&time=%d", query, int(evalInterval.Seconds()), tt.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
exp := url.Values{"query": {query}, "step": {fmt.Sprintf("%ds", int(evalInterval.Seconds()))}, "time": {tt.Format(time.RFC3339)}}
checkEqualString(t, exp.Encode(), r.URL.RawQuery)
},
},
{
@ -558,8 +561,8 @@ func TestRequestParams(t *testing.T) {
extraParams: url.Values{"round_digits": {"10"}},
},
func(t *testing.T, r *http.Request) {
exp := fmt.Sprintf("query=%s&round_digits=10&time=%d", query, timestamp.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
exp := url.Values{"query": {query}, "round_digits": {"10"}, "time": {timestamp.Format(time.RFC3339)}}
checkEqualString(t, exp.Encode(), r.URL.RawQuery)
},
},
{
@ -572,9 +575,14 @@ func TestRequestParams(t *testing.T) {
},
},
func(t *testing.T, r *http.Request) {
exp := fmt.Sprintf("end=%d&max_lookback=1h&nocache=1&query=%s&start=%d",
timestamp.Unix(), query, timestamp.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
exp := url.Values{
"query": {query},
"end": {timestamp.Format(time.RFC3339)},
"start": {timestamp.Format(time.RFC3339)},
"nocache": {"1"},
"max_lookback": {"1h"},
}
checkEqualString(t, exp.Encode(), r.URL.RawQuery)
},
},
{
@ -584,8 +592,8 @@ func TestRequestParams(t *testing.T) {
QueryParams: url.Values{"round_digits": {"2"}},
}),
func(t *testing.T, r *http.Request) {
exp := fmt.Sprintf("query=%s&round_digits=2&time=%d", query, timestamp.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
exp := url.Values{"query": {query}, "round_digits": {"2"}, "time": {timestamp.Format(time.RFC3339)}}
checkEqualString(t, exp.Encode(), r.URL.RawQuery)
},
},
{
@ -603,6 +611,20 @@ func TestRequestParams(t *testing.T) {
checkEqualString(t, exp, r.URL.RawQuery)
},
},
{
"graphite extra params allows to override from",
false,
&VMStorage{
dataSourceType: datasourceGraphite,
extraParams: url.Values{
"from": {"-10m"},
},
},
func(t *testing.T, r *http.Request) {
exp := fmt.Sprintf("format=json&from=-10m&target=%s&until=now", query)
checkEqualString(t, exp, r.URL.RawQuery)
},
},
}
for _, tc := range testCases {
@ -627,7 +649,7 @@ func TestRequestParams(t *testing.T) {
}
func TestHeaders(t *testing.T) {
var testCases = []struct {
testCases := []struct {
name string
vmFn func() *VMStorage
checkFn func(t *testing.T, r *http.Request)
@ -692,7 +714,8 @@ func TestHeaders(t *testing.T) {
authCfg: cfg,
extraHeaders: []keyValue{
{key: "Authorization", value: "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=="},
}}
},
}
},
checkFn: func(t *testing.T, r *http.Request) {
u, p, _ := r.BasicAuth()

View file

@ -274,7 +274,7 @@ func (g *Group) close() {
var skipRandSleepOnGroupStart bool
func (g *Group) start(ctx context.Context, nts func() []notifier.Notifier, rw *remotewrite.Client, rr datasource.QuerierBuilder) {
func (g *Group) start(ctx context.Context, nts func() []notifier.Notifier, rw remotewrite.RWClient, rr datasource.QuerierBuilder) {
defer func() { close(g.finishedCh) }()
// Spread group rules evaluation over time in order to reduce load on VictoriaMetrics.
@ -384,11 +384,17 @@ func (g *Group) start(ctx context.Context, nts func() []notifier.Notifier, rw *r
g.Interval = ng.Interval
t.Stop()
t = time.NewTicker(g.Interval)
evalTS = time.Now()
}
g.mu.Unlock()
logger.Infof("group %q re-started; interval=%v; concurrency=%d", g.Name, g.Interval, g.Concurrency)
case <-t.C:
missed := (time.Since(evalTS) / g.Interval) - 1
if missed < 0 {
// missed can become < 0 due to irregular delays during evaluation
// which can result in time.Since(evalTS) < g.Interval
missed = 0
}
if missed > 0 {
g.metrics.iterationMissed.Inc()
}
@ -416,7 +422,7 @@ type executor struct {
notifiers func() []notifier.Notifier
notifierHeaders map[string]string
rw *remotewrite.Client
rw remotewrite.RWClient
previouslySentSeriesToRWMu sync.Mutex
// previouslySentSeriesToRW stores series sent to RW on previous iteration

View file

@ -91,7 +91,12 @@ absolute path to all .tpl files in root.
disableAlertGroupLabel = flag.Bool("disableAlertgroupLabel", false, "Whether to disable adding group's Name as label to generated alerts and time series.")
dryRun = flag.Bool("dryRun", false, "Whether to check only config files without running vmalert. The rules file are validated. The -rule flag must be specified.")
dryRun = flag.Bool("dryRun", false, "Whether to check only config files without running vmalert. The rules file are validated. The -rule flag must be specified.")
unitTestFiles = flagutil.NewArrayString("unittestFile", `Path to the unit test files. When set, vmalert starts in unit test mode and performs only tests on configured files.
Examples:
-unittestFile="./unittest/testdata/test1.yaml,./unittest/testdata/test2.yaml".
See more information here https://docs.victoriametrics.com/vmalert.html#unit-testing-for-rules.
`)
)
var alertURLGeneratorFn notifier.AlertURLGenerator
@ -117,6 +122,13 @@ func main() {
logger.Fatalf("failed to parse %q: %s", *ruleTemplatesPath, err)
}
if len(*unitTestFiles) > 0 {
if unitRule(*unitTestFiles...) {
os.Exit(1)
}
os.Exit(0)
}
if *dryRun {
groups, err := config.Parse(*rulePath, notifier.ValidateTemplates, true)
if err != nil {
@ -399,7 +411,7 @@ func configsEqual(a, b []config.Group) bool {
// setConfigSuccess sets config reload status to 1.
func setConfigSuccess(at uint64) {
configSuccess.Set(1)
configTimestamp.Set(fasttime.UnixTimestamp())
configTimestamp.Set(at)
// reset the error if any
setConfigErr(nil)
}

View file

@ -19,7 +19,7 @@ type manager struct {
querierBuilder datasource.QuerierBuilder
notifiers func() []notifier.Notifier
rw *remotewrite.Client
rw remotewrite.RWClient
// remote read builder.
rr datasource.QuerierBuilder
@ -120,7 +120,7 @@ func (m *manager) update(ctx context.Context, groupsCfg []config.Group, restore
return fmt.Errorf("config contains recording rules but `-remoteWrite.url` isn't set")
}
if arPresent && m.notifiers == nil {
return fmt.Errorf("config contains alerting rules but neither `-notifier.url` nor `-notifier.config` aren't set")
return fmt.Errorf("config contains alerting rules but neither `-notifier.url` nor `-notifier.config` nor `-notifier.blackhole` aren't set")
}
type updateItem struct {

View file

@ -257,7 +257,7 @@ func TestManagerUpdate(t *testing.T) {
func TestManagerUpdateNegative(t *testing.T) {
testCases := []struct {
notifiers []notifier.Notifier
rw *remotewrite.Client
rw remotewrite.RWClient
cfg config.Group
expErr string
}{
@ -340,3 +340,21 @@ func loadCfg(t *testing.T, path []string, validateAnnotations, validateExpressio
}
return cfg
}
func TestUrlValuesToStrings(t *testing.T) {
mapQueryParams := map[string][]string{
"param1": {"param1"},
"param2": {"anotherparam"},
}
expectedRes := []string{"param1=param1", "param2=anotherparam"}
res := urlValuesToStrings(mapQueryParams)
if len(res) != len(expectedRes) {
t.Errorf("Expected length %d, but got %d", len(expectedRes), len(res))
}
for ind, val := range expectedRes {
if val != res[ind] {
t.Errorf("Expected %v; but got %v", val, res[ind])
}
}
}

View file

@ -19,6 +19,9 @@ var (
addrs = flagutil.NewArrayString("notifier.url", "Prometheus Alertmanager URL, e.g. http://127.0.0.1:9093. "+
"List all Alertmanager URLs if it runs in the cluster mode to ensure high availability.")
blackHole = flag.Bool("notifier.blackhole", false, "Whether to blackhole alerting notifications. "+
"Enable this flag if you want vmalert to evaluate alerting rules without sending any notifications to external receivers (eg. alertmanager). "+
"`-notifier.url`, `-notifier.config` and `-notifier.blackhole` are mutually exclusive.")
basicAuthUsername = flagutil.NewArrayString("notifier.basicAuth.username", "Optional basic auth username for -notifier.url")
basicAuthPassword = flagutil.NewArrayString("notifier.basicAuth.password", "Optional basic auth password for -notifier.url")
@ -90,6 +93,17 @@ func Init(gen AlertURLGenerator, extLabels map[string]string, extURL string) (fu
templates.UpdateWithFuncs(templates.FuncsWithExternalURL(eu))
if *blackHole {
if len(*addrs) > 0 || *configPath != "" {
return nil, fmt.Errorf("only one of -notifier.blackhole, -notifier.url and -notifier.config flags must be specified")
}
staticNotifiersFn = func() []Notifier {
return []Notifier{newBlackHoleNotifier()}
}
return staticNotifiersFn, nil
}
if *configPath == "" && len(*addrs) == 0 {
return nil, nil
}

View file

@ -1,8 +1,9 @@
package notifier
import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
)
func TestInit(t *testing.T) {
@ -35,3 +36,58 @@ func TestInit(t *testing.T) {
t.Fatalf("expected to get \"127.0.0.2/api/v2/alerts\"; got %q instead", nf2.Addr())
}
}
func TestInitNegative(t *testing.T) {
oldConfigPath := *configPath
oldAddrs := *addrs
oldBlackHole := *blackHole
defer func() {
*configPath = oldConfigPath
*addrs = oldAddrs
*blackHole = oldBlackHole
}()
f := func(path, addr string, bh bool) {
*configPath = path
*addrs = flagutil.ArrayString{addr}
*blackHole = bh
if _, err := Init(nil, nil, ""); err == nil {
t.Fatalf("expected to get error; got nil instead")
}
}
// *configPath, *addrs and *blackhole are mutually exclusive
f("/dummy/path", "127.0.0.1", false)
f("/dummy/path", "", true)
f("", "127.0.0.1", true)
}
func TestBlackHole(t *testing.T) {
oldBlackHole := *blackHole
defer func() { *blackHole = oldBlackHole }()
*blackHole = true
fn, err := Init(nil, nil, "")
if err != nil {
t.Fatalf("%s", err)
}
nfs := fn()
if len(nfs) != 1 {
t.Fatalf("expected to get 1 notifier; got %d", len(nfs))
}
targets := GetTargets()
if targets == nil || targets[TargetStatic] == nil {
t.Fatalf("expected to get static targets in response")
}
if len(targets[TargetStatic]) != 1 {
t.Fatalf("expected to get 1 static targets in response; but got %d", len(targets[TargetStatic]))
}
nf1 := targets[TargetStatic][0]
if nf1.Addr() != "blackhole" {
t.Fatalf("expected to get \"blackhole\"; got %q instead", nf1.Addr())
}
}

View file

@ -0,0 +1,36 @@
package notifier
import "context"
// blackHoleNotifier is a Notifier stub, used when no notifications need
// to be sent.
type blackHoleNotifier struct {
addr string
metrics *metrics
}
// Send sends no notifications but increases the alertsSent metric.
func (bh *blackHoleNotifier) Send(_ context.Context, alerts []Alert, _ map[string]string) error { //nolint:revive
bh.metrics.alertsSent.Add(len(alerts))
return nil
}
// Addr of black hole notifier
func (bh blackHoleNotifier) Addr() string {
return bh.addr
}
// Close unregisters the metrics
func (bh *blackHoleNotifier) Close() {
bh.metrics.alertsSent.Unregister()
bh.metrics.alertsSendErrors.Unregister()
}
// newBlackHoleNotifier creates a new blackHoleNotifier
func newBlackHoleNotifier() *blackHoleNotifier {
address := "blackhole"
return &blackHoleNotifier{
addr: address,
metrics: newMetrics(address),
}
}

View file

@ -0,0 +1,50 @@
package notifier
import (
"context"
"testing"
"time"
metricset "github.com/VictoriaMetrics/metrics"
)
func TestBlackHoleNotifier_Send(t *testing.T) {
bh := newBlackHoleNotifier()
if err := bh.Send(context.Background(), []Alert{{
GroupID: 0,
Name: "alert0",
Start: time.Now().UTC(),
End: time.Now().UTC(),
Annotations: map[string]string{"a": "b", "c": "d", "e": "f"},
}}, nil); err != nil {
t.Errorf("unexpected error %s", err)
}
alertCount := bh.metrics.alertsSent.Get()
if alertCount != 1 {
t.Errorf("expect value 1; instead got %d", alertCount)
}
}
func TestBlackHoleNotifier_Close(t *testing.T) {
bh := newBlackHoleNotifier()
if err := bh.Send(context.Background(), []Alert{{
GroupID: 0,
Name: "alert0",
Start: time.Now().UTC(),
End: time.Now().UTC(),
Annotations: map[string]string{"a": "b", "c": "d", "e": "f"},
}}, nil); err != nil {
t.Errorf("unexpected error %s", err)
}
bh.Close()
defaultMetrics := metricset.GetDefaultSet()
alertMetricName := "vmalert_alerts_sent_total{addr=\"blackhole\"}"
for _, name := range defaultMetrics.ListMetricNames() {
if name == alertMetricName {
t.Errorf("Metric name should have unregistered.But still present")
}
}
}

View file

@ -0,0 +1,320 @@
package remotewrite
import (
"bytes"
"context"
"flag"
"fmt"
"io"
"net/http"
"path"
"strings"
"sync"
"time"
"github.com/golang/snappy"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/metrics"
)
var (
disablePathAppend = flag.Bool("remoteWrite.disablePathAppend", false, "Whether to disable automatic appending of '/api/v1/write' path to the configured -remoteWrite.url.")
sendTimeout = flag.Duration("remoteWrite.sendTimeout", 30*time.Second, "Timeout for sending data to the configured -remoteWrite.url.")
retryMinInterval = flag.Duration("remoteWrite.retryMinInterval", time.Second, "The minimum delay between retry attempts. Every next retry attempt will double the delay to prevent hammering of remote database. See also -remoteWrite.retryMaxInterval")
retryMaxTime = flag.Duration("remoteWrite.retryMaxTime", time.Second*30, "The max time spent on retry attempts for the failed remote-write request. Change this value if it is expected for remoteWrite.url to be unreachable for more than -remoteWrite.retryMaxTime. See also -remoteWrite.retryMinInterval")
)
// Client is an asynchronous HTTP client for writing
// timeseries via remote write protocol.
type Client struct {
addr string
c *http.Client
authCfg *promauth.Config
input chan prompbmarshal.TimeSeries
flushInterval time.Duration
maxBatchSize int
maxQueueSize int
wg sync.WaitGroup
doneCh chan struct{}
}
// Config is config for remote write.
type Config struct {
// Addr of remote storage
Addr string
AuthCfg *promauth.Config
// Concurrency defines number of readers that
// concurrently read from the queue and flush data
Concurrency int
// MaxBatchSize defines max number of timeseries
// to be flushed at once
MaxBatchSize int
// MaxQueueSize defines max length of input queue
// populated by Push method.
// Push will be rejected once queue is full.
MaxQueueSize int
// FlushInterval defines time interval for flushing batches
FlushInterval time.Duration
// Transport will be used by the underlying http.Client
Transport *http.Transport
}
const (
defaultConcurrency = 4
defaultMaxBatchSize = 1e3
defaultMaxQueueSize = 1e5
defaultFlushInterval = 5 * time.Second
defaultWriteTimeout = 30 * time.Second
)
// NewClient returns asynchronous client for
// writing timeseries via remotewrite protocol.
func NewClient(ctx context.Context, cfg Config) (*Client, error) {
if cfg.Addr == "" {
return nil, fmt.Errorf("config.Addr can't be empty")
}
if cfg.MaxBatchSize == 0 {
cfg.MaxBatchSize = defaultMaxBatchSize
}
if cfg.MaxQueueSize == 0 {
cfg.MaxQueueSize = defaultMaxQueueSize
}
if cfg.FlushInterval == 0 {
cfg.FlushInterval = defaultFlushInterval
}
if cfg.Transport == nil {
cfg.Transport = http.DefaultTransport.(*http.Transport).Clone()
}
cc := defaultConcurrency
if cfg.Concurrency > 0 {
cc = cfg.Concurrency
}
c := &Client{
c: &http.Client{
Timeout: *sendTimeout,
Transport: cfg.Transport,
},
addr: strings.TrimSuffix(cfg.Addr, "/"),
authCfg: cfg.AuthCfg,
flushInterval: cfg.FlushInterval,
maxBatchSize: cfg.MaxBatchSize,
maxQueueSize: cfg.MaxQueueSize,
doneCh: make(chan struct{}),
input: make(chan prompbmarshal.TimeSeries, cfg.MaxQueueSize),
}
for i := 0; i < cc; i++ {
c.run(ctx)
}
return c, nil
}
// Push adds timeseries into queue for writing into remote storage.
// Push returns an error if the client is stopped or if the queue is full.
func (c *Client) Push(s prompbmarshal.TimeSeries) error {
select {
case <-c.doneCh:
return fmt.Errorf("client is closed")
case c.input <- s:
return nil
default:
return fmt.Errorf("failed to push timeseries - queue is full (%d entries). "+
"Queue size is controlled by -remoteWrite.maxQueueSize flag",
c.maxQueueSize)
}
}
// Close stops the client and waits for all goroutines
// to exit.
func (c *Client) Close() error {
if c.doneCh == nil {
return fmt.Errorf("client is already closed")
}
close(c.input)
close(c.doneCh)
c.wg.Wait()
return nil
}
func (c *Client) run(ctx context.Context) {
ticker := time.NewTicker(c.flushInterval)
wr := &prompbmarshal.WriteRequest{}
shutdown := func() {
for ts := range c.input {
wr.Timeseries = append(wr.Timeseries, ts)
}
lastCtx, cancel := context.WithTimeout(context.Background(), defaultWriteTimeout)
logger.Infof("shutting down remote write client and flushing remained %d series", len(wr.Timeseries))
c.flush(lastCtx, wr)
cancel()
}
c.wg.Add(1)
go func() {
defer c.wg.Done()
defer ticker.Stop()
for {
select {
case <-c.doneCh:
shutdown()
return
case <-ctx.Done():
shutdown()
return
case <-ticker.C:
c.flush(ctx, wr)
case ts, ok := <-c.input:
if !ok {
continue
}
wr.Timeseries = append(wr.Timeseries, ts)
if len(wr.Timeseries) >= c.maxBatchSize {
c.flush(ctx, wr)
}
}
}
}()
}
var (
sentRows = metrics.NewCounter(`vmalert_remotewrite_sent_rows_total`)
sentBytes = metrics.NewCounter(`vmalert_remotewrite_sent_bytes_total`)
sendDuration = metrics.NewFloatCounter(`vmalert_remotewrite_send_duration_seconds_total`)
droppedRows = metrics.NewCounter(`vmalert_remotewrite_dropped_rows_total`)
droppedBytes = metrics.NewCounter(`vmalert_remotewrite_dropped_bytes_total`)
bufferFlushDuration = metrics.NewHistogram(`vmalert_remotewrite_flush_duration_seconds`)
_ = metrics.NewGauge(`vmalert_remotewrite_concurrency`, func() float64 {
return float64(*concurrency)
})
)
// flush is a blocking function that marshals WriteRequest and sends
// it to the remote-write endpoint. Flush performs a limited number of retries
// if the request fails.
func (c *Client) flush(ctx context.Context, wr *prompbmarshal.WriteRequest) {
if len(wr.Timeseries) < 1 {
return
}
defer prompbmarshal.ResetWriteRequest(wr)
defer bufferFlushDuration.UpdateDuration(time.Now())
data, err := wr.Marshal()
if err != nil {
logger.Errorf("failed to marshal WriteRequest: %s", err)
return
}
b := snappy.Encode(nil, data)
retryInterval, maxRetryInterval := *retryMinInterval, *retryMaxTime
if retryInterval > maxRetryInterval {
retryInterval = maxRetryInterval
}
timeStart := time.Now()
defer sendDuration.Add(time.Since(timeStart).Seconds())
L:
for attempts := 0; ; attempts++ {
err := c.send(ctx, b)
if err == nil {
sentRows.Add(len(wr.Timeseries))
sentBytes.Add(len(b))
return
}
_, isNotRetriable := err.(*nonRetriableError)
logger.Warnf("attempt %d to send request failed: %s (retriable: %v)", attempts+1, err, !isNotRetriable)
if isNotRetriable {
// exit fast if error isn't retriable
break
}
// check if request has been cancelled before backoff
select {
case <-ctx.Done():
logger.Errorf("interrupting retry attempt %d: context cancelled", attempts+1)
break L
default:
}
timeLeftForRetries := maxRetryInterval - time.Since(timeStart)
if timeLeftForRetries < 0 {
// the max retry time has passed, so we give up
break
}
if retryInterval > timeLeftForRetries {
retryInterval = timeLeftForRetries
}
// sleeping to prevent remote db hammering
time.Sleep(retryInterval)
retryInterval *= 2
}
droppedRows.Add(len(wr.Timeseries))
droppedBytes.Add(len(b))
logger.Errorf("attempts to send remote-write request failed - dropping %d time series",
len(wr.Timeseries))
}
func (c *Client) send(ctx context.Context, data []byte) error {
r := bytes.NewReader(data)
req, err := http.NewRequest(http.MethodPost, c.addr, r)
if err != nil {
return fmt.Errorf("failed to create new HTTP request: %w", err)
}
// RFC standard compliant headers
req.Header.Set("Content-Encoding", "snappy")
req.Header.Set("Content-Type", "application/x-protobuf")
// Prometheus compliant headers
req.Header.Set("X-Prometheus-Remote-Write-Version", "0.1.0")
if c.authCfg != nil {
c.authCfg.SetHeaders(req, true)
}
if !*disablePathAppend {
req.URL.Path = path.Join(req.URL.Path, "/api/v1/write")
}
resp, err := c.c.Do(req.WithContext(ctx))
if err != nil {
return fmt.Errorf("error while sending request to %s: %w; Data len %d(%d)",
req.URL.Redacted(), err, len(data), r.Size())
}
defer func() { _ = resp.Body.Close() }()
body, _ := io.ReadAll(resp.Body)
// according to https://prometheus.io/docs/concepts/remote_write_spec/
// Prometheus remote Write compatible receivers MUST
switch resp.StatusCode / 100 {
case 2:
// respond with a HTTP 2xx status code when the write is successful.
return nil
case 4:
if resp.StatusCode != http.StatusTooManyRequests {
// MUST NOT retry write requests on HTTP 4xx responses other than 429
return &nonRetriableError{fmt.Errorf("unexpected response code %d for %s. Response body %q",
resp.StatusCode, req.URL.Redacted(), body)}
}
fallthrough
default:
return fmt.Errorf("unexpected response code %d for %s. Response body %q",
resp.StatusCode, req.URL.Redacted(), body)
}
}
type nonRetriableError struct {
err error
}
func (e *nonRetriableError) Error() string {
return e.err.Error()
}

View file

@ -0,0 +1,97 @@
package remotewrite
import (
"bytes"
"fmt"
"io"
"net/http"
"path"
"strings"
"sync"
"github.com/golang/snappy"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
)
// DebugClient won't push series periodically, but will write data to remote endpoint
// immediately when Push() is called
type DebugClient struct {
addr string
c *http.Client
wg sync.WaitGroup
}
// NewDebugClient initiates and returns a new DebugClient
func NewDebugClient() (*DebugClient, error) {
if *addr == "" {
return nil, nil
}
t, err := utils.Transport(*addr, *tlsCertFile, *tlsKeyFile, *tlsCAFile, *tlsServerName, *tlsInsecureSkipVerify)
if err != nil {
return nil, fmt.Errorf("failed to create transport: %w", err)
}
c := &DebugClient{
c: &http.Client{
Timeout: *sendTimeout,
Transport: t,
},
addr: strings.TrimSuffix(*addr, "/"),
}
return c, nil
}
// Push sends the given timeseries to the remote storage.
func (c *DebugClient) Push(s prompbmarshal.TimeSeries) error {
c.wg.Add(1)
defer c.wg.Done()
wr := &prompbmarshal.WriteRequest{Timeseries: []prompbmarshal.TimeSeries{s}}
data, err := wr.Marshal()
if err != nil {
return fmt.Errorf("failed to marshal the given time series: %w", err)
}
return c.send(data)
}
// Close stops the DebugClient
func (c *DebugClient) Close() error {
c.wg.Wait()
return nil
}
func (c *DebugClient) send(data []byte) error {
b := snappy.Encode(nil, data)
r := bytes.NewReader(b)
req, err := http.NewRequest(http.MethodPost, c.addr, r)
if err != nil {
return fmt.Errorf("failed to create new HTTP request: %w", err)
}
// RFC standard compliant headers
req.Header.Set("Content-Encoding", "snappy")
req.Header.Set("Content-Type", "application/x-protobuf")
// Prometheus compliant headers
req.Header.Set("X-Prometheus-Remote-Write-Version", "0.1.0")
if !*disablePathAppend {
req.URL.Path = path.Join(req.URL.Path, "/api/v1/write")
}
resp, err := c.c.Do(req)
if err != nil {
return fmt.Errorf("error while sending request to %s: %w; Data len %d(%d)",
req.URL.Redacted(), err, len(data), r.Size())
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode/100 == 2 {
return nil
}
body, _ := io.ReadAll(resp.Body)
return fmt.Errorf("unexpected response code %d for %s. Response body %q",
resp.StatusCode, req.URL.Redacted(), body)
}

View file

@ -0,0 +1,50 @@
package remotewrite
import (
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
)
func TestDebugClient_Push(t *testing.T) {
testSrv := newRWServer()
oldAddr := *addr
*addr = testSrv.URL
defer func() {
*addr = oldAddr
}()
client, err := NewDebugClient()
if err != nil {
t.Fatalf("failed to create debug client: %s", err)
}
const rowsN = 100
var sent int
for i := 0; i < rowsN; i++ {
s := prompbmarshal.TimeSeries{
Samples: []prompbmarshal.Sample{{
Value: float64(i),
Timestamp: time.Now().Unix(),
}},
}
err := client.Push(s)
if err != nil {
t.Fatalf("unexpected err: %s", err)
}
if err == nil {
sent++
}
}
if sent == 0 {
t.Fatalf("0 series sent")
}
if err := client.Close(); err != nil {
t.Fatalf("failed to close client: %s", err)
}
got := testSrv.accepted()
if got != sent {
t.Fatalf("expected to have %d series; got %d", sent, got)
}
}

View file

@ -1,320 +1,13 @@
package remotewrite
import (
"bytes"
"context"
"flag"
"fmt"
"io"
"net/http"
"path"
"strings"
"sync"
"time"
"github.com/golang/snappy"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/metrics"
)
var (
disablePathAppend = flag.Bool("remoteWrite.disablePathAppend", false, "Whether to disable automatic appending of '/api/v1/write' path to the configured -remoteWrite.url.")
sendTimeout = flag.Duration("remoteWrite.sendTimeout", 30*time.Second, "Timeout for sending data to the configured -remoteWrite.url.")
retryMinInterval = flag.Duration("remoteWrite.retryMinInterval", time.Second, "The minimum delay between retry attempts. Every next retry attempt will double the delay to prevent hammering of remote database. See also -remoteWrite.retryMaxInterval")
retryMaxTime = flag.Duration("remoteWrite.retryMaxTime", time.Second*30, "The max time spent on retry attempts for the failed remote-write request. Change this value if it is expected for remoteWrite.url to be unreachable for more than -remoteWrite.retryMaxTime. See also -remoteWrite.retryMinInterval")
)
// Client is an asynchronous HTTP client for writing
// timeseries via remote write protocol.
type Client struct {
addr string
c *http.Client
authCfg *promauth.Config
input chan prompbmarshal.TimeSeries
flushInterval time.Duration
maxBatchSize int
maxQueueSize int
wg sync.WaitGroup
doneCh chan struct{}
}
// Config is config for remote write.
type Config struct {
// Addr of remote storage
Addr string
AuthCfg *promauth.Config
// Concurrency defines number of readers that
// concurrently read from the queue and flush data
Concurrency int
// MaxBatchSize defines max number of timeseries
// to be flushed at once
MaxBatchSize int
// MaxQueueSize defines max length of input queue
// populated by Push method.
// Push will be rejected once queue is full.
MaxQueueSize int
// FlushInterval defines time interval for flushing batches
FlushInterval time.Duration
// Transport will be used by the underlying http.Client
Transport *http.Transport
}
const (
defaultConcurrency = 4
defaultMaxBatchSize = 1e3
defaultMaxQueueSize = 1e5
defaultFlushInterval = 5 * time.Second
defaultWriteTimeout = 30 * time.Second
)
// NewClient returns asynchronous client for
// writing timeseries via remotewrite protocol.
func NewClient(ctx context.Context, cfg Config) (*Client, error) {
if cfg.Addr == "" {
return nil, fmt.Errorf("config.Addr can't be empty")
}
if cfg.MaxBatchSize == 0 {
cfg.MaxBatchSize = defaultMaxBatchSize
}
if cfg.MaxQueueSize == 0 {
cfg.MaxQueueSize = defaultMaxQueueSize
}
if cfg.FlushInterval == 0 {
cfg.FlushInterval = defaultFlushInterval
}
if cfg.Transport == nil {
cfg.Transport = http.DefaultTransport.(*http.Transport).Clone()
}
cc := defaultConcurrency
if cfg.Concurrency > 0 {
cc = cfg.Concurrency
}
c := &Client{
c: &http.Client{
Timeout: *sendTimeout,
Transport: cfg.Transport,
},
addr: strings.TrimSuffix(cfg.Addr, "/"),
authCfg: cfg.AuthCfg,
flushInterval: cfg.FlushInterval,
maxBatchSize: cfg.MaxBatchSize,
maxQueueSize: cfg.MaxQueueSize,
doneCh: make(chan struct{}),
input: make(chan prompbmarshal.TimeSeries, cfg.MaxQueueSize),
}
for i := 0; i < cc; i++ {
c.run(ctx)
}
return c, nil
}
// Push adds timeseries into queue for writing into remote storage.
// Push returns and error if client is stopped or if queue is full.
func (c *Client) Push(s prompbmarshal.TimeSeries) error {
select {
case <-c.doneCh:
return fmt.Errorf("client is closed")
case c.input <- s:
return nil
default:
return fmt.Errorf("failed to push timeseries - queue is full (%d entries). "+
"Queue size is controlled by -remoteWrite.maxQueueSize flag",
c.maxQueueSize)
}
}
// Close stops the client and waits for all goroutines
// to exit.
func (c *Client) Close() error {
if c.doneCh == nil {
return fmt.Errorf("client is already closed")
}
close(c.input)
close(c.doneCh)
c.wg.Wait()
return nil
}
func (c *Client) run(ctx context.Context) {
ticker := time.NewTicker(c.flushInterval)
wr := &prompbmarshal.WriteRequest{}
shutdown := func() {
for ts := range c.input {
wr.Timeseries = append(wr.Timeseries, ts)
}
lastCtx, cancel := context.WithTimeout(context.Background(), defaultWriteTimeout)
logger.Infof("shutting down remote write client and flushing remained %d series", len(wr.Timeseries))
c.flush(lastCtx, wr)
cancel()
}
c.wg.Add(1)
go func() {
defer c.wg.Done()
defer ticker.Stop()
for {
select {
case <-c.doneCh:
shutdown()
return
case <-ctx.Done():
shutdown()
return
case <-ticker.C:
c.flush(ctx, wr)
case ts, ok := <-c.input:
if !ok {
continue
}
wr.Timeseries = append(wr.Timeseries, ts)
if len(wr.Timeseries) >= c.maxBatchSize {
c.flush(ctx, wr)
}
}
}
}()
}
var (
sentRows = metrics.NewCounter(`vmalert_remotewrite_sent_rows_total`)
sentBytes = metrics.NewCounter(`vmalert_remotewrite_sent_bytes_total`)
sendDuration = metrics.NewFloatCounter(`vmalert_remotewrite_send_duration_seconds_total`)
droppedRows = metrics.NewCounter(`vmalert_remotewrite_dropped_rows_total`)
droppedBytes = metrics.NewCounter(`vmalert_remotewrite_dropped_bytes_total`)
bufferFlushDuration = metrics.NewHistogram(`vmalert_remotewrite_flush_duration_seconds`)
_ = metrics.NewGauge(`vmalert_remotewrite_concurrency`, func() float64 {
return float64(*concurrency)
})
)
// flush is a blocking function that marshals WriteRequest and sends
// it to remote-write endpoint. Flush performs limited amount of retries
// if request fails.
func (c *Client) flush(ctx context.Context, wr *prompbmarshal.WriteRequest) {
if len(wr.Timeseries) < 1 {
return
}
defer prompbmarshal.ResetWriteRequest(wr)
defer bufferFlushDuration.UpdateDuration(time.Now())
data, err := wr.Marshal()
if err != nil {
logger.Errorf("failed to marshal WriteRequest: %s", err)
return
}
b := snappy.Encode(nil, data)
retryInterval, maxRetryInterval := *retryMinInterval, *retryMaxTime
if retryInterval > maxRetryInterval {
retryInterval = maxRetryInterval
}
timeStart := time.Now()
defer sendDuration.Add(time.Since(timeStart).Seconds())
L:
for attempts := 0; ; attempts++ {
err := c.send(ctx, b)
if err == nil {
sentRows.Add(len(wr.Timeseries))
sentBytes.Add(len(b))
return
}
_, isNotRetriable := err.(*nonRetriableError)
logger.Warnf("attempt %d to send request failed: %s (retriable: %v)", attempts+1, err, !isNotRetriable)
if isNotRetriable {
// exit fast if error isn't retriable
break
}
// check if request has been cancelled before backoff
select {
case <-ctx.Done():
logger.Errorf("interrupting retry attempt %d: context cancelled", attempts+1)
break L
default:
}
timeLeftForRetries := maxRetryInterval - time.Since(timeStart)
if timeLeftForRetries < 0 {
// the max retry time has passed, so we give up
break
}
if retryInterval > timeLeftForRetries {
retryInterval = timeLeftForRetries
}
// sleeping to prevent remote db hammering
time.Sleep(retryInterval)
retryInterval *= 2
}
droppedRows.Add(len(wr.Timeseries))
droppedBytes.Add(len(b))
logger.Errorf("attempts to send remote-write request failed - dropping %d time series",
len(wr.Timeseries))
}
func (c *Client) send(ctx context.Context, data []byte) error {
r := bytes.NewReader(data)
req, err := http.NewRequest(http.MethodPost, c.addr, r)
if err != nil {
return fmt.Errorf("failed to create new HTTP request: %w", err)
}
// RFC standard compliant headers
req.Header.Set("Content-Encoding", "snappy")
req.Header.Set("Content-Type", "application/x-protobuf")
// Prometheus compliant headers
req.Header.Set("X-Prometheus-Remote-Write-Version", "0.1.0")
if c.authCfg != nil {
c.authCfg.SetHeaders(req, true)
}
if !*disablePathAppend {
req.URL.Path = path.Join(req.URL.Path, "/api/v1/write")
}
resp, err := c.c.Do(req.WithContext(ctx))
if err != nil {
return fmt.Errorf("error while sending request to %s: %w; Data len %d(%d)",
req.URL.Redacted(), err, len(data), r.Size())
}
defer func() { _ = resp.Body.Close() }()
body, _ := io.ReadAll(resp.Body)
// according to https://prometheus.io/docs/concepts/remote_write_spec/
// Prometheus remote Write compatible receivers MUST
switch resp.StatusCode / 100 {
case 2:
// respond with a HTTP 2xx status code when the write is successful.
return nil
case 4:
if resp.StatusCode != http.StatusTooManyRequests {
// MUST NOT retry write requests on HTTP 4xx responses other than 429
return &nonRetriableError{fmt.Errorf("unexpected response code %d for %s. Response body %q",
resp.StatusCode, req.URL.Redacted(), body)}
}
fallthrough
default:
return fmt.Errorf("unexpected response code %d for %s. Response body %q",
resp.StatusCode, req.URL.Redacted(), body)
}
}
type nonRetriableError struct {
err error
}
func (e *nonRetriableError) Error() string {
return e.err.Error()
// RWClient represents an HTTP client for pushing data via remote write protocol
type RWClient interface {
// Push pushes the given time series to remote storage
Push(s prompbmarshal.TimeSeries) error
// Close stops the client. Client can't be reused after Close call.
Close() error
}
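The new `RWClient` interface makes the remote-write sink pluggable: both `Client` and `DebugClient` above satisfy it. As an illustration only, a hypothetical in-memory implementation for tests could look like this:

```go
package remotewrite

import (
	"sync"

	"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
)

// memClient is a hypothetical in-memory RWClient which only collects
// pushed series so they can be inspected in tests.
type memClient struct {
	mu  sync.Mutex
	tss []prompbmarshal.TimeSeries
}

// Push stores the given time series in memory.
func (m *memClient) Push(s prompbmarshal.TimeSeries) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.tss = append(m.tss, s)
	return nil
}

// Close is a no-op for the in-memory client.
func (m *memClient) Close() error { return nil }

// compile-time check that memClient satisfies RWClient.
var _ RWClient = (*memClient)(nil)
```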

View file

@ -33,7 +33,7 @@ var (
"Progress bar rendering might be verbose or break the logs parsing, so it is recommended to be disabled when not used in interactive mode.")
)
func replay(groupsCfg []config.Group, qb datasource.QuerierBuilder, rw *remotewrite.Client) error {
func replay(groupsCfg []config.Group, qb datasource.QuerierBuilder, rw remotewrite.RWClient) error {
if *replayMaxDatapoints < 1 {
return fmt.Errorf("replay.maxDatapointsPerQuery can't be lower than 1")
}
@ -78,7 +78,7 @@ func replay(groupsCfg []config.Group, qb datasource.QuerierBuilder, rw *remotewr
return nil
}
func (g *Group) replay(start, end time.Time, rw *remotewrite.Client) int {
func (g *Group) replay(start, end time.Time, rw remotewrite.RWClient) int {
var total int
step := g.Interval * time.Duration(*replayMaxDatapoints)
ri := rangeIterator{start: start, end: end, step: step}
@ -119,7 +119,7 @@ func (g *Group) replay(start, end time.Time, rw *remotewrite.Client) int {
return total
}
func replayRule(rule Rule, start, end time.Time, rw *remotewrite.Client) (int, error) {
func replayRule(rule Rule, start, end time.Time, rw remotewrite.RWClient) (int, error) {
var err error
var tss []prompbmarshal.TimeSeries
for i := 0; i < *replayRuleRetryAttempts; i++ {

View file

@ -0,0 +1,40 @@
package main
import (
"testing"
)
func TestUnitRule(t *testing.T) {
testCases := []struct {
name string
disableGroupLabel bool
files []string
failed bool
}{
{
name: "run multi files",
files: []string{"./unittest/testdata/test1.yaml", "./unittest/testdata/test2.yaml"},
failed: false,
},
{
name: "disable group label",
disableGroupLabel: true,
files: []string{"./unittest/testdata/disable-group-label.yaml"},
failed: false,
},
{
name: "failing test",
files: []string{"./unittest/testdata/failed-test.yaml"},
failed: true,
},
}
for _, tc := range testCases {
oldFlag := *disableAlertGroupLabel
*disableAlertGroupLabel = tc.disableGroupLabel
fail := unitRule(tc.files...)
if fail != tc.failed {
t.Fatalf("failed to test %s, expect %t, got %t", tc.name, tc.failed, fail)
}
*disableAlertGroupLabel = oldFlag
}
}

436
app/vmalert/unittest.go Normal file
View file

@ -0,0 +1,436 @@
package main
import (
"context"
"flag"
"fmt"
"net/http"
"os"
"path/filepath"
"reflect"
"sort"
"time"
"gopkg.in/yaml.v2"
vmalertconfig "github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/unittest"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/promremotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/prometheus"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/promql"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promutils"
"github.com/VictoriaMetrics/metrics"
)
var (
storagePath string
// insert series from 1970-01-01T00:00:00
testStartTime = time.Unix(0, 0).UTC()
testPromWriteHTTPPath = "http://127.0.0.1" + *httpListenAddr + "/api/v1/write"
testDataSourcePath = "http://127.0.0.1" + *httpListenAddr + "/prometheus"
testRemoteWritePath = "http://127.0.0.1" + *httpListenAddr
testHealthHTTPPath = "http://127.0.0.1" + *httpListenAddr + "/health"
)
const (
testStoragePath = "vmalert-unittest"
testLogLevel = "ERROR"
)
func unitRule(files ...string) bool {
storagePath = filepath.Join(os.TempDir(), testStoragePath)
processFlags()
vminsert.Init()
vmselect.Init()
// storagePath will be created again when closing vmselect, so remove it again.
defer fs.MustRemoveAll(storagePath)
defer vminsert.Stop()
defer vmselect.Stop()
return rulesUnitTest(files...)
}
func rulesUnitTest(files ...string) bool {
var failed bool
for _, f := range files {
if err := ruleUnitTest(f); err != nil {
fmt.Println(" FAILED")
fmt.Printf("\nfailed to run unit test for file %q: \n%s", f, err)
failed = true
} else {
fmt.Println(" SUCCESS")
}
}
return failed
}
func ruleUnitTest(filename string) []error {
fmt.Println("\nUnit Testing: ", filename)
b, err := os.ReadFile(filename)
if err != nil {
return []error{fmt.Errorf("failed to read file: %w", err)}
}
var unitTestInp unitTestFile
if err := yaml.UnmarshalStrict(b, &unitTestInp); err != nil {
return []error{fmt.Errorf("failed to unmarshal file: %w", err)}
}
if err := resolveAndGlobFilepaths(filepath.Dir(filename), &unitTestInp); err != nil {
return []error{fmt.Errorf("failed to resolve path for `rule_files`: %w", err)}
}
if unitTestInp.EvaluationInterval.Duration() == 0 {
fmt.Println("evaluation_interval set to 1m by default")
unitTestInp.EvaluationInterval = &promutils.Duration{D: 1 * time.Minute}
}
groupOrderMap := make(map[string]int)
for i, gn := range unitTestInp.GroupEvalOrder {
if _, ok := groupOrderMap[gn]; ok {
return []error{fmt.Errorf("group name repeated in `group_eval_order`: %s", gn)}
}
groupOrderMap[gn] = i
}
testGroups, err := vmalertconfig.Parse(unitTestInp.RuleFiles, nil, true)
if err != nil {
return []error{fmt.Errorf("failed to parse `rule_files`: %w", err)}
}
var errs []error
for _, t := range unitTestInp.Tests {
if err := verifyTestGroup(t); err != nil {
errs = append(errs, err)
continue
}
testErrs := t.test(unitTestInp.EvaluationInterval.Duration(), groupOrderMap, testGroups)
errs = append(errs, testErrs...)
}
if len(errs) > 0 {
return errs
}
return nil
}
func verifyTestGroup(group testGroup) error {
var testGroupName string
if group.TestGroupName != "" {
testGroupName = fmt.Sprintf("testGroupName: %s\n", group.TestGroupName)
}
for _, at := range group.AlertRuleTests {
if at.Alertname == "" {
return fmt.Errorf("\n%s missing required filed \"alertname\"", testGroupName)
}
if !*disableAlertGroupLabel && at.GroupName == "" {
return fmt.Errorf("\n%s missing required filed \"groupname\" when flag \"disableAlertGroupLabel\" is false", testGroupName)
}
if at.EvalTime == nil {
return fmt.Errorf("\n%s missing required filed \"eval_time\"", testGroupName)
}
}
for _, et := range group.MetricsqlExprTests {
if et.Expr == "" {
return fmt.Errorf("\n%s missing required filed \"expr\"", testGroupName)
}
if et.EvalTime == nil {
return fmt.Errorf("\n%s missing required filed \"eval_time\"", testGroupName)
}
}
return nil
}
func processFlags() {
flag.Parse()
for _, fv := range []struct {
flag string
value string
}{
{flag: "storageDataPath", value: storagePath},
{flag: "loggerLevel", value: testLogLevel},
{flag: "search.disableCache", value: "true"},
// set storage retention time to 100 years in order to allow storing series from 1970-01-01T00:00:00.
{flag: "retentionPeriod", value: "100y"},
{flag: "datasource.url", value: testDataSourcePath},
{flag: "remoteWrite.url", value: testRemoteWritePath},
} {
// panics if flag doesn't exist
if err := flag.Lookup(fv.flag).Value.Set(fv.value); err != nil {
logger.Fatalf("unable to set %q with value %q, err: %v", fv.flag, fv.value, err)
}
}
}
func setUp() {
vmstorage.Init(promql.ResetRollupResultCacheIfNeeded)
go httpserver.Serve(*httpListenAddr, false, func(w http.ResponseWriter, r *http.Request) bool {
switch r.URL.Path {
case "/prometheus/api/v1/query":
if err := prometheus.QueryHandler(nil, time.Now(), w, r); err != nil {
httpserver.Errorf(w, r, "%s", err)
}
return true
case "/prometheus/api/v1/write", "/api/v1/write":
if err := promremotewrite.InsertHandler(r); err != nil {
httpserver.Errorf(w, r, "%s", err)
}
return true
default:
}
return false
})
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
readyCheckFunc := func() bool {
resp, err := http.Get(testHealthHTTPPath)
if err != nil {
return false
}
_ = resp.Body.Close()
return resp.StatusCode == 200
}
checkCheck:
for {
select {
case <-ctx.Done():
logger.Fatalf("http server can't be ready in 30s")
default:
if readyCheckFunc() {
break checkCheck
}
time.Sleep(3 * time.Second)
}
}
}
func tearDown() {
if err := httpserver.Stop(*httpListenAddr); err != nil {
logger.Errorf("cannot stop the webservice: %s", err)
}
vmstorage.Stop()
metrics.UnregisterAllMetrics()
fs.MustRemoveAll(storagePath)
}
// resolveAndGlobFilepaths joins all relative paths in a configuration
// with a given base directory and replaces all globs with matching files.
func resolveAndGlobFilepaths(baseDir string, utf *unitTestFile) error {
for i, rf := range utf.RuleFiles {
if rf != "" && !filepath.IsAbs(rf) {
utf.RuleFiles[i] = filepath.Join(baseDir, rf)
}
}
var globbedFiles []string
for _, rf := range utf.RuleFiles {
m, err := filepath.Glob(rf)
if err != nil {
return err
}
if len(m) == 0 {
fmt.Fprintln(os.Stderr, " WARNING: no files match pattern", rf)
}
globbedFiles = append(globbedFiles, m...)
}
utf.RuleFiles = globbedFiles
return nil
}
func (tg *testGroup) test(evalInterval time.Duration, groupOrderMap map[string]int, testGroups []vmalertconfig.Group) (checkErrs []error) {
// set up vmstorage and http server for ingest and read queries
setUp()
// tear down vmstorage and clean the data dir
defer tearDown()
err := unittest.WriteInputSeries(tg.InputSeries, tg.Interval, testStartTime, testPromWriteHTTPPath)
if err != nil {
return []error{err}
}
q, err := datasource.Init(nil)
if err != nil {
return []error{fmt.Errorf("failed to init datasource: %v", err)}
}
rw, err := remotewrite.NewDebugClient()
if err != nil {
return []error{fmt.Errorf("failed to init wr: %v", err)}
}
alertEvalTimesMap := map[time.Duration]struct{}{}
alertExpResultMap := map[time.Duration]map[string]map[string][]unittest.ExpAlert{}
for _, at := range tg.AlertRuleTests {
et := at.EvalTime.Duration()
alertEvalTimesMap[et] = struct{}{}
if _, ok := alertExpResultMap[et]; !ok {
alertExpResultMap[et] = make(map[string]map[string][]unittest.ExpAlert)
}
if _, ok := alertExpResultMap[et][at.GroupName]; !ok {
alertExpResultMap[et][at.GroupName] = make(map[string][]unittest.ExpAlert)
}
alertExpResultMap[et][at.GroupName][at.Alertname] = at.ExpAlerts
}
alertEvalTimes := make([]time.Duration, 0, len(alertEvalTimesMap))
for k := range alertEvalTimesMap {
alertEvalTimes = append(alertEvalTimes, k)
}
sort.Slice(alertEvalTimes, func(i, j int) bool {
return alertEvalTimes[i] < alertEvalTimes[j]
})
// sort group eval order according to the given "group_eval_order".
sort.Slice(testGroups, func(i, j int) bool {
return groupOrderMap[testGroups[i].Name] < groupOrderMap[testGroups[j].Name]
})
// create groups with given rule
var groups []*Group
for _, group := range testGroups {
ng := newGroup(group, q, *evaluationInterval, tg.ExternalLabels)
groups = append(groups, ng)
}
e := &executor{
rw: rw,
notifiers: func() []notifier.Notifier { return nil },
previouslySentSeriesToRW: make(map[uint64]map[string][]prompbmarshal.Label),
}
evalIndex := 0
maxEvalTime := testStartTime.Add(tg.maxEvalTime())
for ts := testStartTime; ts.Before(maxEvalTime) || ts.Equal(maxEvalTime); ts = ts.Add(evalInterval) {
for _, g := range groups {
resolveDuration := getResolveDuration(g.Interval, *resendDelay, *maxResolveDuration)
errs := e.execConcurrently(context.Background(), g.Rules, ts, g.Concurrency, resolveDuration, g.Limit)
for err := range errs {
if err != nil {
checkErrs = append(checkErrs, fmt.Errorf("\nfailed to exec group: %q, time: %s, err: %w", g.Name,
ts, err))
}
}
// flush series after each group evaluation
vmstorage.Storage.DebugFlush()
}
// check alert_rule_test case at every eval time
for evalIndex < len(alertEvalTimes) {
if ts.Sub(testStartTime) > alertEvalTimes[evalIndex] ||
alertEvalTimes[evalIndex] >= ts.Add(evalInterval).Sub(testStartTime) {
break
}
gotAlertsMap := map[string]map[string]unittest.LabelsAndAnnotations{}
for _, g := range groups {
if *disableAlertGroupLabel {
g.Name = ""
}
if _, ok := alertExpResultMap[time.Duration(ts.UnixNano())][g.Name]; !ok {
continue
}
if _, ok := gotAlertsMap[g.Name]; !ok {
gotAlertsMap[g.Name] = make(map[string]unittest.LabelsAndAnnotations)
}
for _, rule := range g.Rules {
ar, isAlertRule := rule.(*AlertingRule)
if !isAlertRule {
continue
}
if _, ok := alertExpResultMap[time.Duration(ts.UnixNano())][g.Name][ar.Name]; ok {
for _, got := range ar.alerts {
if got.State != notifier.StateFiring {
continue
}
laa := unittest.LabelAndAnnotation{
Labels: datasource.ConvertToLabels(got.Labels),
Annotations: datasource.ConvertToLabels(got.Annotations),
}
gotAlertsMap[g.Name][ar.Name] = append(gotAlertsMap[g.Name][ar.Name], laa)
}
}
}
}
for groupname, gres := range alertExpResultMap[alertEvalTimes[evalIndex]] {
for alertname, res := range gres {
var expAlerts unittest.LabelsAndAnnotations
for _, expAlert := range res {
if expAlert.ExpLabels == nil {
expAlert.ExpLabels = make(map[string]string)
}
// alertGroupNameLabel is added as an additional label when `disableAlertGroupLabel` is false
if !*disableAlertGroupLabel {
expAlert.ExpLabels[alertGroupNameLabel] = groupname
}
// alertNameLabel is added as an additional label in vmalert.
expAlert.ExpLabels[alertNameLabel] = alertname
expAlerts = append(expAlerts, unittest.LabelAndAnnotation{
Labels: datasource.ConvertToLabels(expAlert.ExpLabels),
Annotations: datasource.ConvertToLabels(expAlert.ExpAnnotations),
})
}
sort.Sort(expAlerts)
gotAlerts := gotAlertsMap[groupname][alertname]
sort.Sort(gotAlerts)
if !reflect.DeepEqual(expAlerts, gotAlerts) {
var testGroupName string
if tg.TestGroupName != "" {
testGroupName = fmt.Sprintf("testGroupName: %s,\n", tg.TestGroupName)
}
expString := unittest.IndentLines(expAlerts.String(), " ")
gotString := unittest.IndentLines(gotAlerts.String(), " ")
checkErrs = append(checkErrs, fmt.Errorf("\n%s groupname: %s, alertname: %s, time: %s, \n exp:%v, \n got:%v ",
testGroupName, groupname, alertname, alertEvalTimes[evalIndex].String(), expString, gotString))
}
}
}
evalIndex++
}
}
checkErrs = append(checkErrs, unittest.CheckMetricsqlCase(tg.MetricsqlExprTests, q)...)
return checkErrs
}
// unitTestFile holds the contents of a single unit test file
type unitTestFile struct {
RuleFiles []string `yaml:"rule_files"`
EvaluationInterval *promutils.Duration `yaml:"evaluation_interval"`
GroupEvalOrder []string `yaml:"group_eval_order"`
Tests []testGroup `yaml:"tests"`
}
// testGroup is a group of input series and test cases associated with it
type testGroup struct {
Interval *promutils.Duration `yaml:"interval"`
InputSeries []unittest.Series `yaml:"input_series"`
AlertRuleTests []unittest.AlertTestCase `yaml:"alert_rule_test"`
MetricsqlExprTests []unittest.MetricsqlTestCase `yaml:"metricsql_expr_test"`
ExternalLabels map[string]string `yaml:"external_labels"`
TestGroupName string `yaml:"name"`
}
// maxEvalTime returns the max eval time among all alert_rule_test and metricsql_expr_test
func (tg *testGroup) maxEvalTime() time.Duration {
var maxd time.Duration
for _, alert := range tg.AlertRuleTests {
if alert.EvalTime.Duration() > maxd {
maxd = alert.EvalTime.Duration()
}
}
for _, met := range tg.MetricsqlExprTests {
if met.EvalTime.Duration() > maxd {
maxd = met.EvalTime.Duration()
}
}
return maxd
}

View file

@ -0,0 +1,19 @@
package unittest
import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promutils"
)
// AlertTestCase holds alert_rule_test cases defined in test file
type AlertTestCase struct {
EvalTime *promutils.Duration `yaml:"eval_time"`
GroupName string `yaml:"groupname"`
Alertname string `yaml:"alertname"`
ExpAlerts []ExpAlert `yaml:"exp_alerts"`
}
// ExpAlert holds exp_alerts defined in test file
type ExpAlert struct {
ExpLabels map[string]string `yaml:"exp_labels"`
ExpAnnotations map[string]string `yaml:"exp_annotations"`
}

View file

@ -0,0 +1,182 @@
package unittest
import (
"bytes"
"fmt"
"io"
"net/http"
"regexp"
"strconv"
"strings"
"time"
testutil "github.com/VictoriaMetrics/VictoriaMetrics/app/victoria-metrics/test"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/decimal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promutils"
"github.com/VictoriaMetrics/metricsql"
)
// Series holds input_series defined in the test file
type Series struct {
Series string `yaml:"series"`
Values string `yaml:"values"`
}
// sequenceValue is an omittable value in a sequence of time series values.
type sequenceValue struct {
Value float64
Omitted bool
}
func httpWrite(address string, r io.Reader) {
resp, err := http.Post(address, "", r)
if err != nil {
logger.Fatalf("failed to send to storage: %v", err)
}
resp.Body.Close()
}
// WriteInputSeries sends input series to vmstorage and flushes them
func WriteInputSeries(input []Series, interval *promutils.Duration, startStamp time.Time, dst string) error {
r := testutil.WriteRequest{}
for _, data := range input {
expr, err := metricsql.Parse(data.Series)
if err != nil {
return fmt.Errorf("failed to parse series %s: %v", data.Series, err)
}
promvals, err := parseInputValue(data.Values, true)
if err != nil {
return fmt.Errorf("failed to parse input series value %s: %v", data.Values, err)
}
metricExpr, ok := expr.(*metricsql.MetricExpr)
if !ok {
return fmt.Errorf("failed to parse series %s to metric expr: %v", data.Series, err)
}
samples := make([]testutil.Sample, 0, len(promvals))
ts := startStamp
for _, v := range promvals {
if !v.Omitted {
samples = append(samples, testutil.Sample{
Timestamp: ts.UnixMilli(),
Value: v.Value,
})
}
ts = ts.Add(interval.Duration())
}
var ls []testutil.Label
for _, filter := range metricExpr.LabelFilterss[0] {
ls = append(ls, testutil.Label{Name: filter.Label, Value: filter.Value})
}
r.Timeseries = append(r.Timeseries, testutil.TimeSeries{Labels: ls, Samples: samples})
}
data, err := testutil.Compress(r)
if err != nil {
return fmt.Errorf("failed to compress data: %v", err)
}
// write input series to vm
httpWrite(dst, bytes.NewBuffer(data))
vmstorage.Storage.DebugFlush()
return nil
}
// parseInputValue supports input like "1", "1+1x1 _ -4 3+20x1"; see more examples in the tests.
func parseInputValue(input string, origin bool) ([]sequenceValue, error) {
var res []sequenceValue
items := strings.Split(input, " ")
reg := regexp.MustCompile(`\D?\d*\D?`)
for _, item := range items {
if item == "stale" {
res = append(res, sequenceValue{Value: decimal.StaleNaN})
continue
}
vals := reg.FindAllString(item, -1)
switch len(vals) {
case 1:
if vals[0] == "_" {
res = append(res, sequenceValue{Omitted: true})
continue
}
v, err := strconv.ParseFloat(vals[0], 64)
if err != nil {
return nil, err
}
res = append(res, sequenceValue{Value: v})
continue
case 2:
p1 := vals[0][:len(vals[0])-1]
v2, err := strconv.ParseInt(vals[1], 10, 64)
if err != nil {
return nil, err
}
option := vals[0][len(vals[0])-1]
switch option {
case '+':
v1, err := strconv.ParseFloat(p1, 64)
if err != nil {
return nil, err
}
res = append(res, sequenceValue{Value: v1 + float64(v2)})
case 'x':
for i := int64(0); i <= v2; i++ {
if p1 == "_" {
if i == 0 {
i = 1
}
res = append(res, sequenceValue{Omitted: true})
continue
}
v1, err := strconv.ParseFloat(p1, 64)
if err != nil {
return nil, err
}
if !origin || v1 == 0 {
res = append(res, sequenceValue{Value: v1 * float64(i)})
continue
}
newVal := fmt.Sprintf("%s+0x%s", p1, vals[1])
newRes, err := parseInputValue(newVal, false)
if err != nil {
return nil, err
}
res = append(res, newRes...)
break
}
default:
return nil, fmt.Errorf("got invalid operation %b", option)
}
case 3:
r1, err := parseInputValue(fmt.Sprintf("%s%s", vals[1], vals[2]), false)
if err != nil {
return nil, err
}
p1 := vals[0][:len(vals[0])-1]
v1, err := strconv.ParseFloat(p1, 64)
if err != nil {
return nil, err
}
option := vals[0][len(vals[0])-1]
var isAdd bool
if option == '+' {
isAdd = true
}
for _, r := range r1 {
if isAdd {
res = append(res, sequenceValue{
Value: r.Value + v1,
})
} else {
res = append(res, sequenceValue{
Value: v1 - r.Value,
})
}
}
default:
return nil, fmt.Errorf("unsupported input %s", input)
}
}
return res, nil
}

View file

@ -0,0 +1,93 @@
package unittest
import (
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/decimal"
)
func TestParseInputValue(t *testing.T) {
testCases := []struct {
input string
exp []sequenceValue
failed bool
}{
{
"",
nil,
true,
},
{
"testfailed",
nil,
true,
},
// stale doesn't support operations
{
"stalex3",
nil,
true,
},
{
"-4",
[]sequenceValue{{Value: -4}},
false,
},
{
"_",
[]sequenceValue{{Omitted: true}},
false,
},
{
"stale",
[]sequenceValue{{Value: decimal.StaleNaN}},
false,
},
{
"-4x1",
[]sequenceValue{{Value: -4}, {Value: -4}},
false,
},
{
"_x1",
[]sequenceValue{{Omitted: true}},
false,
},
{
"1+1x4",
[]sequenceValue{{Value: 1}, {Value: 2}, {Value: 3}, {Value: 4}, {Value: 5}},
false,
},
{
"2-1x4",
[]sequenceValue{{Value: 2}, {Value: 1}, {Value: 0}, {Value: -1}, {Value: -2}},
false,
},
{
"1+1x1 _ -4 stale 3+20x1",
[]sequenceValue{{Value: 1}, {Value: 2}, {Omitted: true}, {Value: -4}, {Value: decimal.StaleNaN}, {Value: 3}, {Value: 23}},
false,
},
}
for _, tc := range testCases {
output, err := parseInputValue(tc.input, true)
if err != nil != tc.failed {
t.Fatalf("failed to parse %s, expect %t, got %t", tc.input, tc.failed, err != nil)
}
if len(tc.exp) != len(output) {
t.Fatalf("expect %v, got %v", tc.exp, output)
}
for i := 0; i < len(tc.exp); i++ {
if tc.exp[i].Omitted != output[i].Omitted {
t.Fatalf("expect %v, got %v", tc.exp, output)
}
if tc.exp[i].Value != output[i].Value {
if decimal.IsStaleNaN(tc.exp[i].Value) && decimal.IsStaleNaN(output[i].Value) {
continue
}
t.Fatalf("expect %v, got %v", tc.exp, output)
}
}
}
}

View file

@ -0,0 +1,92 @@
package unittest
import (
"context"
"fmt"
"net/url"
"reflect"
"sort"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promutils"
"github.com/VictoriaMetrics/metricsql"
)
// MetricsqlTestCase holds metricsql_expr_test cases defined in test file
type MetricsqlTestCase struct {
Expr string `yaml:"expr"`
EvalTime *promutils.Duration `yaml:"eval_time"`
ExpSamples []expSample `yaml:"exp_samples"`
}
type expSample struct {
Labels string `yaml:"labels"`
Value float64 `yaml:"value"`
}
// CheckMetricsqlCase will check metricsql_expr_test cases
func CheckMetricsqlCase(cases []MetricsqlTestCase, q datasource.QuerierBuilder) (checkErrs []error) {
queries := q.BuildWithParams(datasource.QuerierParams{QueryParams: url.Values{"nocache": {"1"}, "latency_offset": {"1ms"}}, DataSourceType: "prometheus"})
Outer:
for _, mt := range cases {
result, _, err := queries.Query(context.Background(), mt.Expr, mt.EvalTime.ParseTime())
if err != nil {
checkErrs = append(checkErrs, fmt.Errorf(" expr: %q, time: %s, err: %w", mt.Expr,
mt.EvalTime.Duration().String(), err))
continue
}
var gotSamples []parsedSample
for _, s := range result.Data {
sort.Slice(s.Labels, func(i, j int) bool {
return s.Labels[i].Name < s.Labels[j].Name
})
gotSamples = append(gotSamples, parsedSample{
Labels: s.Labels,
Value: s.Values[0],
})
}
var expSamples []parsedSample
for _, s := range mt.ExpSamples {
expLb := datasource.Labels{}
if s.Labels != "" {
metricsqlExpr, err := metricsql.Parse(s.Labels)
if err != nil {
checkErrs = append(checkErrs, fmt.Errorf("\n expr: %q, time: %s, err: %v", mt.Expr,
mt.EvalTime.Duration().String(), fmt.Errorf("failed to parse labels %q: %w", s.Labels, err)))
continue Outer
}
metricsqlMetricExpr, ok := metricsqlExpr.(*metricsql.MetricExpr)
if !ok {
checkErrs = append(checkErrs, fmt.Errorf("\n expr: %q, time: %s, err: %v", mt.Expr,
mt.EvalTime.Duration().String(), fmt.Errorf("got unsupported metricsql type")))
continue Outer
}
for _, l := range metricsqlMetricExpr.LabelFilterss[0] {
expLb = append(expLb, datasource.Label{
Name: l.Label,
Value: l.Value,
})
}
}
sort.Slice(expLb, func(i, j int) bool {
return expLb[i].Name < expLb[j].Name
})
expSamples = append(expSamples, parsedSample{
Labels: expLb,
Value: s.Value,
})
}
sort.Slice(expSamples, func(i, j int) bool {
return datasource.LabelCompare(expSamples[i].Labels, expSamples[j].Labels) <= 0
})
sort.Slice(gotSamples, func(i, j int) bool {
return datasource.LabelCompare(gotSamples[i].Labels, gotSamples[j].Labels) <= 0
})
if !reflect.DeepEqual(expSamples, gotSamples) {
checkErrs = append(checkErrs, fmt.Errorf("\n expr: %q, time: %s,\n exp: %v\n got: %v", mt.Expr,
mt.EvalTime.Duration().String(), parsedSamplesString(expSamples), parsedSamplesString(gotSamples)))
}
}
return
}

View file

@ -0,0 +1,43 @@
rule_files:
- rules.yaml
evaluation_interval: 1m
tests:
- interval: 1m
input_series:
- series: 'up{job="vmagent2", instance="localhost:9090"}'
values: "0+0x1440"
metricsql_expr_test:
- expr: suquery_interval_test
eval_time: 4m
exp_samples:
- labels: '{__name__="suquery_interval_test",datacenter="dc-123", instance="localhost:9090", job="vmagent2"}'
value: 1
alert_rule_test:
- eval_time: 2h
alertname: InstanceDown
exp_alerts:
- exp_labels:
job: vmagent2
severity: page
instance: localhost:9090
datacenter: dc-123
exp_annotations:
summary: "Instance localhost:9090 down"
description: "localhost:9090 of job vmagent2 has been down for more than 5 minutes."
- eval_time: 0
alertname: AlwaysFiring
exp_alerts:
- exp_labels:
datacenter: dc-123
- eval_time: 0
alertname: InstanceDown
exp_alerts: []
external_labels:
datacenter: dc-123

View file

@ -0,0 +1,41 @@
rule_files:
- rules.yaml
tests:
- interval: 1m
name: "Failing test"
input_series:
- series: test
values: "0"
metricsql_expr_test:
- expr: test
eval_time: 0m
exp_samples:
- value: 0
labels: test
# will fail because there is no "Test" group or rule defined
alert_rule_test:
- eval_time: 0m
groupname: Test
alertname: Test
exp_alerts:
- exp_labels: {}
- interval: 1m
name: Failing alert test
input_series:
- series: 'up{job="test"}'
values: 0x10
alert_rule_test:
# will fail because the rule is firing
- eval_time: 5m
groupname: group1
alertname: InstanceDown
exp_alerts: []
# will fail because of the missing groupname
- eval_time: 5m
alertname: AlwaysFiring
exp_alerts: []

View file

@ -0,0 +1,30 @@
# can be executed successfully but takes more than 1 minute,
# so it is not included in the unit tests for now
evaluation_interval: 100d
rule_files:
- rules.yaml
tests:
- interval: 1d
input_series:
- series: test
# Max time in time.Duration is 106751d from 1970 (2^63/10^9), i.e. 2262.
# But VictoriaMetrics limits maxTimestamp to +2 days from now, see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/827.
# We input series up to 2024-01-01T00:00:00 here.
values: "0+1x19723"
metricsql_expr_test:
- expr: timestamp(test)
eval_time: 0m
exp_samples:
- value: 0
- expr: test
eval_time: 100d
exp_samples:
- labels: test
value: 100
- expr: timestamp(test)
eval_time: 19000d
exp_samples:
- value: 1641600000 # 19000d -> seconds.

View file

@ -0,0 +1,39 @@
groups:
- name: group1
rules:
- alert: InstanceDown
expr: up == 0
for: 5m
labels:
severity: page
annotations:
summary: "Instance {{ $labels.instance }} down"
description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes."
- alert: AlwaysFiring
expr: 1
- alert: SameAlertNameWithDifferentGroup
expr: absent(test)
for: 1m
- name: group2
rules:
- record: t1
expr: test
- record: job:test:count_over_time1m
expr: sum without(instance) (count_over_time(test[1m]))
- record: suquery_interval_test
expr: count_over_time(up[5m:])
- alert: SameAlertNameWithDifferentGroup
expr: absent(test)
for: 5m
- name: group3
rules:
- record: t2
expr: t1
- name: group4
rules:
- record: t3
expr: t1

View file

@ -0,0 +1,99 @@
rule_files:
- rules.yaml
evaluation_interval: 1m
group_eval_order: ["group4", "group2", "group3"]
tests:
- interval: 1m
name: "basic test"
input_series:
- series: "test"
values: "_x5 1x5 _ stale"
alert_rule_test:
- eval_time: 1m
groupname: group1
alertname: SameAlertNameWithDifferentGroup
exp_alerts:
- {}
- eval_time: 1m
groupname: group2
alertname: SameAlertNameWithDifferentGroup
exp_alerts: []
- eval_time: 6m
groupname: group1
alertname: SameAlertNameWithDifferentGroup
exp_alerts: []
metricsql_expr_test:
- expr: test
eval_time: 11m
exp_samples:
- labels: '{__name__="test"}'
value: 1
- expr: test
eval_time: 12m
exp_samples: []
- interval: 1m
name: "basic test2"
input_series:
- series: 'up{job="vmagent1", instance="localhost:9090"}'
values: "0+0x1440"
- series: "test"
values: "0+1x1440"
metricsql_expr_test:
- expr: count(ALERTS) by (alertgroup, alertname, alertstate)
eval_time: 4m
exp_samples:
- labels: '{alertgroup="group1", alertname="AlwaysFiring", alertstate="firing"}'
value: 1
- labels: '{alertgroup="group1", alertname="InstanceDown", alertstate="pending"}'
value: 1
- expr: t1
eval_time: 4m
exp_samples:
- value: 4
labels: '{__name__="t1", datacenter="dc-123"}'
- expr: t2
eval_time: 4m
exp_samples:
- value: 4
labels: '{__name__="t2", datacenter="dc-123"}'
- expr: t3
eval_time: 4m
exp_samples:
# t3 is 3 instead of 4 because group4 (which records t3) is evaluated before group2 (which updates t1)
- value: 3
labels: '{__name__="t3", datacenter="dc-123"}'
alert_rule_test:
- eval_time: 10m
groupname: group1
alertname: InstanceDown
exp_alerts:
- exp_labels:
job: vmagent1
severity: page
instance: localhost:9090
datacenter: dc-123
exp_annotations:
summary: "Instance localhost:9090 down"
description: "localhost:9090 of job vmagent1 has been down for more than 5 minutes."
- eval_time: 0
groupname: group1
alertname: AlwaysFiring
exp_alerts:
- exp_labels:
datacenter: dc-123
- eval_time: 0
groupname: alerts
alertname: InstanceDown
exp_alerts: []
external_labels:
datacenter: dc-123

View file

@ -0,0 +1,46 @@
rule_files:
- rules.yaml
evaluation_interval: 1m
tests:
- interval: 1m
input_series:
- series: 'up{job="vmagent2", instance="localhost:9090"}'
values: "0+0x1440"
metricsql_expr_test:
- expr: suquery_interval_test
eval_time: 4m
exp_samples:
- labels: '{__name__="suquery_interval_test",datacenter="dc-123", instance="localhost:9090", job="vmagent2"}'
value: 1
alert_rule_test:
- eval_time: 2h
groupname: group1
alertname: InstanceDown
exp_alerts:
- exp_labels:
job: vmagent2
severity: page
instance: localhost:9090
datacenter: dc-123
exp_annotations:
summary: "Instance localhost:9090 down"
description: "localhost:9090 of job vmagent2 has been down for more than 5 minutes."
- eval_time: 0
groupname: group1
alertname: AlwaysFiring
exp_alerts:
- exp_labels:
datacenter: dc-123
- eval_time: 0
groupname: group1
alertname: InstanceDown
exp_alerts: []
external_labels:
datacenter: dc-123

View file

@ -0,0 +1,83 @@
package unittest
import (
"fmt"
"strconv"
"strings"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
)
// parsedSample is a sample with parsed Labels
type parsedSample struct {
Labels datasource.Labels
Value float64
}
func (ps *parsedSample) String() string {
return ps.Labels.String() + " " + strconv.FormatFloat(ps.Value, 'E', -1, 64)
}
func parsedSamplesString(pss []parsedSample) string {
if len(pss) == 0 {
return "nil"
}
s := pss[0].String()
for _, ps := range pss[1:] {
s += ", " + ps.String()
}
return s
}
// LabelAndAnnotation holds labels and annotations
type LabelAndAnnotation struct {
Labels datasource.Labels
Annotations datasource.Labels
}
func (la *LabelAndAnnotation) String() string {
return "Labels:" + la.Labels.String() + "\nAnnotations:" + la.Annotations.String()
}
// LabelsAndAnnotations is a collection of LabelAndAnnotation
type LabelsAndAnnotations []LabelAndAnnotation
func (la LabelsAndAnnotations) Len() int { return len(la) }
func (la LabelsAndAnnotations) Swap(i, j int) { la[i], la[j] = la[j], la[i] }
func (la LabelsAndAnnotations) Less(i, j int) bool {
diff := datasource.LabelCompare(la[i].Labels, la[j].Labels)
if diff != 0 {
return diff < 0
}
return datasource.LabelCompare(la[i].Annotations, la[j].Annotations) < 0
}
func (la LabelsAndAnnotations) String() string {
if len(la) == 0 {
return "[]"
}
s := "[\n0:" + IndentLines("\n"+la[0].String(), " ")
for i, l := range la[1:] {
s += ",\n" + fmt.Sprintf("%d", i+1) + ":" + IndentLines("\n"+l.String(), " ")
}
s += "\n]"
return s
}
// IndentLines prefixes each line in the supplied string with the given "indent" string.
func IndentLines(lines, indent string) string {
sb := strings.Builder{}
n := strings.Split(lines, "\n")
for i, l := range n {
if i > 0 {
sb.WriteString(indent)
}
sb.WriteString(l)
if i != len(n)-1 {
sb.WriteRune('\n')
}
}
return sb.String()
}

View file

@ -277,13 +277,13 @@ It is recommended using [binary releases](https://github.com/VictoriaMetrics/Vic
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.
2. Run `make vmauth` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
1. Run `make vmauth` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmauth` binary and puts it into the `bin` folder.
### Production build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make vmauth-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
1. Run `make vmauth-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmauth-prod` binary and puts it into the `bin` folder.
### Building docker images
@ -344,9 +344,9 @@ See the docs at https://docs.victoriametrics.com/vmauth.html .
-configCheckInterval duration
interval for config file re-read. Zero value disables config re-reading. By default, refreshing is disabled, send SIGHUP for config refresh.
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default, only IPv4 TCP and UDP is used
Whether to enable IPv6 for listening and dialing. By default, only IPv4 TCP and UDP are used
-envflag.enable
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
Whether to enable reading flags from environment variables in addition to the command line. Command line flag values have priority over values from environment vars. Flags are read only from the command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
@ -368,9 +368,9 @@ See the docs at https://docs.victoriametrics.com/vmauth.html .
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
Password for HTTP server's Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
Username for HTTP server's Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
TCP address to listen for http connections. See also -httpListenAddr.useProxyProtocol (default ":8427")
-httpListenAddr.useProxyProtocol
@ -380,7 +380,7 @@ See the docs at https://docs.victoriametrics.com/vmauth.html .
-internStringDisableCache
Whether to disable caches for interned strings. This may reduce memory usage at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringCacheExpireDuration and -internStringMaxLen
-internStringMaxLen int
The maximum length for strings to intern. Lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringDisableCache and -internStringCacheExpireDuration (default 500)
The maximum length for strings to intern. A lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringDisableCache and -internStringCacheExpireDuration (default 500)
-logInvalidAuthTokens
Whether to log requests with invalid auth tokens. Such requests are always counted at vmauth_http_request_errors_total{reason="invalid_auth_token"} metric, which is exposed at /metrics page
-loggerDisableTimestamps
@ -406,10 +406,10 @@ See the docs at https://docs.victoriametrics.com/vmauth.html .
-maxIdleConnsPerBackend int
The maximum number of idle connections vmauth can open per each backend host. See also -maxConcurrentRequests (default 100)
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-pprofAuthKey string

View file

@ -94,13 +94,13 @@ See also [vmbackupmanager tool](https://docs.victoriametrics.com/vmbackupmanager
The backup algorithm is the following:
1. Create a snapshot by querying the provided `-snapshot.createURL`
2. Collect information about files in the created snapshot, in the `-dst` and in the `-origin`.
3. Determine which files in `-dst` are missing in the created snapshot, and delete them. These are usually small files, which are already merged into bigger files in the snapshot.
4. Determine which files in the created snapshot are missing in `-dst`. These are usually small new files and bigger merged files.
5. Determine which files from step 3 exist in the `-origin`, and perform server-side copy of these files from `-origin` to `-dst`.
1. Collect information about files in the created snapshot, in the `-dst` and in the `-origin`.
1. Determine which files in `-dst` are missing in the created snapshot, and delete them. These are usually small files, which are already merged into bigger files in the snapshot.
1. Determine which files in the created snapshot are missing in `-dst`. These are usually small new files and bigger merged files.
1. Determine which files from step 3 exist in the `-origin`, and perform server-side copy of these files from `-origin` to `-dst`.
These are usually the biggest and the oldest files, which are shared between backups.
6. Upload the remaining files from step 3 from the created snapshot to `-dst`.
7. Delete the created snapshot.
1. Upload the remaining files from step 3 from the created snapshot to `-dst`.
1. Delete the created snapshot.
The algorithm splits source files into 1 GiB chunks in the backup. Each chunk is stored as a separate file in the backup.
Such splitting balances between the number of files in the backup and the amounts of data that needs to be re-transferred after temporary errors.
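For illustration, a single incremental backup run that follows the algorithm above may look like this (a minimal sketch; the VictoriaMetrics address, bucket and paths are placeholders to adapt to your setup):
```console
./vmbackup-prod \
  -storageDataPath=/victoria-metrics-data \
  -snapshot.createURL=http://localhost:8428/snapshot/create \
  -origin=gs://<bucket>/latest \
  -dst=gs://<bucket>/daily/2023-07-26
```
Files already present under `-origin` are copied server-side instead of being re-uploaded, which is what keeps repeated backups cheap.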
@ -190,9 +190,9 @@ See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-
Where to put the backup on the remote storage. Example: gs://bucket/path/to/backup, s3://bucket/path/to/backup, azblob://container/path/to/backup or fs:///path/to/local/backup/dir
-dst can point to the previous backup. In this case incremental backup is performed, i.e. only changed data is uploaded
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default, only IPv4 TCP and UDP is used
Whether to enable IPv6 for listening and dialing. By default, only IPv4 TCP and UDP are used
-envflag.enable
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
Whether to enable reading flags from environment variables in addition to the command line. Command line flag values have priority over values from environment vars. Flags are read only from the command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
@ -214,9 +214,9 @@ See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
Password for HTTP server's Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
Username for HTTP server's Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
TCP address for exporting metrics at /metrics page (default ":8420")
-internStringCacheExpireDuration duration
@ -224,7 +224,7 @@ See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-
-internStringDisableCache
Whether to disable caches for interned strings. This may reduce memory usage at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringCacheExpireDuration and -internStringMaxLen
-internStringMaxLen int
The maximum length for strings to intern. Lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringDisableCache and -internStringCacheExpireDuration (default 500)
The maximum length for strings to intern. A lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringDisableCache and -internStringCacheExpireDuration (default 500)
-loggerDisableTimestamps
Whether to disable writing timestamps in logs
-loggerErrorsPerSecondLimit int
@ -245,10 +245,10 @@ See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-
The maximum upload speed. There is no limit if it is set to 0
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-origin string
@ -302,13 +302,13 @@ It is recommended using [binary releases](https://github.com/VictoriaMetrics/Vic
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.
2. Run `make vmbackup` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
1. Run `make vmbackup` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmbackup` binary and puts it into the `bin` folder.
### Production build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make vmbackup-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
1. Run `make vmbackup-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmbackup-prod` binary and puts it into the `bin` folder.
### Building docker images

View file

@ -298,7 +298,7 @@ If restore mark doesn't exist at `storageDataPath`(restore wasn't requested) `vm
$ /vmbackupmanager-prod backup list
[{"name":"daily/2023-04-07","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:07+00:00"},{"name":"hourly/2023-04-07:11","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:06+00:00"},{"name":"latest","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:04+00:00"},{"name":"monthly/2023-04","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:10+00:00"},{"name":"weekly/2023-14","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:09+00:00"}]
```
2. Run `vmbackupmanager restore create` to create restore mark:
1. Run `vmbackupmanager restore create` to create restore mark:
- Use relative path to backup to restore from currently used remote storage:
```console
$ /vmbackupmanager-prod restore create daily/2023-04-07
@ -307,12 +307,12 @@ If restore mark doesn't exist at `storageDataPath`(restore wasn't requested) `vm
```console
$ /vmbackupmanager-prod restore create azblob://test1/vmbackupmanager/daily/2023-04-07
```
3. Stop `vmstorage` or `vmsingle` node
4. Run `vmbackupmanager restore` to restore backup:
1. Stop `vmstorage` or `vmsingle` node
1. Run `vmbackupmanager restore` to restore backup:
```console
$ /vmbackupmanager-prod restore -credsFilePath=credentials.json -storageDataPath=/vmstorage-data
```
5. Start `vmstorage` or `vmsingle` node
1. Start `vmstorage` or `vmsingle` node
### How to restore in Kubernetes
@ -326,13 +326,13 @@ If restore mark doesn't exist at `storageDataPath`(restore wasn't requested) `vm
enabled: "true"
```
See operator `VMStorage` schema [here](https://docs.victoriametrics.com/operator/api.html#vmstorage) and `VMSingle` [here](https://docs.victoriametrics.com/operator/api.html#vmsinglespec).
2. Enter container running `vmbackupmanager`
2. Use `vmbackupmanager backup list` to get list of available backups:
1. Enter container running `vmbackupmanager`
1. Use `vmbackupmanager backup list` to get list of available backups:
```console
$ /vmbackupmanager-prod backup list
[{"name":"daily/2023-04-07","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:07+00:00"},{"name":"hourly/2023-04-07:11","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:06+00:00"},{"name":"latest","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:04+00:00"},{"name":"monthly/2023-04","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:10+00:00"},{"name":"weekly/2023-14","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:09+00:00"}]
```
3. Use `vmbackupmanager restore create` to create restore mark:
1. Use `vmbackupmanager restore create` to create restore mark:
- Use relative path to backup to restore from currently used remote storage:
```console
$ /vmbackupmanager-prod restore create daily/2023-04-07
@ -341,7 +341,7 @@ If restore mark doesn't exist at `storageDataPath`(restore wasn't requested) `vm
```console
$ /vmbackupmanager-prod restore create azblob://test1/vmbackupmanager/daily/2023-04-07
```
4. Restart pod
1. Restart pod
#### Restore cluster into another cluster
@ -358,13 +358,13 @@ Clusters here are referred to as `source` and `destination`.
```
Note: it is safe to leave this section in the cluster configuration, since it will be ignored if restore mark doesn't exist.
> Important! Use different `-dst` for *destination* cluster to avoid overwriting backup data of the *source* cluster.
2. Enter container running `vmbackupmanager` in *source* cluster
2. Use `vmbackupmanager backup list` to get list of available backups:
1. Enter container running `vmbackupmanager` in *source* cluster
1. Use `vmbackupmanager backup list` to get list of available backups:
```console
$ /vmbackupmanager-prod backup list
[{"name":"daily/2023-04-07","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:07+00:00"},{"name":"hourly/2023-04-07:11","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:06+00:00"},{"name":"latest","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:04+00:00"},{"name":"monthly/2023-04","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:10+00:00"},{"name":"weekly/2023-14","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:09+00:00"}]
```
3. Use `vmbackupmanager restore create` to create restore mark at each pod of the *destination* cluster.
1. Use `vmbackupmanager restore create` to create restore mark at each pod of the *destination* cluster.
Each pod in *destination* cluster should be restored from backup of respective pod in *source* cluster.
For example: `vmstorage-destination-0` in the *destination* cluster should be restored from the backup of `vmstorage-source-0` in the *source* cluster.
```console
@ -423,9 +423,9 @@ command-line flags:
-dst string
The root folder of Victoria Metrics backups. Example: gs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default, only IPv4 TCP and UDP is used
Whether to enable IPv6 for listening and dialing. By default, only IPv4 TCP and UDP are used
-envflag.enable
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
Whether to enable reading flags from environment variables in addition to the command line. Command line flag values have priority over values from environment vars. Flags are read only from the command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
@ -447,9 +447,9 @@ command-line flags:
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
Password for HTTP server's Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
Username for HTTP server's Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
Address to listen for http connections (default ":8300")
-internStringCacheExpireDuration duration
@ -457,7 +457,7 @@ command-line flags:
-internStringDisableCache
Whether to disable caches for interned strings. This may reduce memory usage at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringCacheExpireDuration and -internStringMaxLen
-internStringMaxLen int
The maximum length for strings to intern. Lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringDisableCache and -internStringCacheExpireDuration (default 500)
The maximum length for strings to intern. A lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringDisableCache and -internStringCacheExpireDuration (default 500)
-keepLastDaily int
Keep last N daily backups. If 0 is specified next retention cycle removes all backups for given time period. (default -1)
-keepLastHourly int
@ -485,10 +485,10 @@ command-line flags:
-maxBytesPerSecond int
The maximum upload speed. There is no limit if it is set to 0
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-pprofAuthKey string

View file

@ -85,24 +85,24 @@ See `./vmctl opentsdb --help` for details and full list of flags.
OpenTSDB migration works like so:
1. Find metrics based on selected filters (or the default filter set `['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z']`)
1. Find metrics based on selected filters (or the default filter set `['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z']`):
- e.g. `curl -Ss "http://opentsdb:4242/api/suggest?type=metrics&q=sys"`
`curl -Ss "http://opentsdb:4242/api/suggest?type=metrics&q=sys"`
2. Find series associated with each returned metric
1. Find series associated with each returned metric:
- e.g. `curl -Ss "http://opentsdb:4242/api/search/lookup?m=system.load5&limit=1000000"`
`curl -Ss "http://opentsdb:4242/api/search/lookup?m=system.load5&limit=1000000"`
Here the `results` field in the response should not be empty. Otherwise, it means that meta tables are absent and need to be enabled first.
3. Download data for each series in chunks defined in the CLI switches
1. Download data for each series in chunks defined in the CLI switches:
- e.g. `-retention=sum-1m-avg:1h:90d` means
- `curl -Ss "http://opentsdb:4242/api/query?start=1h-ago&end=now&m=sum:1m-avg-none:system.load5\{host=host1\}"`
- `curl -Ss "http://opentsdb:4242/api/query?start=2h-ago&end=1h-ago&m=sum:1m-avg-none:system.load5\{host=host1\}"`
- `curl -Ss "http://opentsdb:4242/api/query?start=3h-ago&end=2h-ago&m=sum:1m-avg-none:system.load5\{host=host1\}"`
- ...
- `curl -Ss "http://opentsdb:4242/api/query?start=2160h-ago&end=2159h-ago&m=sum:1m-avg-none:system.load5\{host=host1\}"`
`-retention=sum-1m-avg:1h:90d` means:
- `curl -Ss "http://opentsdb:4242/api/query?start=1h-ago&end=now&m=sum:1m-avg-none:system.load5\{host=host1\}"`
- `curl -Ss "http://opentsdb:4242/api/query?start=2h-ago&end=1h-ago&m=sum:1m-avg-none:system.load5\{host=host1\}"`
- `curl -Ss "http://opentsdb:4242/api/query?start=3h-ago&end=2h-ago&m=sum:1m-avg-none:system.load5\{host=host1\}"`
- ...
- `curl -Ss "http://opentsdb:4242/api/query?start=2160h-ago&end=2159h-ago&m=sum:1m-avg-none:system.load5\{host=host1\}"`
This means we must stream data from OpenTSDB to VictoriaMetrics in chunks. This is where concurrency for OpenTSDB comes in. We can query multiple chunks at once, but we shouldn't run too many queries at a time to avoid overloading the OpenTSDB cluster.
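Putting these pieces together, a migration run may look roughly like the sketch below. Treat the flag names as assumptions and verify them against `./vmctl opentsdb --help` for your vmctl version:
```console
./vmctl opentsdb \
  --otsdb-addr=http://opentsdb:4242 \
  --otsdb-retentions=sum-1m-avg:1h:90d \
  --otsdb-filters=system \
  --vm-addr=http://victoriametrics:8428
```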
@ -131,7 +131,7 @@ Starting with a relatively simple retention string (`sum-1m-avg:1h:30d`), let's
There are two essential parts of a retention string:
1. [aggregation](#aggregation)
2. [windows/time ranges](#windows)
1. [windows/time ranges](#windows)
#### Aggregation
@ -163,7 +163,7 @@ We do not allow for defining the "null value" portion of the rollup window (e.g.
There are two important windows we define in a retention string:
1. the "chunk" range of each query
2. The time range we will be querying on with that "chunk"
1. The time range we will be querying on with that "chunk"
From our example, our windows are `1h:30d`.
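Putting the two parts together, the retention string from this example decomposes as follows (a plain breakdown of the string already shown above, not a new option):
```console
# sum-1m-avg : 1h : 30d
#  |            |    '-- total time range to migrate (the last 30 days)
#  |            '------- "chunk" range of each query (1h per request)
#  '-------------------- aggregation: sum, downsampled to 1m averages
-retention=sum-1m-avg:1h:30d
```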
@ -445,11 +445,11 @@ See `./vmctl remote-read --help` for details and full list of flags.
To start the migration process configure the following flags:
1. `--remote-read-src-addr` - data source address to read from;
2. `--vm-addr` - VictoriaMetrics address to write to. For single-node VM is usually equal to `--httpListenAddr`,
and for cluster version is equal to `--httpListenAddr` flag of vminsert component (for example `http://<vminsert>:8480/insert/<accountID>/prometheus`);
3. `--remote-read-filter-time-start` - the time filter in RFC3339 format to select time series with timestamp equal or higher than provided value. E.g. '2020-01-01T20:07:00Z';
4. `--remote-read-filter-time-end` - the time filter in RFC3339 format to select time series with timestamp equal or smaller than provided value. E.g. '2020-01-01T20:07:00Z'. Current time is used when omitted.;
5. `--remote-read-step-interval` - split export data into chunks. Valid values are `month, day, hour, minute`;
1. `--vm-addr` - VictoriaMetrics address to write to. For single-node VM it is usually equal to `--httpListenAddr`,
   and for the cluster version it is equal to the `--httpListenAddr` flag of the vminsert component (for example `http://<vminsert>:8480/insert/<accountID>/prometheus`);
1. `--remote-read-filter-time-start` - the time filter in RFC3339 format to select time series with a timestamp equal to or higher than the provided value. E.g. '2020-01-01T20:07:00Z';
1. `--remote-read-filter-time-end` - the time filter in RFC3339 format to select time series with a timestamp equal to or smaller than the provided value. E.g. '2020-01-01T20:07:00Z'. Current time is used when omitted;
1. `--remote-read-step-interval` - split export data into chunks. Valid values are `month, day, hour, minute`;
The importing process example for a local installation of Prometheus
and single-node VictoriaMetrics (`http://localhost:8428`):
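The command below is a minimal sketch assembled from the flags listed above; the source and destination addresses are placeholders rather than values taken from a real setup:
```console
./vmctl remote-read \
  --remote-read-src-addr=http://localhost:9090 \
  --remote-read-filter-time-start=2021-02-01T00:00:00Z \
  --remote-read-step-interval=day \
  --vm-addr=http://localhost:8428
```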
@ -516,7 +516,7 @@ and that you have a separate Thanos Store installation.
- url: http://victoria-metrics:8428/api/v1/write
```
2. Make sure VM is running, of course. Now check the logs to make sure that Prometheus is sending and VM is receiving.
1. Make sure VM is running, of course. Now check the logs to make sure that Prometheus is sending and VM is receiving.
In Prometheus, make sure there are no errors. On the VM side, you should see messages like this:
```
@ -524,7 +524,7 @@ and that you have a separate Thanos Store installation.
2020-04-27T18:38:46.506Z info VictoriaMetrics/lib/storage/partition.go:222 partition "2020_04" has been created
```
3. Now just wait. Within two hours, Prometheus should finish its current data file and hand it off to Thanos Store for long term
1. Now just wait. Within two hours, Prometheus should finish its current data file and hand it off to Thanos Store for long term
storage.
### Historical data
@ -627,7 +627,6 @@ and single-node VictoriaMetrics(`http://localhost:8428`):
--remote-read-src-addr=http://127.0.0.1:9009/prometheus \
--remote-read-filter-time-start=2021-10-18T00:00:00Z \
--remote-read-step-interval=hour \
--remote-read-src-check-alive=false \
--vm-addr=http://127.0.0.1:8428 \
--vm-concurrency=6
```
@ -691,7 +690,6 @@ and single-node VictoriaMetrics(`http://localhost:8428`):
--remote-read-src-addr=http://127.0.0.1:9009/prometheus \
--remote-read-filter-time-start=2021-10-18T00:00:00Z \
--remote-read-step-interval=hour \
--remote-read-src-check-alive=false \
--remote-read-headers=X-Scope-OrgID:demo \
--remote-read-use-stream=true \
--vm-addr=http://127.0.0.1:8428 \
@ -727,6 +725,9 @@ requires an Authentication header like `X-Scope-OrgID`. You can define it via th
## Migrating data from VictoriaMetrics
The simplest way to migrate data between VM instances is to copy data.
See more details [here](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#data-migration).
vmctl uses [native binary protocol](https://docs.victoriametrics.com/#how-to-export-data-in-native-format)
(available since [1.42.0 release](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.42.0))
to migrate data between VM instances: single to single, cluster to cluster, single to cluster and vice versa.
@ -735,7 +736,7 @@ See `./vmctl vm-native --help` for details and full list of flags.
Migration in `vm-native` mode takes two steps:
1. Explore the list of the metrics to migrate via `api/v1/label/__name__/values` API;
2. Migrate explored metrics one-by-one.
1. Migrate explored metrics one-by-one.
```
./vmctl vm-native \
@ -769,49 +770,54 @@ _To disable explore phase and switch to the old way of data migration via single
Importing tips:
1. Migrating big volumes of data may result in reaching the safety limits on `src` side.
Please verify that `-search.maxExportDuration` and `-search.maxExportSeries` were set with
proper values for `src`. If hitting the limits, follow the recommendations
[here](https://docs.victoriametrics.com/#how-to-export-data-in-native-format).
If hitting `the number of matching timeseries exceeds...` error, adjust filters to match less time series or
update `-search.maxSeries` command-line flag on vmselect/vmsingle;
2. Migrating all the metrics from one VM to another may collide with existing application metrics
(prefixed with `vm_`) at destination and lead to confusion when using
[official Grafana dashboards](https://grafana.com/orgs/victoriametrics/dashboards).
To avoid such situation try to filter out VM process metrics via `--vm-native-filter-match='{__name__!~"vm_.*"}'` flag.
3. Migrating data with overlapping time range or via unstable network can produce duplicates series at destination.
To avoid duplicates set `-dedup.minScrapeInterval=1ms` for `vmselect`/`vmstorage` at the destination.
This will instruct `vmselect`/`vmstorage` to ignore duplicates with identical timestamps.
4. When migrating large volumes of data use `--vm-native-step-interval` flag to split migration [into steps](#using-time-based-chunking-of-migration).
5. When migrating data from one VM cluster to another, consider using [cluster-to-cluster mode](#cluster-to-cluster-migration-mode).
Or manually specify addresses according to [URL format](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format):
```console
# Migrating from cluster specific tenantID to single
--vm-native-src-addr=http://<src-vmselect>:8481/select/0/prometheus
--vm-native-dst-addr=http://<dst-vmsingle>:8428
Please verify that `-search.maxExportDuration` and `-search.maxExportSeries` were set with
proper values for `src`. If hitting the limits, follow the recommendations
[here](https://docs.victoriametrics.com/#how-to-export-data-in-native-format).
   If the `the number of matching timeseries exceeds...` error occurs, adjust filters to match fewer time series or
   update the `-search.maxSeries` command-line flag on vmselect/vmsingle;
1. Migrating all the metrics from one VM to another may collide with existing application metrics
   (prefixed with `vm_`) at the destination and lead to confusion when using
   [official Grafana dashboards](https://grafana.com/orgs/victoriametrics/dashboards).
   To avoid such a situation, try to filter out VM process metrics via the `--vm-native-filter-match='{__name__!~"vm_.*"}'` flag.
1. Migrating data with an overlapping time range or over an unstable network can produce duplicate series at the destination.
   To avoid duplicates, set `-dedup.minScrapeInterval=1ms` for `vmselect`/`vmstorage` at the destination.
   This will instruct `vmselect`/`vmstorage` to ignore duplicates with identical timestamps.
1. When migrating large volumes of data use `--vm-native-step-interval` flag to split migration [into steps](#using-time-based-chunking-of-migration).
1. When migrating data from one VM cluster to another, consider using [cluster-to-cluster mode](#cluster-to-cluster-migration-mode).
Or manually specify addresses according to [URL format](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format):
```console
# Migrating from cluster specific tenantID to single
--vm-native-src-addr=http://<src-vmselect>:8481/select/0/prometheus
--vm-native-dst-addr=http://<dst-vmsingle>:8428
# Migrating from single to cluster specific tenantID
--vm-native-src-addr=http://<src-vmsingle>:8428
--vm-native-dst-addr=http://<dst-vminsert>:8480/insert/0/prometheus
# Migrating from single to cluster specific tenantID
--vm-native-src-addr=http://<src-vmsingle>:8428
--vm-native-dst-addr=http://<dst-vminsert>:8480/insert/0/prometheus
# Migrating single to single
--vm-native-src-addr=http://<src-vmsingle>:8428
--vm-native-dst-addr=http://<dst-vmsingle>:8428
# Migrating single to single
--vm-native-src-addr=http://<src-vmsingle>:8428
--vm-native-dst-addr=http://<dst-vmsingle>:8428
# Migrating cluster to cluster for specific tenant ID
--vm-native-src-addr=http://<src-vmselect>:8481/select/0/prometheus
--vm-native-dst-addr=http://<dst-vminsert>:8480/insert/0/prometheus
```
6. Migration speed can be adjusted via `--vm-concurrency` cmd-line flag, which controls the number of concurrent
workers busy with processing. Please note, that each worker can load up to a single vCPU core on VictoriaMetrics.
So try to set it according to allocated CPU resources of your VictoriaMetrics destination installation.
7. Migration is a backfilling process, so it is recommended to read
[Backfilling tips](https://github.com/VictoriaMetrics/VictoriaMetrics#backfilling) section.
8. `vmctl` doesn't provide relabeling or other types of labels management.
Instead, use [relabeling in VictoriaMetrics](https://github.com/VictoriaMetrics/vmctl/issues/4#issuecomment-683424375).
9. `vmctl` supports `--vm-native-src-headers` and `--vm-native-dst-headers` to define headers sent with each request
to the corresponding source address.
10. `vmctl` supports `--vm-native-disable-http-keep-alive` to allow `vmctl` to use non-persistent HTTP connections to avoid
error `use of closed network connection` when run a longer export.
# Migrating cluster to cluster for specific tenant ID
--vm-native-src-addr=http://<src-vmselect>:8481/select/0/prometheus
--vm-native-dst-addr=http://<dst-vminsert>:8480/insert/0/prometheus
```
1. Migrating data from a VM cluster which had replication (`-replicationFactor` > 1) enabled won't produce the same number
   of data copies in the destination database, and will only result in creating duplicates. To remove duplicates,
   the destination database needs to be configured with `-dedup.minScrapeInterval=1ms`. To restore the replication factor,
   the destination `vminsert` component needs to be configured with the corresponding `-replicationFactor` value.
   See more about replication [here](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#replication-and-data-safety).
1. Migration speed can be adjusted via the `--vm-concurrency` cmd-line flag, which controls the number of concurrent
   workers busy with processing. Please note that each worker can load up to a single vCPU core on VictoriaMetrics,
   so try to set it according to the allocated CPU resources of your VictoriaMetrics destination installation
   (see the combined sketch after this list).
1. Migration is a backfilling process, so it is recommended to read
[Backfilling tips](https://github.com/VictoriaMetrics/VictoriaMetrics#backfilling) section.
1. `vmctl` doesn't provide relabeling or other types of labels management.
Instead, use [relabeling in VictoriaMetrics](https://github.com/VictoriaMetrics/vmctl/issues/4#issuecomment-683424375).
1. `vmctl` supports `--vm-native-src-headers` and `--vm-native-dst-headers` to define headers sent with each request
to the corresponding source address.
1. `vmctl` supports `--vm-native-disable-http-keep-alive` to allow `vmctl` to use non-persistent HTTP connections and avoid
   the `use of closed network connection` error when running a long export.
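As a sketch of how several of the tips above combine for a cluster-to-cluster migration (the addresses are the placeholders from the cluster-to-cluster example above; the concurrency value is purely illustrative):
```console
./vmctl vm-native \
  --vm-native-src-addr=http://<src-vmselect>:8481/select/0/prometheus \
  --vm-native-dst-addr=http://<dst-vminsert>:8480/insert/0/prometheus \
  --vm-native-filter-match='{__name__!~"vm_.*"}' \
  --vm-concurrency=2
```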
### Using time-based chunking of migration
@ -1018,13 +1024,13 @@ It is recommended using [binary releases](https://github.com/VictoriaMetrics/Vic
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.
2. Run `make vmctl` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
1. Run `make vmctl` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmctl` binary and puts it into the `bin` folder.
### Production build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make vmctl-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
1. Run `make vmctl-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmctl-prod` binary and puts it into the `bin` folder.
### Building docker images
@ -1047,11 +1053,11 @@ ARM build may run on Raspberry Pi or on [energy-efficient ARM servers](https://b
#### Development ARM build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.
2. Run `make vmctl-linux-arm` or `make vmctl-linux-arm64` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
1. Run `make vmctl-linux-arm` or `make vmctl-linux-arm64` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmctl-linux-arm` or `vmctl-linux-arm64` binary respectively and puts it into the `bin` folder.
#### Production ARM build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make vmctl-linux-arm-prod` or `make vmctl-linux-arm64-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
1. Run `make vmctl-linux-arm-prod` or `make vmctl-linux-arm64-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmctl-linux-arm-prod` or `vmctl-linux-arm64-prod` binary respectively and puts it into the `bin` folder.

View file

@ -478,9 +478,10 @@ var (
Value: 1,
},
&cli.TimestampFlag{
Name: remoteReadFilterTimeStart,
Usage: "The time filter in RFC3339 format to select timeseries with timestamp equal or higher than provided value. E.g. '2020-01-01T20:07:00Z'",
Layout: time.RFC3339,
Name: remoteReadFilterTimeStart,
Usage: "The time filter in RFC3339 format to select timeseries with timestamp equal or higher than provided value. E.g. '2020-01-01T20:07:00Z'",
Layout: time.RFC3339,
Required: true,
},
&cli.TimestampFlag{
Name: remoteReadFilterTimeEnd,

View file

@ -339,23 +339,26 @@ func buildMatchWithFilter(filter string, metricName string) (string, error) {
if filter == metricName {
return filter, nil
}
nameFilter := fmt.Sprintf("__name__=%q", metricName)
labels, err := searchutils.ParseMetricSelector(filter)
tfss, err := searchutils.ParseMetricSelector(filter)
if err != nil {
return "", err
}
str := make([]string, 0, len(labels))
for _, label := range labels {
if len(label.Key) == 0 {
continue
var filters []string
for _, tfs := range tfss {
var a []string
for _, tf := range tfs {
if len(tf.Key) == 0 {
continue
}
a = append(a, tf.String())
}
str = append(str, label.String())
a = append(a, nameFilter)
filters = append(filters, strings.Join(a, ","))
}
nameFilter := fmt.Sprintf("__name__=%q", metricName)
str = append(str, nameFilter)
match := fmt.Sprintf("{%s}", strings.Join(str, ","))
match := "{" + strings.Join(filters, " or ") + "}"
return match, nil
}

View file

@ -332,9 +332,9 @@ The shortlist of configuration flags include the following:
-enable.rateLimit
enables rate limiter
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default, only IPv4 TCP and UDP is used
Whether to enable IPv6 for listening and dialing. By default, only IPv4 TCP and UDP are used
-envflag.enable
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
Whether to enable reading flags from environment variables in addition to the command line. Command line flag values have priority over values from environment vars. Flags are read only from the command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
@ -356,9 +356,9 @@ The shortlist of configuration flags include the following:
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
Password for HTTP server's Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
Username for HTTP server's Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
TCP address to listen for http connections. See also -httpListenAddr.useProxyProtocol (default ":8431")
-httpListenAddr.useProxyProtocol
@ -368,7 +368,7 @@ The shortlist of configuration flags include the following:
-internStringDisableCache
Whether to disable caches for interned strings. This may reduce memory usage at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringCacheExpireDuration and -internStringMaxLen
-internStringMaxLen int
The maximum length for strings to intern. Lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringDisableCache and -internStringCacheExpireDuration (default 500)
The maximum length for strings to intern. A lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringDisableCache and -internStringCacheExpireDuration (default 500)
-loggerDisableTimestamps
Whether to disable writing timestamps in logs
-loggerErrorsPerSecondLimit int
@ -386,10 +386,10 @@ The shortlist of configuration flags include the following:
-loggerWarnsPerSecondLimit int
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-pprofAuthKey string

View file

@ -34,11 +34,12 @@ func (ctx *InsertCtx) Reset(rowsLen int) {
}
ctx.Labels = ctx.Labels[:0]
for i := range ctx.mrs {
mr := &ctx.mrs[i]
mr.MetricNameRaw = nil
mrs := ctx.mrs
for i := range mrs {
cleanMetricRow(&mrs[i])
}
ctx.mrs = ctx.mrs[:0]
ctx.mrs = mrs[:0]
if n := rowsLen - cap(ctx.mrs); n > 0 {
ctx.mrs = append(ctx.mrs[:cap(ctx.mrs)], make([]storage.MetricRow, n)...)
}
@ -49,6 +50,10 @@ func (ctx *InsertCtx) Reset(rowsLen int) {
ctx.skipStreamAggr = false
}
func cleanMetricRow(mr *storage.MetricRow) {
mr.MetricNameRaw = nil
}
func (ctx *InsertCtx) marshalMetricNameRaw(prefix []byte, labels []prompb.Label) []byte {
start := len(ctx.metricNamesBuf)
ctx.metricNamesBuf = append(ctx.metricNamesBuf, prefix...)
@ -139,11 +144,13 @@ func (ctx *InsertCtx) ApplyRelabeling() {
func (ctx *InsertCtx) FlushBufs() error {
sas := sasGlobal.Load()
if sas != nil && !ctx.skipStreamAggr {
ctx.streamAggrCtx.push(ctx.mrs)
matchIdxs := matchIdxsPool.Get()
matchIdxs.B = ctx.streamAggrCtx.push(ctx.mrs, matchIdxs.B)
if !*streamAggrKeepInput {
ctx.Reset(0)
return nil
// Remove aggregated rows from ctx.mrs
ctx.dropAggregatedRows(matchIdxs.B)
}
matchIdxsPool.Put(matchIdxs)
}
// There is no need in limiting the number of concurrent calls to vmstorage.AddRows() here,
// since the number of concurrent FlushBufs() calls should be already limited via writeconcurrencylimiter
@ -158,3 +165,23 @@ func (ctx *InsertCtx) FlushBufs() error {
StatusCode: http.StatusServiceUnavailable,
}
}
func (ctx *InsertCtx) dropAggregatedRows(matchIdxs []byte) {
dst := ctx.mrs[:0]
src := ctx.mrs
if !*streamAggrDropInput {
for idx, match := range matchIdxs {
if match != 0 {
continue
}
dst = append(dst, src[idx])
}
}
tail := src[len(dst):]
for i := range tail {
cleanMetricRow(&tail[i])
}
ctx.mrs = dst
}
var matchIdxsPool bytesutil.ByteBufferPool

View file

@ -21,16 +21,19 @@ import (
var (
streamAggrConfig = flag.String("streamAggr.config", "", "Optional path to file with stream aggregation config. "+
"See https://docs.victoriametrics.com/stream-aggregation.html . "+
"See also -remoteWrite.streamAggr.keepInput and -streamAggr.dedupInterval")
streamAggrKeepInput = flag.Bool("streamAggr.keepInput", false, "Whether to keep input samples after the aggregation with -streamAggr.config. "+
"By default, the input is dropped after the aggregation, so only the aggregate data is stored. "+
"See https://docs.victoriametrics.com/stream-aggregation.html")
"See also -streamAggr.keepInput, -streamAggr.dropInput and -streamAggr.dedupInterval")
streamAggrKeepInput = flag.Bool("streamAggr.keepInput", false, "Whether to keep all the input samples after the aggregation with -streamAggr.config. "+
"By default, only aggregated samples are dropped, while the remaining samples are stored in the database. "+
"See also -streamAggr.dropInput and https://docs.victoriametrics.com/stream-aggregation.html")
streamAggrDropInput = flag.Bool("streamAggr.dropInput", false, "Whether to drop all the input samples after the aggregation with -streamAggr.config. "+
"By default, only aggregated samples are dropped, while the remaining samples are stored in the database. "+
"See also -streamAggr.keepInput and https://docs.victoriametrics.com/stream-aggregation.html")
streamAggrDedupInterval = flag.Duration("streamAggr.dedupInterval", 0, "Input samples are de-duplicated with this interval before being aggregated. "+
"Only the last sample per each time series per each interval is aggregated if the interval is greater than zero")
)
var (
saCfgReloaderStopCh = make(chan struct{})
saCfgReloaderStopCh chan struct{}
saCfgReloaderWG sync.WaitGroup
saCfgReloads = metrics.NewCounter(`vminsert_streamagg_config_reloads_total`)
@ -59,6 +62,7 @@ func CheckStreamAggrConfig() error {
//
// MustStopStreamAggr must be called when stream aggr is no longer needed.
func InitStreamAggr() {
saCfgReloaderStopCh = make(chan struct{})
if *streamAggrConfig == "" {
return
}
@ -132,14 +136,20 @@ func (ctx *streamAggrCtx) Reset() {
promrelabel.CleanLabels(ts.Labels)
}
func (ctx *streamAggrCtx) push(mrs []storage.MetricRow) {
func (ctx *streamAggrCtx) push(mrs []storage.MetricRow, matchIdxs []byte) []byte {
matchIdxs = bytesutil.ResizeNoCopyMayOverallocate(matchIdxs, len(mrs))
for i := 0; i < len(matchIdxs); i++ {
matchIdxs[i] = 0
}
mn := &ctx.mn
tss := ctx.tss[:]
ts := &tss[0]
labels := ts.Labels
samples := ts.Samples
sas := sasGlobal.Load()
for _, mr := range mrs {
var matchIdxsLocal []byte
for idx, mr := range mrs {
if err := mn.UnmarshalRaw(mr.MetricNameRaw); err != nil {
logger.Panicf("BUG: cannot unmarshal recently marshaled MetricName: %s", err)
}
@ -163,8 +173,13 @@ func (ctx *streamAggrCtx) push(mrs []storage.MetricRow) {
ts.Labels = labels
ts.Samples = samples
sas.Push(tss)
matchIdxsLocal = sas.Push(tss, matchIdxsLocal)
if matchIdxsLocal[0] != 0 {
matchIdxs[idx] = 1
}
}
return matchIdxs
}
func pushAggregateSeries(tss []prompbmarshal.TimeSeries) {

View file

@ -69,7 +69,7 @@ var (
configTimestamp = metrics.NewCounter(`vm_relabel_config_last_reload_success_timestamp_seconds`)
)
var pcsGlobal atomic.Value
var pcsGlobal atomic.Pointer[promrelabel.ParsedConfigs]
// CheckRelabelConfig checks config pointed by -relabelConfig
func CheckRelabelConfig() error {
@ -90,7 +90,7 @@ func loadRelabelConfig() (*promrelabel.ParsedConfigs, error) {
// HasRelabeling returns true if there is global relabeling.
func HasRelabeling() bool {
pcs := pcsGlobal.Load().(*promrelabel.ParsedConfigs)
pcs := pcsGlobal.Load()
return pcs.Len() > 0 || *usePromCompatibleNaming
}
@ -110,7 +110,7 @@ func (ctx *Ctx) Reset() {
//
// The returned labels are valid until the next call to ApplyRelabeling.
func (ctx *Ctx) ApplyRelabeling(labels []prompb.Label) []prompb.Label {
pcs := pcsGlobal.Load().(*promrelabel.ParsedConfigs)
pcs := pcsGlobal.Load()
if pcs.Len() == 0 && !*usePromCompatibleNaming {
// There are no relabeling rules.
return labels

View file

@ -94,9 +94,9 @@ i.e. the end result would be similar to [rsync --delete](https://askubuntu.com/q
-customS3Endpoint string
Custom S3 endpoint for use with S3-compatible storages (e.g. MinIO). S3 is used if not set
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default, only IPv4 TCP and UDP is used
Whether to enable IPv6 for listening and dialing. By default, only IPv4 TCP and UDP are used
-envflag.enable
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
Whether to enable reading flags from environment variables in addition to the command line. Command line flag values have priority over values from environment vars. Flags are read only from the command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
@ -118,9 +118,9 @@ i.e. the end result would be similar to [rsync --delete](https://askubuntu.com/q
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
Password for HTTP server's Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
Username for HTTP server's Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
TCP address for exporting metrics at /metrics page (default ":8421")
-internStringCacheExpireDuration duration
@ -128,7 +128,7 @@ i.e. the end result would be similar to [rsync --delete](https://askubuntu.com/q
-internStringDisableCache
Whether to disable caches for interned strings. This may reduce memory usage at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringCacheExpireDuration and -internStringMaxLen
-internStringMaxLen int
The maximum length for strings to intern. Lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringDisableCache and -internStringCacheExpireDuration (default 500)
The maximum length for strings to intern. A lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringDisableCache and -internStringCacheExpireDuration (default 500)
-loggerDisableTimestamps
Whether to disable writing timestamps in logs
-loggerErrorsPerSecondLimit int
@ -149,10 +149,10 @@ i.e. the end result would be similar to [rsync --delete](https://askubuntu.com/q
The maximum download speed. There is no limit if it is set to 0
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-pprofAuthKey string
@ -202,13 +202,13 @@ It is recommended using [binary releases](https://github.com/VictoriaMetrics/Vic
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.
2. Run `make vmrestore` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
1. Run `make vmrestore` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmrestore` binary and puts it into the `bin` folder.
### Production build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make vmrestore-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
1. Run `make vmrestore-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmrestore-prod` binary and puts it into the `bin` folder.
### Building docker images

View file

@ -138,9 +138,7 @@ func registerMetrics(startTime time.Time, w http.ResponseWriter, r *http.Request
mr.MetricNameRaw = storage.MarshalMetricNameRaw(mr.MetricNameRaw[:0], labels)
mr.Timestamp = ct
}
if err := vmstorage.RegisterMetricNames(nil, mrs); err != nil {
return fmt.Errorf("cannot register paths: %w", err)
}
vmstorage.RegisterMetricNames(nil, mrs)
// Return response
contentType := "text/plain; charset=utf-8"

View file

@ -328,7 +328,8 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
return true
case "/api/v1/status/active_queries":
statusActiveQueriesRequests.Inc()
promql.WriteActiveQueries(w)
httpserver.EnableCORS(w, r)
promql.ActiveQueriesHandler(w, r)
return true
case "/api/v1/status/top_queries":
topQueriesRequests.Inc()

View file

@ -1005,15 +1005,15 @@ func getMaxLookback(r *http.Request) (int64, error) {
}
func getTagFilterssFromMatches(matches []string) ([][]storage.TagFilter, error) {
tagFilterss := make([][]storage.TagFilter, 0, len(matches))
tfss := make([][]storage.TagFilter, 0, len(matches))
for _, match := range matches {
tagFilters, err := searchutils.ParseMetricSelector(match)
tfssLocal, err := searchutils.ParseMetricSelector(match)
if err != nil {
return nil, fmt.Errorf("cannot parse matches[]=%s: %w", match, err)
}
tagFilterss = append(tagFilterss, tagFilters)
tfss = append(tfss, tfssLocal...)
}
return tagFilterss, nil
return tfss, nil
}
func getRoundDigits(r *http.Request) int {

View file

@ -2,27 +2,34 @@ package promql
import (
"fmt"
"io"
"net/http"
"sort"
"sync"
"sync/atomic"
"time"
)
// WriteActiveQueries writes active queries to w.
// ActiveQueriesHandler returns response to /api/v1/status/active_queries
//
// The written active queries are sorted in descending order of their execution duration.
func WriteActiveQueries(w io.Writer) {
// It writes a JSON with active queries to w.
func ActiveQueriesHandler(w http.ResponseWriter, r *http.Request) {
aqes := activeQueriesV.GetAll()
w.Header().Set("Content-Type", "application/json")
sort.Slice(aqes, func(i, j int) bool {
return aqes[i].startTime.Sub(aqes[j].startTime) < 0
})
now := time.Now()
for _, aqe := range aqes {
fmt.Fprintf(w, `{"status":"ok","data":[`)
for i, aqe := range aqes {
d := now.Sub(aqe.startTime)
fmt.Fprintf(w, "\tduration: %.3fs, id=%016X, remote_addr=%s, query=%q, start=%d, end=%d, step=%d\n",
fmt.Fprintf(w, `{"duration":"%.3fs","id":"%016X","remote_addr":%s,"query":%q,"start":%d,"end":%d,"step":%d}`,
d.Seconds(), aqe.qid, aqe.quotedRemoteAddr, aqe.q, aqe.start, aqe.end, aqe.step)
if i+1 < len(aqes) {
fmt.Fprintf(w, `,`)
}
}
fmt.Fprintf(w, `]}`)
}
var activeQueriesV = newActiveQueries()

View file

@ -157,6 +157,10 @@ func adjustBinaryOpTags(be *metricsql.BinaryOpExpr, left, right []*timeseries) (
groupOp = "ignoring"
}
groupTags := be.GroupModifier.Args
if be.KeepMetricNames && groupOp == "on" {
// Add __name__ to groupTags if metric name must be preserved.
groupTags = append(groupTags[:len(groupTags):len(groupTags)], "__name__")
}
for k, tssLeft := range mLeft {
tssRight := mRight[k]
if len(tssRight) == 0 {
@ -221,6 +225,14 @@ func ensureSingleTimeseries(side string, be *metricsql.BinaryOpExpr, tss []*time
func groupJoin(singleTimeseriesSide string, be *metricsql.BinaryOpExpr, rvsLeft, rvsRight, tssLeft, tssRight []*timeseries) ([]*timeseries, []*timeseries, error) {
joinTags := be.JoinModifier.Args
var skipTags []string
if strings.ToLower(be.GroupModifier.Op) == "on" {
skipTags = be.GroupModifier.Args
}
joinPrefix := ""
if be.JoinModifierPrefix != nil {
joinPrefix = be.JoinModifierPrefix.S
}
type tsPair struct {
left *timeseries
right *timeseries
@ -230,7 +242,7 @@ func groupJoin(singleTimeseriesSide string, be *metricsql.BinaryOpExpr, rvsLeft,
resetMetricGroupIfRequired(be, tsLeft)
if len(tssRight) == 1 {
// Easy case - right part contains only a single matching time series.
tsLeft.MetricName.SetTags(joinTags, &tssRight[0].MetricName)
tsLeft.MetricName.SetTags(joinTags, joinPrefix, skipTags, &tssRight[0].MetricName)
rvsLeft = append(rvsLeft, tsLeft)
rvsRight = append(rvsRight, tssRight[0])
continue
@ -245,7 +257,7 @@ func groupJoin(singleTimeseriesSide string, be *metricsql.BinaryOpExpr, rvsLeft,
for _, tsRight := range tssRight {
var tsCopy timeseries
tsCopy.CopyFromShallowTimestamps(tsLeft)
tsCopy.MetricName.SetTags(joinTags, &tsRight.MetricName)
tsCopy.MetricName.SetTags(joinTags, joinPrefix, skipTags, &tsRight.MetricName)
bb.B = marshalMetricTagsSorted(bb.B[:0], &tsCopy.MetricName)
pair, ok := m[string(bb.B)]
if !ok {
@ -315,6 +327,12 @@ func resetMetricGroupIfRequired(be *metricsql.BinaryOpExpr, ts *timeseries) {
// Do not reset MetricGroup for non-boolean `compare` binary ops like Prometheus does.
return
}
if be.KeepMetricNames {
// Do not reset MetricGroup if it is explicitly requested via `a op b keep_metric_names`
// See https://docs.victoriametrics.com/MetricsQL.html#keep_metric_names
return
}
ts.MetricName.ResetMetricGroup()
}

View file

@ -1075,14 +1075,17 @@ func evalRollupFuncWithMetricExpr(qt *querytracer.Tracer, ec *EvalConfig, funcNa
}
// Fetch the remaining part of the result.
tfs := searchutils.ToTagFilters(me.LabelFilters)
tfss := searchutils.JoinTagFilterss([][]storage.TagFilter{tfs}, ec.EnforcedTagFilterss)
tfss := searchutils.ToTagFilterss(me.LabelFilterss)
tfss = searchutils.JoinTagFilterss(tfss, ec.EnforcedTagFilterss)
minTimestamp := start - maxSilenceInterval
if window > ec.Step {
minTimestamp -= window
} else {
minTimestamp -= ec.Step
}
if minTimestamp < 0 {
minTimestamp = 0
}
sq := storage.NewSearchQuery(minTimestamp, ec.End, tfss, ec.MaxSeries)
rss, err := netstorage.ProcessSearchQuery(qt, sq, ec.Deadline)
if err != nil {

View file

@ -29,8 +29,9 @@ func TestGetCommonLabelFilters(t *testing.T) {
tss = append(tss, &ts)
}
lfs := getCommonLabelFilters(tss)
me := &metricsql.MetricExpr{
LabelFilters: lfs,
var me metricsql.MetricExpr
if len(lfs) > 0 {
me.LabelFilterss = [][]metricsql.LabelFilter{lfs}
}
lfsMarshaled := me.AppendString(nil)
if string(lfsMarshaled) != lfsExpected {
@ -40,7 +41,7 @@ func TestGetCommonLabelFilters(t *testing.T) {
f(``, `{}`)
f(`m 1`, `{}`)
f(`m{a="b"} 1`, `{a="b"}`)
f(`m{c="d",a="b"} 1`, `{a="b", c="d"}`)
f(`m{c="d",a="b"} 1`, `{a="b",c="d"}`)
f(`m1{a="foo"} 1
m2{a="bar"} 1`, `{a=~"bar|foo"}`)
f(`m1{a="foo"} 1

View file

@ -277,10 +277,12 @@ func escapeDotsInRegexpLabelFilters(e metricsql.Expr) metricsql.Expr {
if !ok {
return
}
for i := range me.LabelFilters {
f := &me.LabelFilters[i]
if f.IsRegexp {
f.Value = escapeDots(f.Value)
for _, lfs := range me.LabelFilterss {
for i := range lfs {
f := &lfs[i]
if f.IsRegexp {
f.Value = escapeDots(f.Value)
}
}
}
})

View file

@ -45,7 +45,7 @@ func TestEscapeDotsInRegexpLabelFilters(t *testing.T) {
f("2", "2")
f(`foo.bar + 123`, `foo.bar + 123`)
f(`foo{bar=~"baz.xx.yyy"}`, `foo{bar=~"baz\\.xx\\.yyy"}`)
f(`foo(a.b{c="d.e",x=~"a.b.+[.a]",y!~"aaa.bb|cc.dd"}) + x.y(1,sum({x=~"aa.bb"}))`, `foo(a.b{c="d.e", x=~"a\\.b.+[\\.a]", y!~"aaa\\.bb|cc\\.dd"}) + x.y(1, sum({x=~"aa\\.bb"}))`)
f(`foo(a.b{c="d.e",x=~"a.b.+[.a]",y!~"aaa.bb|cc.dd"}) + x.y(1,sum({x=~"aa.bb"}))`, `foo(a.b{c="d.e",x=~"a\\.b.+[\\.a]",y!~"aaa\\.bb|cc\\.dd"}) + x.y(1, sum({x=~"aa\\.bb"}))`)
}
func TestExecSuccess(t *testing.T) {
@ -3003,6 +3003,32 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r1, r2}
f(q, resultExpected)
})
t.Run(`vector / scalar keep_metric_names`, func(t *testing.T) {
t.Parallel()
q := `sort_desc(((label_set(time(), "foo", "bar", "__name__", "q1") or label_set(10, "foo", "qwert", "__name__", "q2")) / 2) keep_metric_names)`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{500, 600, 700, 800, 900, 1000},
Timestamps: timestampsExpected,
}
r1.MetricName.MetricGroup = []byte("q1")
r1.MetricName.Tags = []storage.Tag{{
Key: []byte("foo"),
Value: []byte("bar"),
}}
r2 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{5, 5, 5, 5, 5, 5},
Timestamps: timestampsExpected,
}
r2.MetricName.MetricGroup = []byte("q2")
r2.MetricName.Tags = []storage.Tag{{
Key: []byte("foo"),
Value: []byte("qwert"),
}}
resultExpected := []netstorage.Result{r1, r2}
f(q, resultExpected)
})
t.Run(`vector * scalar`, func(t *testing.T) {
t.Parallel()
q := `sum(time()) * 2`
@ -3038,6 +3064,32 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r1, r2}
f(q, resultExpected)
})
t.Run(`scalar * vector keep_metric_names`, func(t *testing.T) {
t.Parallel()
q := `sort_desc(2 * (label_set(time(), "foo", "bar", "__name__", "q1"), label_set(10, "foo", "qwert", "__name__", "q2")) keep_metric_names)`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{2000, 2400, 2800, 3200, 3600, 4000},
Timestamps: timestampsExpected,
}
r1.MetricName.MetricGroup = []byte("q1")
r1.MetricName.Tags = []storage.Tag{{
Key: []byte("foo"),
Value: []byte("bar"),
}}
r2 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{20, 20, 20, 20, 20, 20},
Timestamps: timestampsExpected,
}
r2.MetricName.MetricGroup = []byte("q2")
r2.MetricName.Tags = []storage.Tag{{
Key: []byte("foo"),
Value: []byte("qwert"),
}}
resultExpected := []netstorage.Result{r1, r2}
f(q, resultExpected)
})
t.Run(`scalar * on() group_right vector`, func(t *testing.T) {
t.Parallel()
q := `sort_desc(2 * on() group_right() (label_set(time(), "foo", "bar") or label_set(10, "foo", "qwert")))`
@ -3062,6 +3114,32 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r1, r2}
f(q, resultExpected)
})
t.Run(`scalar * on() group_right vector keep_metric_names`, func(t *testing.T) {
t.Parallel()
q := `sort_desc(2 * on() group_right() (label_set(time(), "foo", "bar", "__name__", "q1"), label_set(10, "foo", "qwert", "__name__", "q2")) keep_metric_names)`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{2000, 2400, 2800, 3200, 3600, 4000},
Timestamps: timestampsExpected,
}
r1.MetricName.MetricGroup = []byte("q1")
r1.MetricName.Tags = []storage.Tag{{
Key: []byte("foo"),
Value: []byte("bar"),
}}
r2 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{20, 20, 20, 20, 20, 20},
Timestamps: timestampsExpected,
}
r2.MetricName.MetricGroup = []byte("q2")
r2.MetricName.Tags = []storage.Tag{{
Key: []byte("foo"),
Value: []byte("qwert"),
}}
resultExpected := []netstorage.Result{r1, r2}
f(q, resultExpected)
})
t.Run(`scalar * ignoring(foo) group_right vector`, func(t *testing.T) {
t.Parallel()
q := `sort_desc(label_set(2, "a", "2") * ignoring(foo,a) group_right(a) (label_set(time(), "foo", "bar", "a", "1"), label_set(10, "foo", "qwert")))`
@ -3147,6 +3225,27 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`vector * on(foo) scalar keep_metric_names`, func(t *testing.T) {
t.Parallel()
q := `((
label_set(time(), "foo", "bar", "xx", "yy", "__name__", "q1"),
label_set(10, "foo", "qwert", "__name__", "q2")
) * on(foo) label_set(2, "foo","bar","aa","bb", "__name__", "q2")) keep_metric_names`
r := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{2000, 2400, 2800, 3200, 3600, 4000},
Timestamps: timestampsExpected,
}
r.MetricName.MetricGroup = []byte("q1")
r.MetricName.Tags = []storage.Tag{
{
Key: []byte("foo"),
Value: []byte("bar"),
},
}
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`vector * on(foo) group_left(additional_tag) duplicate_timeseries_differ_by_additional_tag`, func(t *testing.T) {
t.Parallel()
q := `sort(label_set(time()/10, "foo", "bar", "xx", "yy", "__name__", "qwert") + on(foo) group_left(op) (
@ -3375,6 +3474,26 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`vector + vector partial matching keep_metric_names`, func(t *testing.T) {
t.Parallel()
q := `(
(label_set(time(), "t1", "v1", "__name__", "q1") or label_set(10, "t2", "v2", "__name__", "q2"))
+
(label_set(100, "t1", "v1", "__name__", "q3") or label_set(time(), "t2", "v3"))
) keep_metric_names`
r := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1100, 1300, 1500, 1700, 1900, 2100},
Timestamps: timestampsExpected,
}
r.MetricName.MetricGroup = []byte("q1")
r.MetricName.Tags = []storage.Tag{{
Key: []byte("t1"),
Value: []byte("v1"),
}}
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`vector + vector no matching`, func(t *testing.T) {
t.Parallel()
q := `sort_desc(
@ -3450,6 +3569,102 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r1, r2}
f(q, resultExpected)
})
t.Run(`vector + vector on group_left(*)`, func(t *testing.T) {
t.Parallel()
q := `sort_desc(
(label_set(time(), "t1", "v123", "t2", "v3"), label_set(10, "t2", "v3", "xxx", "yy"))
+ on (foo, t2) group_left (*)
(label_set(100, "t1", "v1"), label_set(time(), "t2", "v3", "noxxx", "aa"))
)`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{2000, 2400, 2800, 3200, 3600, 4000},
Timestamps: timestampsExpected,
}
r1.MetricName.Tags = []storage.Tag{
{
Key: []byte("noxxx"),
Value: []byte("aa"),
},
{
Key: []byte("t1"),
Value: []byte("v123"),
},
{
Key: []byte("t2"),
Value: []byte("v3"),
},
}
r2 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1010, 1210, 1410, 1610, 1810, 2010},
Timestamps: timestampsExpected,
}
r2.MetricName.Tags = []storage.Tag{
{
Key: []byte("noxxx"),
Value: []byte("aa"),
},
{
Key: []byte("t2"),
Value: []byte("v3"),
},
{
Key: []byte("xxx"),
Value: []byte("yy"),
},
}
resultExpected := []netstorage.Result{r1, r2}
f(q, resultExpected)
})
t.Run(`vector + vector on group_left(*) prefix`, func(t *testing.T) {
t.Parallel()
q := `sort_desc(
(label_set(time(), "t1", "v123", "t2", "v3"), label_set(10, "t2", "v3", "xxx", "yy"))
+ on (foo, t2) group_left (*) prefix "abc_"
(label_set(100, "t1", "v1"), label_set(time(), "t2", "v3", "noxxx", "aa"))
)`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{2000, 2400, 2800, 3200, 3600, 4000},
Timestamps: timestampsExpected,
}
r1.MetricName.Tags = []storage.Tag{
{
Key: []byte("abc_noxxx"),
Value: []byte("aa"),
},
{
Key: []byte("t1"),
Value: []byte("v123"),
},
{
Key: []byte("t2"),
Value: []byte("v3"),
},
}
r2 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1010, 1210, 1410, 1610, 1810, 2010},
Timestamps: timestampsExpected,
}
r2.MetricName.Tags = []storage.Tag{
{
Key: []byte("abc_noxxx"),
Value: []byte("aa"),
},
{
Key: []byte("t2"),
Value: []byte("v3"),
},
{
Key: []byte("xxx"),
Value: []byte("yy"),
},
}
resultExpected := []netstorage.Result{r1, r2}
f(q, resultExpected)
})
t.Run(`vector + vector on group_left (__name__)`, func(t *testing.T) {
t.Parallel()
q := `sort_desc(

View file

@ -34,7 +34,7 @@ func IsMetricSelectorWithRollup(s string) (childQuery string, window, offset *me
return
}
me, ok := re.Expr.(*metricsql.MetricExpr)
if !ok || len(me.LabelFilters) == 0 {
if !ok || len(me.LabelFilterss) == 0 {
return
}
wrappedQuery := me.AppendString(nil)

View file

@ -63,7 +63,7 @@ func checkRollupResultCacheReset() {
for {
time.Sleep(checkRollupResultCacheResetInterval)
if atomic.SwapUint32(&needRollupResultCacheReset, 0) > 0 {
mr := rollupResultResetMetricRowSample.Load().(*storage.MetricRow)
mr := rollupResultResetMetricRowSample.Load()
d := int64(fasttime.UnixTimestamp()*1000) - mr.Timestamp - cacheTimestampOffset.Milliseconds()
logger.Warnf("resetting rollup result cache because the metric %s has a timestamp older than -search.cacheTimestampOffset=%s by %.3fs",
mr.String(), cacheTimestampOffset, float64(d)/1e3)
@ -76,7 +76,7 @@ const checkRollupResultCacheResetInterval = 5 * time.Second
var needRollupResultCacheReset uint32
var checkRollupResultCacheResetOnce sync.Once
var rollupResultResetMetricRowSample atomic.Value
var rollupResultResetMetricRowSample atomic.Pointer[storage.MetricRow]
var rollupResultCacheV = &rollupResultCache{
c: workingsetcache.New(1024 * 1024), // This is a cache for testing.

View file

@ -41,10 +41,14 @@ func TestRollupResultCache(t *testing.T) {
MayCache: true,
}
me := &metricsql.MetricExpr{
LabelFilters: []metricsql.LabelFilter{{
Label: "aaa",
Value: "xxx",
}},
LabelFilterss: [][]metricsql.LabelFilter{
{
{
Label: "aaa",
Value: "xxx",
},
},
},
}
fe := &metricsql.FuncExpr{
Name: "foo",

View file

@ -240,7 +240,11 @@ func getAbsentTimeseries(ec *EvalConfig, arg metricsql.Expr) []*timeseries {
if !ok {
return rvs
}
tfs := searchutils.ToTagFilters(me.LabelFilters)
tfss := searchutils.ToTagFilterss(me.LabelFilterss)
if len(tfss) != 1 {
return rvs
}
tfs := tfss[0]
for i := range tfs {
tf := &tfs[i]
if len(tf.Key) == 0 {

View file

@ -140,12 +140,14 @@ func GetExtraTagFilters(r *http.Request) ([][]storage.TagFilter, error) {
}
var etfs [][]storage.TagFilter
for _, extraFilter := range extraFilters {
tfs, err := ParseMetricSelector(extraFilter)
tfss, err := ParseMetricSelector(extraFilter)
if err != nil {
return nil, fmt.Errorf("cannot parse extra_filters=%s: %w", extraFilter, err)
}
tfs = append(tfs, tagFilters...)
etfs = append(etfs, tfs)
for i := range tfss {
tfss[i] = append(tfss[i], tagFilters...)
}
etfs = append(etfs, tfss...)
}
return etfs, nil
}
@ -170,7 +172,7 @@ func JoinTagFilterss(src, etfs [][]storage.TagFilter) [][]storage.TagFilter {
}
// ParseMetricSelector parses s containing PromQL metric selector and returns the corresponding LabelFilters.
func ParseMetricSelector(s string) ([]storage.TagFilter, error) {
func ParseMetricSelector(s string) ([][]storage.TagFilter, error) {
expr, err := metricsql.Parse(s)
if err != nil {
return nil, err
@ -179,20 +181,24 @@ func ParseMetricSelector(s string) ([]storage.TagFilter, error) {
if !ok {
return nil, fmt.Errorf("expecting metricSelector; got %q", expr.AppendString(nil))
}
if len(me.LabelFilters) == 0 {
return nil, fmt.Errorf("labelFilters cannot be empty")
if len(me.LabelFilterss) == 0 {
return nil, fmt.Errorf("labelFilterss cannot be empty")
}
tfs := ToTagFilters(me.LabelFilters)
return tfs, nil
tfss := ToTagFilterss(me.LabelFilterss)
return tfss, nil
}
// ToTagFilters converts lfs to a slice of storage.TagFilter
func ToTagFilters(lfs []metricsql.LabelFilter) []storage.TagFilter {
tfs := make([]storage.TagFilter, len(lfs))
for i := range lfs {
toTagFilter(&tfs[i], &lfs[i])
// ToTagFilterss converts lfss to or-delimited slices of storage.TagFilter
func ToTagFilterss(lfss [][]metricsql.LabelFilter) [][]storage.TagFilter {
tfss := make([][]storage.TagFilter, len(lfss))
for i, lfs := range lfss {
tfs := make([]storage.TagFilter, len(lfs))
for j := range lfs {
toTagFilter(&tfs[j], &lfs[j])
}
tfss[i] = tfs
}
return tfs
return tfss
}
func toTagFilter(dst *storage.TagFilter, src *metricsql.LabelFilter) {

View file

@ -135,69 +135,69 @@ func TestJoinTagFilterss(t *testing.T) {
}
}
// Single tag filter
f(t, [][]storage.TagFilter{
f(t, joinTagFilters(
mustParseMetricSelector(`{k1="v1",k2=~"v2",k3!="v3",k4!~"v4"}`),
}, nil, []string{
), nil, []string{
`{k1="v1",k2=~"v2",k3!="v3",k4!~"v4"}`,
})
// Miltiple tag filters
f(t, [][]storage.TagFilter{
f(t, joinTagFilters(
mustParseMetricSelector(`{k1="v1",k2=~"v2",k3!="v3",k4!~"v4"}`),
mustParseMetricSelector(`{k5=~"v5"}`),
}, nil, []string{
), nil, []string{
`{k1="v1",k2=~"v2",k3!="v3",k4!~"v4"}`,
`{k5=~"v5"}`,
})
// Single extra filter
f(t, nil, [][]storage.TagFilter{
f(t, nil, joinTagFilters(
mustParseMetricSelector(`{k1="v1",k2=~"v2",k3!="v3",k4!~"v4"}`),
}, []string{
), []string{
`{k1="v1",k2=~"v2",k3!="v3",k4!~"v4"}`,
})
// Multiple extra filters
f(t, nil, [][]storage.TagFilter{
f(t, nil, joinTagFilters(
mustParseMetricSelector(`{k1="v1",k2=~"v2",k3!="v3",k4!~"v4"}`),
mustParseMetricSelector(`{k5=~"v5"}`),
}, []string{
), []string{
`{k1="v1",k2=~"v2",k3!="v3",k4!~"v4"}`,
`{k5=~"v5"}`,
})
// Single tag filter and a single extra filter
f(t, [][]storage.TagFilter{
f(t, joinTagFilters(
mustParseMetricSelector(`{k1="v1",k2=~"v2",k3!="v3",k4!~"v4"}`),
}, [][]storage.TagFilter{
), joinTagFilters(
mustParseMetricSelector(`{k5=~"v5"}`),
}, []string{
), []string{
`{k1="v1",k2=~"v2",k3!="v3",k4!~"v4",k5=~"v5"}`,
})
// Multiple tag filters and a single extra filter
f(t, [][]storage.TagFilter{
f(t, joinTagFilters(
mustParseMetricSelector(`{k1="v1",k2=~"v2",k3!="v3",k4!~"v4"}`),
mustParseMetricSelector(`{k5=~"v5"}`),
}, [][]storage.TagFilter{
), joinTagFilters(
mustParseMetricSelector(`{k6=~"v6"}`),
}, []string{
), []string{
`{k1="v1",k2=~"v2",k3!="v3",k4!~"v4",k6=~"v6"}`,
`{k5=~"v5",k6=~"v6"}`,
})
// Single tag filter and multiple extra filters
f(t, [][]storage.TagFilter{
f(t, joinTagFilters(
mustParseMetricSelector(`{k1="v1",k2=~"v2",k3!="v3",k4!~"v4"}`),
}, [][]storage.TagFilter{
), joinTagFilters(
mustParseMetricSelector(`{k5=~"v5"}`),
mustParseMetricSelector(`{k6=~"v6"}`),
}, []string{
), []string{
`{k1="v1",k2=~"v2",k3!="v3",k4!~"v4",k5=~"v5"}`,
`{k1="v1",k2=~"v2",k3!="v3",k4!~"v4",k6=~"v6"}`,
})
// Multiple tag filters and multiple extra filters
f(t, [][]storage.TagFilter{
f(t, joinTagFilters(
mustParseMetricSelector(`{k1="v1",k2=~"v2",k3!="v3",k4!~"v4"}`),
mustParseMetricSelector(`{k5=~"v5"}`),
}, [][]storage.TagFilter{
), joinTagFilters(
mustParseMetricSelector(`{k6=~"v6"}`),
mustParseMetricSelector(`{k7=~"v7"}`),
}, []string{
), []string{
`{k1="v1",k2=~"v2",k3!="v3",k4!~"v4",k6=~"v6"}`,
`{k1="v1",k2=~"v2",k3!="v3",k4!~"v4",k7=~"v7"}`,
`{k5=~"v5",k6=~"v6"}`,
@ -205,12 +205,20 @@ func TestJoinTagFilterss(t *testing.T) {
})
}
func mustParseMetricSelector(s string) []storage.TagFilter {
tf, err := ParseMetricSelector(s)
func joinTagFilters(args ...[][]storage.TagFilter) [][]storage.TagFilter {
result := append([][]storage.TagFilter{}, args[0]...)
for _, tfss := range args[1:] {
result = append(result, tfss...)
}
return result
}
func mustParseMetricSelector(s string) [][]storage.TagFilter {
tfss, err := ParseMetricSelector(s)
if err != nil {
panic(fmt.Errorf("cannot parse %q: %w", s, err))
}
return tf
return tfss
}
func tagFilterssToStrings(tfss [][]storage.TagFilter) []string {

View file

@ -1,14 +1,14 @@
{
"files": {
"main.css": "./static/css/main.f31d05d1.css",
"main.js": "./static/js/main.ff1e4560.js",
"static/js/27.c1ccfd29.chunk.js": "./static/js/27.c1ccfd29.chunk.js",
"main.css": "./static/css/main.c93e8dee.css",
"main.js": "./static/js/main.05892dd1.js",
"static/js/522.b5ae4365.chunk.js": "./static/js/522.b5ae4365.chunk.js",
"static/media/Lato-Regular.ttf": "./static/media/Lato-Regular.d714fec1633b69a9c2e9.ttf",
"static/media/Lato-Bold.ttf": "./static/media/Lato-Bold.32360ba4b57802daa4d6.ttf",
"index.html": "./index.html"
},
"entrypoints": [
"static/css/main.f31d05d1.css",
"static/js/main.ff1e4560.js"
"static/css/main.c93e8dee.css",
"static/js/main.05892dd1.js"
]
}

View file

@ -1 +1 @@
<!doctype html><html lang="en"><head><meta charset="utf-8"/><link rel="icon" href="./favicon.ico"/><meta name="viewport" content="width=device-width,initial-scale=1,maximum-scale=1,user-scalable=no"/><meta name="theme-color" content="#000000"/><meta name="description" content="UI for VictoriaMetrics"/><link rel="apple-touch-icon" href="./apple-touch-icon.png"/><link rel="icon" type="image/png" sizes="32x32" href="./favicon-32x32.png"><link rel="manifest" href="./manifest.json"/><title>VM UI</title><script src="./dashboards/index.js" type="module"></script><meta name="twitter:card" content="summary_large_image"><meta name="twitter:image" content="./preview.jpg"><meta name="twitter:title" content="UI for VictoriaMetrics"><meta name="twitter:description" content="Explore and troubleshoot your VictoriaMetrics data"><meta name="twitter:site" content="@VictoriaMetrics"><meta property="og:title" content="Metric explorer for VictoriaMetrics"><meta property="og:description" content="Explore and troubleshoot your VictoriaMetrics data"><meta property="og:image" content="./preview.jpg"><meta property="og:type" content="website"><script defer="defer" src="./static/js/main.ff1e4560.js"></script><link href="./static/css/main.f31d05d1.css" rel="stylesheet"></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="root"></div></body></html>
<!doctype html><html lang="en"><head><meta charset="utf-8"/><link rel="icon" href="./favicon.ico"/><meta name="viewport" content="width=device-width,initial-scale=1,maximum-scale=1,user-scalable=no"/><meta name="theme-color" content="#000000"/><meta name="description" content="UI for VictoriaMetrics"/><link rel="apple-touch-icon" href="./apple-touch-icon.png"/><link rel="icon" type="image/png" sizes="32x32" href="./favicon-32x32.png"><link rel="manifest" href="./manifest.json"/><title>VM UI</title><script src="./dashboards/index.js" type="module"></script><meta name="twitter:card" content="summary_large_image"><meta name="twitter:image" content="./preview.jpg"><meta name="twitter:title" content="UI for VictoriaMetrics"><meta name="twitter:description" content="Explore and troubleshoot your VictoriaMetrics data"><meta name="twitter:site" content="@VictoriaMetrics"><meta property="og:title" content="Metric explorer for VictoriaMetrics"><meta property="og:description" content="Explore and troubleshoot your VictoriaMetrics data"><meta property="og:image" content="./preview.jpg"><meta property="og:type" content="website"><script defer="defer" src="./static/js/main.05892dd1.js"></script><link href="./static/css/main.c93e8dee.css" rel="stylesheet"></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="root"></div></body></html>

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

Some files were not shown because too many files have changed in this diff