Merge branch 'public-single-node' into pmm-6401-read-prometheus-data-files

This commit is contained in:
Aliaksandr Valialkin 2021-06-18 10:55:54 +03:00
commit bbca1740c1
45 changed files with 3073 additions and 1664 deletions


@ -94,7 +94,7 @@ Alphabetically sorted links to case studies:
* [Arbitrary CSV data](#how-to-import-csv-data).
* Supports metrics' relabeling. See [these docs](#relabeling) for details.
* Can deal with high cardinality and high churn rate issues using [series limiter](#cardinality-limiter).
* Ideally works with big amounts of time series data from APM, Kubernetes, IoT sensors, connected cars, industrial telemetry, financial data and various Enterprise workloads.
* Has open source [cluster version](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster).
* See also technical [Articles about VictoriaMetrics](https://docs.victoriametrics.com/Articles.html).
@ -343,6 +343,7 @@ Currently the following [scrape_config](https://prometheus.io/docs/prometheus/la
* [openstack_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#openstack_sd_config)
* [dockerswarm_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dockerswarm_sd_config)
* [eureka_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#eureka_sd_config)
* [digitalocean_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#digitalocean_sd_config)

Other `*_sd_config` types will be supported in the future.
@ -1721,6 +1722,8 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
Wait time used by Consul service discovery. Default value is used if not set
-promscrape.consulSDCheckInterval duration
Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config for details (default 30s)
-promscrape.digitaloceanSDCheckInterval duration
Interval for checking for changes in DigitalOcean. This works only if digitalocean_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#digitalocean_sd_config for details (default 1m0s)
-promscrape.disableCompression
Whether to disable sending 'Accept-Encoding: gzip' request headers to all the scrape targets. This may reduce CPU usage on scrape targets at the cost of higher network bandwidth utilization. It is possible to set 'disable_compression: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control
-promscrape.disableKeepAlive
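As the `-promscrape.disableCompression` description notes, compression can also be disabled per job instead of globally. A minimal sketch (the job name and target are hypothetical):

```yaml
scrape_configs:
  - job_name: node                 # hypothetical job name
    # Per-job override: skip 'Accept-Encoding: gzip' for these targets only,
    # instead of setting the global -promscrape.disableCompression flag.
    disable_compression: true
    static_configs:
      - targets: ["host1:9100"]
```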


@ -177,6 +177,8 @@ The following scrape types in [scrape_config](https://prometheus.io/docs/prometh
  See [dockerswarm_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dockerswarm_sd_config) for details.
* `eureka_sd_configs` - is for scraping targets registered in [Netflix Eureka](https://github.com/Netflix/eureka).
  See [eureka_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#eureka_sd_config) for details.
* `digitalocean_sd_configs` - is for scraping targets registered in [DigitalOcean](https://www.digitalocean.com/).
  See [digitalocean_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#digitalocean_sd_config) for details.

Please file feature requests to [our issue tracker](https://github.com/VictoriaMetrics/VictoriaMetrics/issues) if you need other service discovery mechanisms to be supported by `vmagent`.
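For illustration, a minimal `digitalocean_sd_configs` scrape job might look as follows (the token placeholder and the relabeling rule are hypothetical; see the linked Prometheus docs for the full option list and meta labels):

```yaml
scrape_configs:
  - job_name: digitalocean
    digitalocean_sd_configs:
      - bearer_token: "<DO_API_TOKEN>"   # DigitalOcean API token placeholder
    relabel_configs:
      # Keep only droplets whose tag list contains "monitoring"
      - source_labels: [__meta_digitalocean_tags]
        regex: ".*monitoring.*"
        action: keep
```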
@ -627,6 +629,8 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
Wait time used by Consul service discovery. Default value is used if not set
-promscrape.consulSDCheckInterval duration
Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config for details (default 30s)
-promscrape.digitaloceanSDCheckInterval duration
Interval for checking for changes in DigitalOcean. This works only if digitalocean_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#digitalocean_sd_config for details (default 1m0s)
-promscrape.disableCompression
Whether to disable sending 'Accept-Encoding: gzip' request headers to all the scrape targets. This may reduce CPU usage on scrape targets at the cost of higher network bandwidth utilization. It is possible to set 'disable_compression: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control
-promscrape.disableKeepAlive
@ -715,7 +719,7 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
Optional proxy URL for writing data to -remoteWrite.url. Supported proxies: http, https, socks5. Example: -remoteWrite.proxyURL=socks5://proxy:1234
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.queues int
The number of concurrent queues to each -remoteWrite.url. Set more queues if default number of queues isn't enough for sending high volume of collected data to remote storage (default 2 * numberOfAvailableCPUs)
-remoteWrite.rateLimit array
Optional rate limit in bytes per second for data sent to -remoteWrite.url. By default the rate limit is disabled. It can be useful for limiting load on remote storage when big amounts of buffered data is sent after temporary unavailability of the remote storage
Supports array of values separated by comma or specified via multiple flags.
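To tie the flags above together, a hedged example invocation (paths, URL and the chosen values are placeholders, not recommendations):

```
/path/to/vmagent -promscrape.config=prometheus.yml \
    -remoteWrite.url=http://victoria-metrics:8428/api/v1/write \
    -remoteWrite.queues=16 \       # override the 2 * numberOfAvailableCPUs default
    -remoteWrite.rateLimit=1048576 # limit sending to ~1MB/s per -remoteWrite.url
```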


@ -28,8 +28,8 @@ var (
	"Pass multiple -remoteWrite.url flags in order to write data concurrently to multiple remote storage systems")
tmpDataPath = flag.String("remoteWrite.tmpDataPath", "vmagent-remotewrite-data", "Path to directory where temporary data for remote write component is stored. "+
	"See also -remoteWrite.maxDiskUsagePerURL")
queues = flag.Int("remoteWrite.queues", cgroup.AvailableCPUs()*2, "The number of concurrent queues to each -remoteWrite.url. Set more queues if default number of queues "+
	"isn't enough for sending high volume of collected data to remote storage. Default value is 2 * numberOfAvailableCPUs")
showRemoteWriteURL = flag.Bool("remoteWrite.showURL", false, "Whether to show -remoteWrite.url in the exported metrics. "+
	"It is hidden by default, since it can contain sensitive info such as auth key")
maxPendingBytesPerURL = flagutil.NewBytes("remoteWrite.maxDiskUsagePerURL", 0, "The maximum file-based buffer size in bytes at -remoteWrite.tmpDataPath "+


@ -1,8 +1,9 @@
# vmalert

`vmalert` executes a list of the given [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/)
or [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/)
rules against the configured address. It is heavily inspired by the [Prometheus](https://prometheus.io/docs/alerting/latest/overview/)
implementation and aims to be compatible with its syntax.
## Features

* Integration with [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics) TSDB;
@ -40,21 +41,23 @@ To start using `vmalert` you will need the following things:
* datasource address - reachable VictoriaMetrics instance for rules execution;
* notifier address - reachable [Alert Manager](https://github.com/prometheus/alertmanager) instance for processing,
  aggregating alerts and sending notifications.
* remote write address [optional] - [remote write](https://prometheus.io/docs/prometheus/latest/storage/#remote-storage-integrations)
  compatible storage address for storing recording rules results and alerts state in the form of time series.

Then configure `vmalert` accordingly:
```
./bin/vmalert -rule=alert.rules \            # Path to the file with rules configuration. Supports wildcard
    -datasource.url=http://localhost:8428 \  # PromQL compatible datasource
    -notifier.url=http://localhost:9093 \    # AlertManager URL
    -notifier.url=http://127.0.0.1:9093 \    # AlertManager replica URL
    -remoteWrite.url=http://localhost:8428 \ # Remote write compatible storage to persist rules
    -remoteRead.url=http://localhost:8428 \  # MetricsQL compatible datasource to restore alerts state from
    -external.label=cluster=east-1 \         # External label to be applied for each rule
    -external.label=replica=a                # Multiple external labels may be set
```
See the full list of configuration flags in the [configuration](#configuration) section.
If you run multiple `vmalert` services for the same datastore or AlertManager, do not forget
to specify different `external.label` flags in order to define which `vmalert` instance generated which rules or alerts.
@ -62,7 +65,7 @@ Configuration for [recording](https://prometheus.io/docs/prometheus/latest/confi
and [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) rules is very
similar to Prometheus rules and configured using YAML. Configuration examples may be found
in the [testdata](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/config/testdata) folder.
Every `rule` belongs to a `group` and every configuration file may contain an arbitrary number of groups:
```yaml
groups:
  [ - <rule_group> ]
@ -70,15 +73,15 @@ groups:
### Groups

Each group has the following attributes:
```yaml
# The name of the group. Must be unique within a file.
name: <string>

# How often rules in the group are evaluated.
[ interval: <duration> | default = -evaluationInterval flag ]

# How many rules execute at once within a group. Increasing concurrency may speed
# up round execution speed.
[ concurrency: <integer> | default = 1 ]
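A concrete group using these attributes might look like the following sketch (the group name and the rule itself are hypothetical):

```yaml
groups:
  - name: ExampleGroup      # must be unique within the file
    interval: 30s           # overrides the -evaluationInterval default for this group
    concurrency: 2          # evaluate up to two rules of this group at once
    rules:
      - alert: ServiceDown
        expr: up == 0
        for: 5m
```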
@ -98,20 +101,25 @@ rules:
### Rules

Every rule contains an `expr` field with a [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/)
or [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) expression. `vmalert` will execute the configured
expression and then act according to the Rule type.

There are two types of Rules:
* [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) -
  Alerting rules allow to define alert conditions via the `expr` field and to send notifications to
  [Alertmanager](https://github.com/prometheus/alertmanager) if the execution result is not empty.
* [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/) -
  Recording rules allow to define an `expr` whose result will then be backfilled to the configured
  `-remoteWrite.url`. Recording rules are used to precompute frequently needed or computationally
  expensive expressions and save their result as a new set of time series.

`vmalert` forbids defining duplicates - rules with the same combination of name, expression and labels
within one group.

#### Alerting rules

The syntax for an alerting rule is the following:
```yaml
# The name of the alert. Must be a valid metric name.
alert: <string>
@ -121,12 +129,14 @@ alert: <string>
[ type: <string> ]

# The expression to evaluate. The expression language depends on the type value.
# By default a PromQL/MetricsQL expression is used. If type="graphite", then the expression
# must contain a valid Graphite expression.
expr: <string>

# Alerts are considered firing once they have been returned for this long.
# Alerts which have not yet fired for long enough are considered pending.
# If the param is omitted or set to 0 then alerts will be immediately considered
# as firing once they return.
[ for: <duration> | default = 0s ]

# Labels to add or overwrite for each alert.
@ -164,12 +174,12 @@ labels:
  [ <labelname>: <labelvalue> ]
```
For recording rules to work `-remoteWrite.url` must be specified.
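For example, a minimal recording rule whose results are backfilled to `-remoteWrite.url` could be sketched as follows (the metric and group names are hypothetical):

```yaml
groups:
  - name: precompute
    rules:
        # Persist the precomputed per-job error rate as a new time series
      - record: job:request_errors:rate5m
        expr: sum(rate(request_errors_total[5m])) by (job)
```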
### Alerts state on restarts
`vmalert` has no local storage, so alerts state is stored in the process memory. Hence, after a restart of
the `vmalert` process the alerts state will be lost. To avoid this situation, `vmalert` should be configured via the following flags:
* `-remoteWrite.url` - URL to VictoriaMetrics (Single) or vminsert (Cluster). `vmalert` will persist alerts state
  into the configured address in the form of time series named `ALERTS` and `ALERTS_FOR_STATE` via remote-write protocol.
@ -179,17 +189,27 @@ The state stored to the configured address on every rule evaluation.
from the configured address by querying time series with the name `ALERTS_FOR_STATE`.

Both flags are required for proper state restoring. Restore process may fail if time series are missing
in the configured `-remoteRead.url`, weren't updated in the last `1h` (controlled by `-remoteRead.lookback`)
or the received state doesn't match the current `vmalert` rules configuration.
### Multitenancy

There are the following approaches for alerting and recording rules across
[multiple tenants](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy):
* To run a separate `vmalert` instance per each tenant.
  The corresponding tenant must be specified in `-datasource.url` command-line flag
  according to [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format).
  For example, `/path/to/vmalert -datasource.url=http://vmselect:8481/select/123/prometheus`
  would run alerts against `AccountID=123`. For recording rules the `-remoteWrite.url` command-line
  flag must contain the url for the specific tenant as well.
  For example, `-remoteWrite.url=http://vminsert:8480/insert/123/prometheus` would write recording
  rules to `AccountID=123`.
* To specify `tenant` parameter per each alerting and recording group if the
  [enterprise version of vmalert](https://victoriametrics.com/enterprise.html) is used
  with `-clusterMode` command-line flag. For example:
```yaml
groups:
@ -204,9 +224,13 @@ groups:
# Rules for accountID=456, projectID=789
```
If `-clusterMode` is enabled, then `-datasource.url`, `-remoteRead.url` and `-remoteWrite.url` must
contain only the hostname without tenant id. For example: `-datasource.url=http://vmselect:8481`.
`vmalert` automatically adds the specified tenant to urls per each recording rule in this case.
The enterprise version of vmalert is available in `vmutils-*-enterprise.tar.gz` files
at the [release page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) and in `*-enterprise`
tags at [Docker Hub](https://hub.docker.com/r/victoriametrics/vmalert/tags).
### WEB
@ -318,6 +342,9 @@ See full description for these flags in `./vmalert --help`.
## Configuration

Pass `-help` to `vmalert` in order to see the full list of supported
command-line flags with their descriptions.

The shortlist of configuration flags is the following:
```
-datasource.appendTypePrefix
@ -510,9 +537,6 @@ The shortlist of configuration flags is the following:
Show VictoriaMetrics version
```

`vmalert` supports "hot" config reload via the following methods:
* send SIGHUP signal to `vmalert` process;
* send GET request to `/-/reload` endpoint;
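The reload methods above can be triggered from a shell, assuming `vmalert` runs locally on its default `8880` port (the port and the `pgrep` pattern are assumptions about your setup):

```
# Method 1: send SIGHUP to the vmalert process
kill -HUP "$(pgrep -f vmalert)"

# Method 2: hit the reload endpoint over HTTP
curl http://localhost:8880/-/reload
```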


@ -1,8 +1,8 @@
## vmbackupmanager

***vmbackupmanager is a part of [enterprise package](https://victoriametrics.com/enterprise.html)***

The VictoriaMetrics backup manager automates regular backup procedures. It supports the following backup intervals: **hourly**, **daily**, **weekly** and **monthly**. Multiple backup intervals may be configured simultaneously. I.e. the backup manager creates hourly backups every hour, while it creates daily backups every day, etc. Backup manager must have read access to the storage data, so the best practice is to install it on the same machine (or as a sidecar) where the storage node is installed.

The backup service makes a backup every hour and puts it to the latest folder and then copies data to the folders which represent the backup intervals (hourly, daily, weekly and monthly).

The required flags for running the service are as follows:
@ -49,7 +49,7 @@ There are two flags which could help with performance tuning:
* -concurrency - The number of concurrent workers. Higher concurrency may improve upload speed (default 10)

## Example of Usage

GCS and cluster version. You need to have a credentials file in json format with the following structure:


@ -1,5 +1,7 @@
# vmgateway

***vmgateway is a part of [enterprise package](https://victoriametrics.com/enterprise.html)***

<img alt="vmgateway" src="vmgateway-overview.jpeg">


@ -1332,7 +1332,8 @@ func rollupIncreasePure(rfa *rollupFuncArg) float64 {
// There is no need in handling NaNs here, since they must be cleaned up
// before calling rollup funcs.
values := rfa.values
// restore to the real value because of potential staleness reset
prevValue := rfa.realPrevValue
if math.IsNaN(prevValue) {
	if len(values) == 0 {
		return nan

File diff suppressed because it is too large

File diff suppressed because it is too large


@ -6,6 +6,13 @@ sort: 15
## tip
* FEATURE: vmagent: add service discovery for DigitalOcean (aka [digitalocean_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#digitalocean_sd_config)). See [this feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1367).
* FEATURE: vmagent: show the number of samples the target returned during the last scrape on `/targets` and `/api/v1/targets` pages. This should simplify debugging targets, which may return too big or too small a number of samples. See [this feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1377).
* FEATURE: vmagent: change the default value for `-remoteWrite.queues` from 4 to `2 * numCPUs`. This should reduce scrape duration for highly loaded vmagent, which scrapes tens of thousands of targets. See [this pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1385).
* FEATURE: vmagent: show jobs with zero discovered targets on `/targets` page. This should help debugging improperly configured scrape configs.
* BUGFIX: prevent from adding new samples to deleted time series after the rotation of the inverted index (the rotation is performed once per `-retentionPeriod`). See [this comment](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1347#issuecomment-861232136) for details.
## [v1.61.1](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.61.1)


@ -452,6 +452,342 @@ Due to `KISS`, cluster version of VictoriaMetrics has no the following "features
Report bugs and propose new features [here](https://github.com/VictoriaMetrics/VictoriaMetrics/issues).
## List of command-line flags
* [List of command-line flags for vminsert](#list-of-command-line-flags-for-vminsert)
* [List of command-line flags for vmselect](#list-of-command-line-flags-for-vmselect)
* [List of command-line flags for vmstorage](#list-of-command-line-flags-for-vmstorage)
### List of command-line flags for vminsert
Below is the output for `/path/to/vminsert -help`:
```
-clusternativeListenAddr string
TCP address to listen for data from other vminsert nodes in multi-level cluster setup. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multi-level-cluster-setup . Usually :8400 must be set. Doesn't work if empty
-csvTrimTimestamp duration
Trim timestamps when importing csv data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms)
-disableRerouting
Whether to disable re-routing when some of vmstorage nodes accept incoming data at slower speed compared to other storage nodes. By default the re-routing is enabled. Disabled re-routing limits the ingestion rate by the slowest vmstorage node. On the other side, disabled re-routing minimizes the number of active time series in the cluster
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used
-envflag.enable
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-fs.disableMmap
Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
-graphiteListenAddr string
TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty
-graphiteTrimTimestamp duration
Trim timestamps for Graphite data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s)
-http.connTimeout duration
Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s)
-http.disableResponseCompression
Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth
-http.idleConnTimeout duration
Timeout for incoming idle http connections (default 1m0s)
-http.maxGracefulShutdownDuration duration
The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s)
-http.pathPrefix string
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpListenAddr string
Address to listen for http connections (default ":8480")
-import.maxLineLen size
The maximum length in bytes of a single line accepted by /api/v1/import; the line length can be limited with 'max_rows_per_line' query arg passed to /api/v1/export
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 104857600)
-influx.databaseNames array
Comma-separated list of database names to return from /query and /influx/query API. This can be needed for accepting data from Telegraf plugins such as https://github.com/fangli/fluent-plugin-influxdb
Supports an array of values separated by comma or specified via multiple flags.
-influx.maxLineSize size
The maximum size in bytes for a single Influx line during parsing
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 262144)
-influxListenAddr string
TCP and UDP address to listen for Influx line protocol data. Usually :8189 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://<vminsert>:8480/insert/<accountID>/influx/write
-influxMeasurementFieldSeparator string
Separator for '{measurement}{separator}{field_name}' metric name when inserted via Influx line protocol (default "_")
-influxSkipMeasurement
Uses '{field_name}' as a metric name while ignoring '{measurement}' and '-influxMeasurementFieldSeparator'
-influxSkipSingleField
Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metric name if Influx line contains only a single field
-influxTrimTimestamp duration
Trim timestamps for Influx line protocol data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms)
-insert.maxQueueDuration duration
The maximum duration for waiting in the queue for insert requests due to -maxConcurrentInserts (default 1m0s)
-loggerDisableTimestamps
Whether to disable writing timestamps in logs
-loggerErrorsPerSecondLimit int
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit
-loggerFormat string
Format for logs. Possible values: default, json (default "default")
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
-loggerOutput string
Output for the logs. Supported values: stderr, stdout (default "stderr")
-loggerTimezone string
Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC")
-loggerWarnsPerSecondLimit int
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
-maxConcurrentInserts int
The maximum number of concurrent inserts. Default value should work for most cases, since it minimizes the overhead for concurrent inserts. This option is tightly coupled with -insert.maxQueueDuration (default 16)
-maxInsertRequestSize size
The maximum size in bytes of a single Prometheus remote_write API request
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432)
-maxLabelsPerTimeseries int
The maximum number of labels accepted per time series. Superfluous labels are dropped (default 30)
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
-opentsdbHTTPListenAddr string
TCP address to listen for OpenTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty
-opentsdbListenAddr string
TCP and UDP address to listen for OpenTSDB metrics. Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. Usually :4242 must be set. Doesn't work if empty
-opentsdbTrimTimestamp duration
Trim timestamps for OpenTSDB 'telnet put' data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s)
-opentsdbhttp.maxInsertRequestSize size
The maximum size of OpenTSDB HTTP put request
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432)
-opentsdbhttpTrimTimestamp duration
Trim timestamps for OpenTSDB HTTP data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms)
-relabelConfig string
Optional path to a file with relabeling rules, which are applied to all the ingested metrics. See https://docs.victoriametrics.com/#relabeling for details
-relabelDebug
Whether to log metrics before and after relabeling with -relabelConfig. If the -relabelDebug is enabled, then the metrics aren't sent to storage. This is useful for debugging the relabeling configs
-replicationFactor int
Replication factor for the ingested data, i.e. how many copies to make among distinct -storageNode instances. Note that vmselect must run with -dedup.minScrapeInterval=1ms for data de-duplication when replicationFactor is greater than 1. Higher values for -dedup.minScrapeInterval at vmselect are OK (default 1)
-rpc.disableCompression
Whether to disable compression of RPC traffic. This reduces CPU usage at the cost of higher network bandwidth usage
-sortLabels
Whether to sort labels for incoming samples before writing them to storage. This may be needed for reducing memory usage at storage when the order of labels in incoming samples is random. For example, m{k1="v1",k2="v2"} may be sent as m{k2="v2",k1="v1"}. Enabled sorting for labels can slow down ingestion performance a bit
-storageNode array
Address of vmstorage nodes; usage: -storageNode=vmstorage-host1:8400 -storageNode=vmstorage-host2:8400
Supports an array of values separated by comma or specified via multiple flags.
-tls
Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set
-tlsCertFile string
Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower
-tlsKeyFile string
Path to file with TLS key. Used only if -tls is set
-version
Show VictoriaMetrics version
```
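Size flags such as `-maxInsertRequestSize` and `-import.maxLineLen` accept `KB`/`MB`/`GB` (decimal) and `KiB`/`MiB`/`GiB` (binary) suffixes. The sketch below shows one way such suffixes can be decoded into bytes; it is illustrative only and is not the parser VictoriaMetrics actually uses:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSize converts values like "32MiB" or "64KB" into bytes.
// Decimal suffixes (KB, MB, GB) use powers of 1000; binary
// suffixes (KiB, MiB, GiB) use powers of 1024.
func parseSize(s string) (int64, error) {
	multipliers := []struct {
		suffix string
		factor int64
	}{
		{"KiB", 1 << 10}, {"MiB", 1 << 20}, {"GiB", 1 << 30},
		{"KB", 1e3}, {"MB", 1e6}, {"GB", 1e9},
	}
	for _, m := range multipliers {
		if strings.HasSuffix(s, m.suffix) {
			n, err := strconv.ParseInt(strings.TrimSuffix(s, m.suffix), 10, 64)
			if err != nil {
				return 0, err
			}
			return n * m.factor, nil
		}
	}
	// No suffix: the value is already in bytes.
	return strconv.ParseInt(s, 10, 64)
}

func main() {
	for _, s := range []string{"32MiB", "64KB", "104857600"} {
		n, err := parseSize(s)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s = %d bytes\n", s, n)
	}
}
```

This explains why `-maxInsertRequestSize` shows `(default 33554432)`: that is 32MiB expressed in plain bytes.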
### List of command-line flags for vmselect
Below is the output for `/path/to/vmselect -help`:
```
-cacheDataPath string
Path to directory for cache files. Cache isn't saved if empty
-dedup.minScrapeInterval duration
Remove superfluous samples from time series if they are located closer to each other than this duration. This may be useful for reducing overhead when multiple identically configured Prometheus instances write data to the same VictoriaMetrics. Deduplication is disabled if the -dedup.minScrapeInterval is 0
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used
-envflag.enable
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-fs.disableMmap
Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
-graphiteTrimTimestamp duration
Trim timestamps for Graphite data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s)
-http.connTimeout duration
Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s)
-http.disableResponseCompression
Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth
-http.idleConnTimeout duration
Timeout for incoming idle http connections (default 1m0s)
-http.maxGracefulShutdownDuration duration
The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s)
-http.pathPrefix string
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpListenAddr string
Address to listen for http connections (default ":8481")
-loggerDisableTimestamps
Whether to disable writing timestamps in logs
-loggerErrorsPerSecondLimit int
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit
-loggerFormat string
Format for logs. Possible values: default, json (default "default")
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
-loggerOutput string
Output for the logs. Supported values: stderr, stdout (default "stderr")
-loggerTimezone string
Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC")
-loggerWarnsPerSecondLimit int
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
-replicationFactor int
How many copies of every time series are available on vmstorage nodes. See -replicationFactor command-line flag for vminsert nodes (default 1)
-search.cacheTimestampOffset duration
The maximum duration since the current time for response data, which is always queried from the original raw data, without using the response cache. Increase this value if you see gaps in responses due to time synchronization issues between VictoriaMetrics and data sources (default 5m0s)
-search.denyPartialResponse
Whether to deny partial responses if a part of -storageNode instances fail to perform queries; this trades availability over consistency; see also -search.maxQueryDuration
-search.disableCache
Whether to disable response caching. This may be useful during data backfilling
-search.latencyOffset duration
The time when data points become visible in query results after the collection. Too small value can result in incomplete last points for query results (default 30s)
-search.logSlowQueryDuration duration
Log queries with execution time exceeding this value. Zero disables slow query logging (default 5s)
-search.maxConcurrentRequests int
The maximum number of concurrent search requests. It shouldn't be high, since a single request can saturate all the CPU cores. See also -search.maxQueueDuration (default 8)
-search.maxExportDuration duration
The maximum duration for /api/v1/export call (default 720h0m0s)
-search.maxLookback duration
Synonym to -search.lookback-delta from Prometheus. The value is dynamically detected from interval between time series datapoints if not set. It can be overridden on per-query basis via max_lookback arg. See also '-search.maxStalenessInterval' flag, which has the same meaning due to historical reasons
-search.maxPointsPerTimeseries int
The maximum points per a single timeseries returned from /api/v1/query_range. This option doesn't limit the number of scanned raw samples in the database. The main purpose of this option is to limit the number of per-series points returned to graphing UI such as Grafana. There is no sense in setting this limit to values bigger than the horizontal resolution of the graph (default 30000)
-search.maxQueryDuration duration
The maximum duration for query execution (default 30s)
-search.maxQueryLen size
The maximum search query length in bytes
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 16384)
-search.maxQueueDuration duration
The maximum time the request waits for execution when -search.maxConcurrentRequests limit is reached; see also -search.maxQueryDuration (default 10s)
-search.maxStalenessInterval duration
The maximum interval for staleness calculations. By default it is automatically calculated from the median interval between samples. This flag could be useful for tuning Prometheus data model closer to Influx-style data model. See https://prometheus.io/docs/prometheus/latest/querying/basics/#staleness for details. See also '-search.maxLookback' flag, which has the same meaning due to historical reasons
-search.maxStatusRequestDuration duration
The maximum duration for /api/v1/status/* requests (default 5m0s)
-search.maxStepForPointsAdjustment duration
The maximum step when /api/v1/query_range handler adjusts points with timestamps closer than -search.latencyOffset to the current time. The adjustment is needed because such points may contain incomplete data (default 1m0s)
-search.minStalenessInterval duration
The minimum interval for staleness calculations. This flag could be useful for removing gaps on graphs generated from time series with irregular intervals between samples. See also '-search.maxStalenessInterval'
-search.queryStats.lastQueriesCount int
Query stats for /api/v1/status/top_queries is tracked on this number of last queries. Zero value disables query stats tracking (default 20000)
-search.queryStats.minQueryDuration duration
The minimum duration for queries to track in query stats at /api/v1/status/top_queries. Queries with lower duration are ignored in query stats
-search.resetCacheAuthKey string
Optional authKey for resetting rollup cache via /internal/resetRollupResultCache call
-search.treatDotsAsIsInRegexps
Whether to treat dots as is in regexp label filters used in queries. For example, foo{bar=~"a.b.c"} will be automatically converted to foo{bar=~"a\\.b\\.c"}, i.e. all the dots in regexp filters will be automatically escaped in order to match only dot char instead of matching any char. Dots in ".+", ".*" and ".{n}" regexps aren't escaped. This option is DEPRECATED in favor of {__graphite__="a.*.c"} syntax for selecting metrics matching the given Graphite metrics filter
-selectNode array
Addresses of vmselect nodes; usage: -selectNode=vmselect-host1:8481 -selectNode=vmselect-host2:8481
Supports an array of values separated by comma or specified via multiple flags.
-storageNode array
Addresses of vmstorage nodes; usage: -storageNode=vmstorage-host1:8401 -storageNode=vmstorage-host2:8401
Supports an array of values separated by comma or specified via multiple flags.
-tls
Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set
-tlsCertFile string
Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower
-tlsKeyFile string
Path to file with TLS key. Used only if -tls is set
-version
Show VictoriaMetrics version
```
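The `-dedup.minScrapeInterval` flag drops samples that are closer together than the given interval, e.g. near-duplicate samples written by identically configured Prometheus replicas. A minimal sketch of the idea, operating on sorted timestamps only; the actual VictoriaMetrics deduplication logic works on (timestamp, value) pairs and differs in detail:

```go
package main

import "fmt"

// deduplicate returns timestamps (in milliseconds, assumed sorted)
// spaced at least minIntervalMs apart. Illustrative sketch only.
func deduplicate(tss []int64, minIntervalMs int64) []int64 {
	if minIntervalMs <= 0 || len(tss) == 0 {
		return tss
	}
	dst := []int64{tss[0]}
	for _, ts := range tss[1:] {
		if ts-dst[len(dst)-1] >= minIntervalMs {
			dst = append(dst, ts)
		}
	}
	return dst
}

func main() {
	// Two replicas scraping every 10s produce near-duplicate samples.
	tss := []int64{0, 5, 10000, 10003, 20000, 20007}
	fmt.Println(deduplicate(tss, 1000)) // prints "[0 10000 20000]"
}
```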
### List of command-line flags for vmstorage
Below is the output for `/path/to/vmstorage -help`:
```
-bigMergeConcurrency int
The maximum number of CPU cores to use for big merges. Default value is used if set to 0
-dedup.minScrapeInterval duration
Remove superfluous samples from time series if they are located closer to each other than this duration. This may be useful for reducing overhead when multiple identically configured Prometheus instances write data to the same VictoriaMetrics. Deduplication is disabled if the -dedup.minScrapeInterval is 0
-denyQueriesOutsideRetention
Whether to deny queries outside of the configured -retentionPeriod. When set, then /api/v1/query_range would return '503 Service Unavailable' error for queries with 'from' value outside -retentionPeriod. This may be useful when multiple data sources with distinct retentions are hidden behind query-tee
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used
-envflag.enable
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-finalMergeDelay duration
The delay before starting final merge for per-month partition after no new data is ingested into it. Final merge may require additional disk IO and CPU resources. Final merge may increase query speed and reduce disk space usage in some cases. Zero value disables final merge
-forceFlushAuthKey string
authKey, which must be passed in query string to /internal/force_flush pages
-forceMergeAuthKey string
authKey, which must be passed in query string to /internal/force_merge pages
-fs.disableMmap
Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
-http.connTimeout duration
Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s)
-http.disableResponseCompression
Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth
-http.idleConnTimeout duration
Timeout for incoming idle http connections (default 1m0s)
-http.maxGracefulShutdownDuration duration
The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s)
-http.pathPrefix string
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpListenAddr string
Address to listen for http connections (default ":8482")
-logNewSeries
Whether to log new series. This option is for debug purposes only. It can lead to performance issues when big number of new series are ingested into VictoriaMetrics
-loggerDisableTimestamps
Whether to disable writing timestamps in logs
-loggerErrorsPerSecondLimit int
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit
-loggerFormat string
Format for logs. Possible values: default, json (default "default")
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
-loggerOutput string
Output for the logs. Supported values: stderr, stdout (default "stderr")
-loggerTimezone string
Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC")
-loggerWarnsPerSecondLimit int
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
-precisionBits int
The number of precision bits to store per each value. Lower precision bits improves data compression at the cost of precision loss (default 64)
-retentionPeriod value
Data with timestamps outside the retentionPeriod is automatically deleted
The following optional suffixes are supported: h (hour), d (day), w (week), y (year). If suffix isn't set, then the duration is counted in months (default 1)
-rpc.disableCompression
Disable compression of RPC traffic. This reduces CPU usage at the cost of higher network bandwidth usage
-search.maxTagKeys int
The maximum number of tag keys returned per search (default 100000)
-search.maxTagValueSuffixesPerSearch int
The maximum number of tag value suffixes returned from /metrics/find (default 100000)
-search.maxTagValues int
The maximum number of tag values returned per search (default 100000)
-search.maxUniqueTimeseries int
The maximum number of unique time series each search can scan (default 300000)
-smallMergeConcurrency int
The maximum number of CPU cores to use for small merges. Default value is used if set to 0
-snapshotAuthKey string
authKey, which must be passed in query string to /snapshot* pages
-storage.maxDailySeries int
The maximum number of unique series that can be added to the storage during the last 24 hours. Excess series are logged and dropped. This can be useful for limiting series churn rate. See also -storage.maxHourlySeries
-storage.maxHourlySeries int
The maximum number of unique series that can be added to the storage during the last hour. Excess series are logged and dropped. This can be useful for limiting series cardinality. See also -storage.maxDailySeries
-storageDataPath string
Path to storage data (default "vmstorage-data")
-tls
Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set
-tlsCertFile string
Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower
-tlsKeyFile string
Path to file with TLS key. Used only if -tls is set
-version
Show VictoriaMetrics version
-vminsertAddr string
TCP address to accept connections from vminsert services (default ":8400")
-vmselectAddr string
TCP address to accept connections from vmselect services (default ":8401")
```
## VictoriaMetrics Logo
[Zip](VM_logo.zip) contains three folders with different image orientation (main color and inverted version).


@@ -98,7 +98,7 @@ Alphabetically sorted links to case studies:
* [Arbitrary CSV data](#how-to-import-csv-data).
* Supports metrics' relabeling. See [these docs](#relabeling) for details.
* Can deal with high cardinality and high churn rate issues using [series limiter](#cardinality-limiter).
* Ideally works with big amounts of time series data from APM, Kubernetes, IoT sensors, connected cars, industrial telemetry, financial data and various Enterprise workloads.
* Has open source [cluster version](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster).
* See also technical [Articles about VictoriaMetrics](https://docs.victoriametrics.com/Articles.html).
@@ -347,6 +347,7 @@ Currently the following [scrape_config](https://prometheus.io/docs/prometheus/la
* [openstack_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#openstack_sd_config)
* [dockerswarm_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dockerswarm_sd_config)
* [eureka_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#eureka_sd_config)
* [digitalocean_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#digitalocean_sd_config)
Other `*_sd_config` types will be supported in the future.
@@ -1725,6 +1726,8 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
Wait time used by Consul service discovery. Default value is used if not set
-promscrape.consulSDCheckInterval duration
Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config for details (default 30s)
-promscrape.digitaloceanSDCheckInterval duration
Interval for checking for changes in DigitalOcean. This works only if digitalocean_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#digitalocean_sd_config for details (default 1m0s)
-promscrape.disableCompression
Whether to disable sending 'Accept-Encoding: gzip' request headers to all the scrape targets. This may reduce CPU usage on scrape targets at the cost of higher network bandwidth utilization. It is possible to set 'disable_compression: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control
-promscrape.disableKeepAlive

View file

@ -181,6 +181,8 @@ The following scrape types in [scrape_config](https://prometheus.io/docs/prometh
See [dockerswarm_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dockerswarm_sd_config) for details. See [dockerswarm_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dockerswarm_sd_config) for details.
* `eureka_sd_configs` - is for scraping targets registered in [Netflix Eureka](https://github.com/Netflix/eureka). * `eureka_sd_configs` - is for scraping targets registered in [Netflix Eureka](https://github.com/Netflix/eureka).
See [eureka_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#eureka_sd_config) for details. See [eureka_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#eureka_sd_config) for details.
* `digitalocean_sd_configs` - is for scraping targets registered in [DigitalOcean](https://www.digitalocean.com/).
See [digitalocean_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#digitalocean_sd_config) for details.
Please file feature requests to [our issue tracker](https://github.com/VictoriaMetrics/VictoriaMetrics/issues) if you need other service discovery mechanisms to be supported by `vmagent`. Please file feature requests to [our issue tracker](https://github.com/VictoriaMetrics/VictoriaMetrics/issues) if you need other service discovery mechanisms to be supported by `vmagent`.
@ -631,6 +633,8 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
Wait time used by Consul service discovery. Default value is used if not set Wait time used by Consul service discovery. Default value is used if not set
-promscrape.consulSDCheckInterval duration -promscrape.consulSDCheckInterval duration
Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config for details (default 30s) Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config for details (default 30s)
-promscrape.digitaloceanSDCheckInterval duration
Interval for checking for changes in DigitalOcean. This works only if digitalocean_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#digitalocean_sd_config for details (default 1m0s)
-promscrape.disableCompression -promscrape.disableCompression
Whether to disable sending 'Accept-Encoding: gzip' request headers to all the scrape targets. This may reduce CPU usage on scrape targets at the cost of higher network bandwidth utilization. It is possible to set 'disable_compression: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control Whether to disable sending 'Accept-Encoding: gzip' request headers to all the scrape targets. This may reduce CPU usage on scrape targets at the cost of higher network bandwidth utilization. It is possible to set 'disable_compression: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control
-promscrape.disableKeepAlive -promscrape.disableKeepAlive
@ -719,7 +723,7 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
Optional proxy URL for writing data to -remoteWrite.url. Supported proxies: http, https, socks5. Example: -remoteWrite.proxyURL=socks5://proxy:1234 Optional proxy URL for writing data to -remoteWrite.url. Supported proxies: http, https, socks5. Example: -remoteWrite.proxyURL=socks5://proxy:1234
Supports an array of values separated by comma or specified via multiple flags. Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.queues int -remoteWrite.queues int
The number of concurrent queues to each -remoteWrite.url. Set more queues if default number of queues isn't enough for sending high volume of collected data to remote storage (default 4) The number of concurrent queues to each -remoteWrite.url. Set more queues if default number of queues isn't enough for sending high volume of collected data to remote storage (default 2 * numberOfAvailableCPUs)
-remoteWrite.rateLimit array -remoteWrite.rateLimit array
Optional rate limit in bytes per second for data sent to -remoteWrite.url. By default the rate limit is disabled. It can be useful for limiting load on remote storage when big amounts of buffered data is sent after temporary unavailability of the remote storage Optional rate limit in bytes per second for data sent to -remoteWrite.url. By default the rate limit is disabled. It can be useful for limiting load on remote storage when big amounts of buffered data is sent after temporary unavailability of the remote storage
Supports array of values separated by comma or specified via multiple flags. Supports array of values separated by comma or specified via multiple flags.

View file

@ -4,9 +4,10 @@ sort: 4
# vmalert # vmalert
`vmalert` executes a list of given [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) `vmalert` executes a list of the given [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/)
or [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/) or [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/)
rules against configured address. rules against the configured address. It is heavily inspired by the [Prometheus](https://prometheus.io/docs/alerting/latest/overview/)
implementation and aims to be compatible with its syntax.
## Features ## Features
* Integration with [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics) TSDB; * Integration with [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics) TSDB;
@ -44,21 +45,23 @@ To start using `vmalert` you will need the following things:
* datasource address - reachable VictoriaMetrics instance for rules execution; * datasource address - reachable VictoriaMetrics instance for rules execution;
* notifier address - reachable [Alert Manager](https://github.com/prometheus/alertmanager) instance for processing, * notifier address - reachable [Alert Manager](https://github.com/prometheus/alertmanager) instance for processing,
aggregating alerts and sending notifications. aggregating alerts and sending notifications.
* remote write address - [remote write](https://prometheus.io/docs/prometheus/latest/storage/#remote-storage-integrations) * remote write address [optional] - [remote write](https://prometheus.io/docs/prometheus/latest/storage/#remote-storage-integrations)
compatible storage address for storing recording rules results and alerts state in for of timeseries. This is optional. compatible storage address for storing recording rules results and alerts state in the form of time series.
Then configure `vmalert` accordingly: Then configure `vmalert` accordingly:
``` ```
./bin/vmalert -rule=alert.rules \ ./bin/vmalert -rule=alert.rules \ # Path to the file with rules configuration. Supports wildcard
-datasource.url=http://localhost:8428 \ # PromQL compatible datasource -datasource.url=http://localhost:8428 \ # PromQL compatible datasource
-notifier.url=http://localhost:9093 \ # AlertManager URL -notifier.url=http://localhost:9093 \ # AlertManager URL
-notifier.url=http://127.0.0.1:9093 \ # AlertManager replica URL -notifier.url=http://127.0.0.1:9093 \ # AlertManager replica URL
-remoteWrite.url=http://localhost:8428 \ # remote write compatible storage to persist rules -remoteWrite.url=http://localhost:8428 \ # Remote write compatible storage to persist rules
-remoteRead.url=http://localhost:8428 \ # PromQL compatible datasource to restore alerts state from -remoteRead.url=http://localhost:8428 \ # MetricsQL compatible datasource to restore alerts state from
-external.label=cluster=east-1 \ # External label to be applied for each rule -external.label=cluster=east-1 \ # External label to be applied for each rule
-external.label=replica=a # Multiple external labels may be set -external.label=replica=a # Multiple external labels may be set
``` ```
See the full list of configuration flags in the [configuration](#configuration) section.
If you run multiple `vmalert` services for the same datastore or AlertManager - do not forget If you run multiple `vmalert` services for the same datastore or AlertManager - do not forget
to specify different `external.label` flags in order to identify which `vmalert` instance generated given rules or alerts. to specify different `external.label` flags in order to identify which `vmalert` instance generated given rules or alerts.
@ -66,7 +69,7 @@ Configuration for [recording](https://prometheus.io/docs/prometheus/latest/confi
and [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) rules is very and [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) rules is very
similar to Prometheus rules and configured using YAML. Configuration examples may be found similar to Prometheus rules and configured using YAML. Configuration examples may be found
in [testdata](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/config/testdata) folder. in [testdata](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/config/testdata) folder.
Every `rule` belongs to `group` and every configuration file may contain arbitrary number of groups: Every `rule` belongs to a `group` and every configuration file may contain an arbitrary number of groups:
```yaml ```yaml
groups: groups:
[ - <rule_group> ] [ - <rule_group> ]
@ -74,15 +77,15 @@ groups:
### Groups ### Groups
Each group has following attributes: Each group has the following attributes:
```yaml ```yaml
# The name of the group. Must be unique within a file. # The name of the group. Must be unique within a file.
name: <string> name: <string>
# How often rules in the group are evaluated. # How often rules in the group are evaluated.
[ interval: <duration> | default = global.evaluation_interval ] [ interval: <duration> | default = -evaluationInterval flag ]
# How many rules execute at once. Increasing concurrency may speed # How many rules execute at once within a group. Increasing concurrency may speed
# up round execution. # up round execution.
[ concurrency: <integer> | default = 1 ] [ concurrency: <integer> | default = 1 ]
@ -102,20 +105,25 @@ rules:
### Rules ### Rules
Every rule contains an `expr` field with a [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/)
or [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) expression. `vmalert` executes the configured
expression and then acts according to the rule type.
There are two types of Rules: There are two types of Rules:
* [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) - * [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) -
Alerting rules allows to define alert conditions via [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) Alerting rules allow defining alert conditions via the `expr` field and sending notifications
and to send notifications about firing alerts to [Alertmanager](https://github.com/prometheus/alertmanager). to [Alertmanager](https://github.com/prometheus/alertmanager) if the execution result is not empty.
* [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/) - * [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/) -
Recording rules allow you to precompute frequently needed or computationally expensive expressions Recording rules allow defining an `expr` whose result will then be backfilled to the configured
and save their result as a new set of time series. `-remoteWrite.url`. Recording rules are used to precompute frequently needed or computationally
expensive expressions and save their result as a new set of time series.
`vmalert` forbids defining duplicates - rules with the same combination of name, expression and labels `vmalert` forbids defining duplicates - rules with the same combination of name, expression and labels
within one group. within one group.
#### Alerting rules #### Alerting rules
The syntax for alerting rule is following: The syntax for an alerting rule is the following:
```yaml ```yaml
# The name of the alert. Must be a valid metric name. # The name of the alert. Must be a valid metric name.
alert: <string> alert: <string>
@ -125,12 +133,14 @@ alert: <string>
[ type: <string> ] [ type: <string> ]
# The expression to evaluate. The expression language depends on the type value. # The expression to evaluate. The expression language depends on the type value.
# By default MetricsQL expression is used. If type="graphite", then the expression # By default PromQL/MetricsQL expression is used. If type="graphite", then the expression
# must contain valid Graphite expression. # must contain valid Graphite expression.
expr: <string> expr: <string>
# Alerts are considered firing once they have been returned for this long. # Alerts are considered firing once they have been returned for this long.
# Alerts which have not yet fired for long enough are considered pending. # Alerts which have not yet fired for long enough are considered pending.
# If the param is omitted or set to 0, then alerts are immediately considered
# as firing once they return.
[ for: <duration> | default = 0s ] [ for: <duration> | default = 0s ]
# Labels to add or overwrite for each alert. # Labels to add or overwrite for each alert.
@ -168,12 +178,12 @@ labels:
[ <labelname>: <labelvalue> ] [ <labelname>: <labelvalue> ]
``` ```
For recording rules to work `-remoteWrite.url` must specified. For recording rules to work `-remoteWrite.url` must be specified.
### Alerts state on restarts ### Alerts state on restarts
`vmalert` has no local storage, so alerts state is stored in the process memory. Hence, after reloading of `vmalert` `vmalert` has no local storage, so alerts state is stored in the process memory. Hence, after restart of `vmalert`
process, the alerts state will be lost. To avoid this situation, `vmalert` should be configured via the following flags: process, the alerts state will be lost. To avoid this situation, `vmalert` should be configured via the following flags:
* `-remoteWrite.url` - URL to VictoriaMetrics (Single) or vminsert (Cluster). `vmalert` will persist alerts state * `-remoteWrite.url` - URL to VictoriaMetrics (Single) or vminsert (Cluster). `vmalert` will persist alerts state
into the configured address in the form of time series named `ALERTS` and `ALERTS_FOR_STATE` via remote-write protocol. into the configured address in the form of time series named `ALERTS` and `ALERTS_FOR_STATE` via remote-write protocol.
@ -183,17 +193,27 @@ The state stored to the configured address on every rule evaluation.
from configured address by querying time series with name `ALERTS_FOR_STATE`. from configured address by querying time series with name `ALERTS_FOR_STATE`.
Both flags are required for the proper state restoring. Restore process may fail if time series are missing Both flags are required for the proper state restoring. Restore process may fail if time series are missing
in configured `-remoteRead.url`, weren't updated in the last `1h` or received state doesn't match current `vmalert` in configured `-remoteRead.url`, weren't updated in the last `1h` (controlled by `-remoteRead.lookback`)
rules configuration. or received state doesn't match current `vmalert` rules configuration.
### Multitenancy ### Multitenancy
There are the following approaches for alerting and recording rules across [multiple tenants](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy) exist: There are the following approaches for alerting and recording rules across
[multiple tenants](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy):
* To run a separate `vmalert` instance per each tenant. The corresponding tenant must be specified in `-datasource.url` command-line flag according to [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format). For example, `/path/to/vmalert -datasource.url=http://vmselect:8481/select/123/prometheus` would run alerts against `AccountID=123`. For recording rules the `-remoteWrite.url` command-line flag must contain the url for the specific tenant as well. For example, `-remoteWrite.url=http://vminsert:8480/insert/123/prometheus` would write recording rules to `AccountID=123`. * To run a separate `vmalert` instance per each tenant.
The corresponding tenant must be specified in `-datasource.url` command-line flag
according to [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format).
For example, `/path/to/vmalert -datasource.url=http://vmselect:8481/select/123/prometheus`
would run alerts against `AccountID=123`. For recording rules the `-remoteWrite.url` command-line
flag must contain the url for the specific tenant as well.
For example, `-remoteWrite.url=http://vminsert:8480/insert/123/prometheus` would write recording
rules to `AccountID=123`.
* To specify `tenant` parameter per each alerting and recording group if [enterprise version of vmalert](https://victoriametrics.com/enterprise.html) is used with `-clusterMode` command-line flag. For example: * To specify `tenant` parameter per each alerting and recording group if
[enterprise version of vmalert](https://victoriametrics.com/enterprise.html) is used
with `-clusterMode` command-line flag. For example:
```yaml ```yaml
groups: groups:
@ -208,9 +228,13 @@ groups:
# Rules for accountID=456, projectID=789 # Rules for accountID=456, projectID=789
``` ```
If `-clusterMode` is enabled, then `-datasource.url`, `-remoteRead.url` and `-remoteWrite.url` must contain only the hostname without tenant id. For example: `-datasource.url=http://vmselect:8481` . `vmselect` automatically adds the specified tenant to urls per each recording rule in this case. If `-clusterMode` is enabled, then `-datasource.url`, `-remoteRead.url` and `-remoteWrite.url` must
contain only the hostname without tenant id. For example: `-datasource.url=http://vmselect:8481`.
`vmalert` automatically adds the specified tenant to urls per each recording rule in this case.
The enterprise version of vmalert is available in `vmutils-*-enterprise.tar.gz` files at [release page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) and in `*-enterprise` tags at [Docker Hub](https://hub.docker.com/r/victoriametrics/vmalert/tags). The enterprise version of vmalert is available in `vmutils-*-enterprise.tar.gz` files
at [release page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) and in `*-enterprise`
tags at [Docker Hub](https://hub.docker.com/r/victoriametrics/vmalert/tags).
### WEB ### WEB
@ -322,6 +346,9 @@ See full description for these flags in `./vmalert --help`.
## Configuration ## Configuration
Pass `-help` to `vmalert` in order to see the full list of supported
command-line flags with their descriptions.
The shortlist of configuration flags is the following: The shortlist of configuration flags is the following:
``` ```
-datasource.appendTypePrefix -datasource.appendTypePrefix
@ -514,9 +541,6 @@ The shortlist of configuration flags is the following:
Show VictoriaMetrics version Show VictoriaMetrics version
``` ```
Pass `-help` to `vmalert` in order to see the full list of supported
command-line flags with their descriptions.
`vmalert` supports "hot" config reload via the following methods: `vmalert` supports "hot" config reload via the following methods:
* send SIGHUP signal to `vmalert` process; * send SIGHUP signal to `vmalert` process;
* send GET request to `/-/reload` endpoint; * send GET request to `/-/reload` endpoint;

View file

@ -4,9 +4,9 @@ sort: 10
## vmbackupmanager ## vmbackupmanager
VictoriaMetrics backup manager ***vmbackupmanager is a part of [enterprise package](https://victoriametrics.com/enterprise.html)***
This service automates regular backup procedures. It supports the following backup intervals: **hourly**, **daily**, **weekly** and **monthly**. Multiple backup intervals may be configured simultaneously. I.e. the backup manager creates hourly backups every hour, while it creates daily backups every day, etc. Backup manager must have read access to the storage data, so best practice is to install it on the same machine (or as a sidecar) where the storage node is installed. The VictoriaMetrics backup manager automates regular backup procedures. It supports the following backup intervals: **hourly**, **daily**, **weekly** and **monthly**. Multiple backup intervals may be configured simultaneously. I.e. the backup manager creates hourly backups every hour, while it creates daily backups every day, etc. Backup manager must have read access to the storage data, so best practice is to install it on the same machine (or as a sidecar) where the storage node is installed.
The backup service makes a backup every hour and puts it to the latest folder and then copies data to the folders which represent the backup intervals (hourly, daily, weekly and monthly) The backup service makes a backup every hour and puts it to the latest folder and then copies data to the folders which represent the backup intervals (hourly, daily, weekly and monthly)
The required flags for running the service are as follows: The required flags for running the service are as follows:
@ -53,7 +53,7 @@ There are two flags which could help with performance tuning:
* -concurrency - The number of concurrent workers. Higher concurrency may improve upload speed (default 10) * -concurrency - The number of concurrent workers. Higher concurrency may improve upload speed (default 10)
### Example of Usage ## Example of Usage
GCS and cluster version. You need to have a credentials file in JSON format with the following structure GCS and cluster version. You need to have a credentials file in JSON format with the following structure

View file

@ -4,6 +4,8 @@ sort: 9
# vmgateway # vmgateway
***vmgateway is a part of [enterprise package](https://victoriametrics.com/enterprise.html)***
<img alt="vmgateway" src="vmgateway-overview.jpeg"> <img alt="vmgateway" src="vmgateway-overview.jpeg">

2
go.mod
View file

@ -20,7 +20,7 @@ require (
github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da // indirect github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da // indirect
github.com/golang/snappy v0.0.3 github.com/golang/snappy v0.0.3
github.com/influxdata/influxdb v1.9.2 github.com/influxdata/influxdb v1.9.2
github.com/klauspost/compress v1.13.0 github.com/klauspost/compress v1.13.1
github.com/mattn/go-isatty v0.0.13 // indirect github.com/mattn/go-isatty v0.0.13 // indirect
github.com/mattn/go-runewidth v0.0.13 // indirect github.com/mattn/go-runewidth v0.0.13 // indirect
github.com/oklog/ulid v1.3.1 github.com/oklog/ulid v1.3.1

4
go.sum
View file

@ -588,8 +588,8 @@ github.com/klauspost/compress v1.9.5/go.mod h1:RyIbtBH6LamlWaDj8nUwkbUhJ87Yi3uG0
github.com/klauspost/compress v1.10.7/go.mod h1:aoV0uJVorq1K+umq18yTdKaF57EivdYsUV+/s2qKfXs= github.com/klauspost/compress v1.10.7/go.mod h1:aoV0uJVorq1K+umq18yTdKaF57EivdYsUV+/s2qKfXs=
github.com/klauspost/compress v1.11.0/go.mod h1:aoV0uJVorq1K+umq18yTdKaF57EivdYsUV+/s2qKfXs= github.com/klauspost/compress v1.11.0/go.mod h1:aoV0uJVorq1K+umq18yTdKaF57EivdYsUV+/s2qKfXs=
github.com/klauspost/compress v1.12.2/go.mod h1:8dP1Hq4DHOhN9w426knH3Rhby4rFm6D8eO+e+Dq5Gzg= github.com/klauspost/compress v1.12.2/go.mod h1:8dP1Hq4DHOhN9w426knH3Rhby4rFm6D8eO+e+Dq5Gzg=
github.com/klauspost/compress v1.13.0 h1:2T7tUoQrQT+fQWdaY5rjWztFGAFwbGD04iPJg90ZiOs= github.com/klauspost/compress v1.13.1 h1:wXr2uRxZTJXHLly6qhJabee5JqIhTRoLBhDOA74hDEQ=
github.com/klauspost/compress v1.13.0/go.mod h1:8dP1Hq4DHOhN9w426knH3Rhby4rFm6D8eO+e+Dq5Gzg= github.com/klauspost/compress v1.13.1/go.mod h1:8dP1Hq4DHOhN9w426knH3Rhby4rFm6D8eO+e+Dq5Gzg=
github.com/klauspost/cpuid v0.0.0-20170728055534-ae7887de9fa5/go.mod h1:Pj4uuM528wm8OyEC2QMXAi2YiTZ96dNQPGgoMS4s3ek= github.com/klauspost/cpuid v0.0.0-20170728055534-ae7887de9fa5/go.mod h1:Pj4uuM528wm8OyEC2QMXAi2YiTZ96dNQPGgoMS4s3ek=
github.com/klauspost/crc32 v0.0.0-20161016154125-cb6bfca970f6/go.mod h1:+ZoRqAPRLkC4NPOvfYeR5KNOrY6TD+/sAC3HXPZgDYg= github.com/klauspost/crc32 v0.0.0-20161016154125-cb6bfca970f6/go.mod h1:+ZoRqAPRLkC4NPOvfYeR5KNOrY6TD+/sAC3HXPZgDYg=
github.com/klauspost/pgzip v1.0.2-0.20170402124221-0bf5dcad4ada/go.mod h1:Ch1tH69qFZu15pkjo5kYi6mth2Zzwzt50oCQKQE9RUs= github.com/klauspost/pgzip v1.0.2-0.20170402124221-0bf5dcad4ada/go.mod h1:Ch1tH69qFZu15pkjo5kYi6mth2Zzwzt50oCQKQE9RUs=

View file

@ -177,7 +177,7 @@ func (ris *rawItemsShard) Len() int {
func (ris *rawItemsShard) addItems(tb *Table, items [][]byte) error { func (ris *rawItemsShard) addItems(tb *Table, items [][]byte) error {
var err error var err error
var blocksToMerge []*inmemoryBlock var blocksToFlush []*inmemoryBlock
ris.mu.Lock() ris.mu.Lock()
ibs := ris.ibs ibs := ris.ibs
@ -200,19 +200,16 @@ func (ris *rawItemsShard) addItems(tb *Table, items [][]byte) error {
} }
} }
if len(ibs) >= maxBlocksPerShard { if len(ibs) >= maxBlocksPerShard {
blocksToMerge = ibs blocksToFlush = append(blocksToFlush, ibs...)
ris.ibs = make([]*inmemoryBlock, 0, maxBlocksPerShard) for i := range ibs {
ibs[i] = nil
}
ris.ibs = ibs[:0]
ris.lastFlushTime = fasttime.UnixTimestamp() ris.lastFlushTime = fasttime.UnixTimestamp()
} }
ris.mu.Unlock() ris.mu.Unlock()
if blocksToMerge == nil { tb.mergeRawItemsBlocks(blocksToFlush)
// Fast path.
return err
}
// Slow path: merge blocksToMerge.
tb.mergeRawItemsBlocks(blocksToMerge)
return err return err
} }
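The refactored `addItems` above follows a common Go pattern: gather the work under the lock, nil out the drained slice entries so the blocks can be garbage-collected while the slice's capacity is reused via `ibs[:0]`, and run the expensive merge outside the lock. A minimal sketch of the pattern (the `shard` and `block` types are invented for illustration):

```go
package main

import (
	"fmt"
	"sync"
)

type block struct{ items int }

// shard accumulates blocks under a mutex.
type shard struct {
	mu  sync.Mutex
	ibs []*block
}

// appendBlocksToFlush drains the shard into dst, holding the lock only for
// the cheap bookkeeping; the caller merges the returned blocks lock-free.
func (s *shard) appendBlocksToFlush(dst []*block) []*block {
	s.mu.Lock()
	ibs := s.ibs
	dst = append(dst, ibs...)
	for i := range ibs {
		ibs[i] = nil // release references so blocks can be GC-ed after the merge
	}
	s.ibs = ibs[:0] // keep the allocated capacity for future appends
	s.mu.Unlock()
	return dst
}

func main() {
	s := &shard{ibs: []*block{{1}, {2}, {3}}}
	toFlush := s.appendBlocksToFlush(nil)
	// The expensive merge would run here, outside the lock.
	fmt.Println(len(toFlush), len(s.ibs), cap(s.ibs)) // prints: 3 0 3
}
```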
@ -586,58 +583,65 @@ func (riss *rawItemsShards) flush(tb *Table, isFinal bool) {
tb.rawItemsPendingFlushesWG.Add(1) tb.rawItemsPendingFlushesWG.Add(1)
defer tb.rawItemsPendingFlushesWG.Done() defer tb.rawItemsPendingFlushesWG.Done()
var wg sync.WaitGroup var blocksToFlush []*inmemoryBlock
wg.Add(len(riss.shards))
for i := range riss.shards { for i := range riss.shards {
go func(ris *rawItemsShard) { blocksToFlush = riss.shards[i].appendBlocksToFlush(blocksToFlush, tb, isFinal)
ris.flush(tb, isFinal)
wg.Done()
}(&riss.shards[i])
} }
wg.Wait() tb.mergeRawItemsBlocks(blocksToFlush)
} }
func (ris *rawItemsShard) flush(tb *Table, isFinal bool) { func (ris *rawItemsShard) appendBlocksToFlush(dst []*inmemoryBlock, tb *Table, isFinal bool) []*inmemoryBlock {
mustFlush := false
currentTime := fasttime.UnixTimestamp() currentTime := fasttime.UnixTimestamp()
flushSeconds := int64(rawItemsFlushInterval.Seconds()) flushSeconds := int64(rawItemsFlushInterval.Seconds())
if flushSeconds <= 0 { if flushSeconds <= 0 {
flushSeconds = 1 flushSeconds = 1
} }
var blocksToMerge []*inmemoryBlock
ris.mu.Lock() ris.mu.Lock()
if isFinal || currentTime-ris.lastFlushTime > uint64(flushSeconds) { if isFinal || currentTime-ris.lastFlushTime > uint64(flushSeconds) {
mustFlush = true ibs := ris.ibs
blocksToMerge = ris.ibs dst = append(dst, ibs...)
ris.ibs = make([]*inmemoryBlock, 0, maxBlocksPerShard) for i := range ibs {
ibs[i] = nil
}
ris.ibs = ibs[:0]
ris.lastFlushTime = currentTime ris.lastFlushTime = currentTime
} }
ris.mu.Unlock() ris.mu.Unlock()
if mustFlush { return dst
tb.mergeRawItemsBlocks(blocksToMerge)
}
} }
func (tb *Table) mergeRawItemsBlocks(blocksToMerge []*inmemoryBlock) { func (tb *Table) mergeRawItemsBlocks(ibs []*inmemoryBlock) {
if len(ibs) == 0 {
return
}
tb.partMergersWG.Add(1) tb.partMergersWG.Add(1)
defer tb.partMergersWG.Done() defer tb.partMergersWG.Done()
pws := make([]*partWrapper, 0, (len(blocksToMerge)+defaultPartsToMerge-1)/defaultPartsToMerge) pws := make([]*partWrapper, 0, (len(ibs)+defaultPartsToMerge-1)/defaultPartsToMerge)
for len(blocksToMerge) > 0 { var pwsLock sync.Mutex
var wg sync.WaitGroup
for len(ibs) > 0 {
n := defaultPartsToMerge n := defaultPartsToMerge
if n > len(blocksToMerge) { if n > len(ibs) {
n = len(blocksToMerge) n = len(ibs)
} }
pw := tb.mergeInmemoryBlocks(blocksToMerge[:n]) wg.Add(1)
blocksToMerge = blocksToMerge[n:] go func(ibsPart []*inmemoryBlock) {
if pw == nil { defer wg.Done()
continue pw := tb.mergeInmemoryBlocks(ibsPart)
} if pw == nil {
pw.isInMerge = true return
pws = append(pws, pw) }
pw.isInMerge = true
pwsLock.Lock()
pws = append(pws, pw)
pwsLock.Unlock()
}(ibs[:n])
ibs = ibs[n:]
} }
wg.Wait()
if len(pws) > 0 { if len(pws) > 0 {
if err := tb.mergeParts(pws, nil, true); err != nil { if err := tb.mergeParts(pws, nil, true); err != nil {
logger.Panicf("FATAL: cannot merge raw parts: %s", err) logger.Panicf("FATAL: cannot merge raw parts: %s", err)
@ -672,10 +676,10 @@ func (tb *Table) mergeRawItemsBlocks(blocksToMerge []*inmemoryBlock) {
} }
} }
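The rewritten `mergeRawItemsBlocks` above fans chunks of blocks out to goroutines and collects the resulting parts into a shared slice guarded by a mutex. The same pattern in miniature, with `mergeChunk` standing in for `tb.mergeInmemoryBlocks` (names invented for illustration):

```go
package main

import (
	"fmt"
	"sync"
)

// mergeChunks splits items into chunks of up to n, processes each chunk on
// its own goroutine, and gathers the per-chunk results under a mutex.
func mergeChunks(items []int, n int, mergeChunk func([]int) int) []int {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		results []int
	)
	for len(items) > 0 {
		m := n
		if m > len(items) {
			m = len(items)
		}
		wg.Add(1)
		go func(chunk []int) {
			defer wg.Done()
			r := mergeChunk(chunk)
			mu.Lock()
			results = append(results, r) // shared slice needs the mutex
			mu.Unlock()
		}(items[:m])
		items = items[m:]
	}
	wg.Wait()
	return results
}

func main() {
	sum := func(chunk []int) int {
		s := 0
		for _, v := range chunk {
			s += v
		}
		return s
	}
	res := mergeChunks([]int{1, 2, 3, 4, 5}, 2, sum)
	total := 0
	for _, v := range res {
		total += v
	}
	fmt.Println(len(res), total) // prints: 3 15 (result order is nondeterministic)
}
```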
func (tb *Table) mergeInmemoryBlocks(blocksToMerge []*inmemoryBlock) *partWrapper { func (tb *Table) mergeInmemoryBlocks(ibs []*inmemoryBlock) *partWrapper {
// Convert blocksToMerge into inmemoryPart's // Convert ibs into inmemoryPart's
mps := make([]*inmemoryPart, 0, len(blocksToMerge)) mps := make([]*inmemoryPart, 0, len(ibs))
for _, ib := range blocksToMerge { for _, ib := range ibs {
if len(ib.items) == 0 { if len(ib.items) == 0 {
continue continue
} }

View file

@ -128,10 +128,10 @@ func newClient(sw *ScrapeWork) *client {
ResponseHeaderTimeout: sw.ScrapeTimeout, ResponseHeaderTimeout: sw.ScrapeTimeout,
}, },
// Set 10x bigger timeout than the sw.ScrapeTimeout, since the duration for reading the full response // Set 30x bigger timeout than the sw.ScrapeTimeout, since the duration for reading the full response
// can be much bigger because of stream parsing. // can be much bigger because of stream parsing.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1017#issuecomment-767235047 // See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1017#issuecomment-767235047
Timeout: 10 * sw.ScrapeTimeout, Timeout: 30 * sw.ScrapeTimeout,
} }
if sw.DenyRedirects { if sw.DenyRedirects {
sc.CheckRedirect = func(req *http.Request, via []*http.Request) error { sc.CheckRedirect = func(req *http.Request, via []*http.Request) error {

View file

@ -19,6 +19,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal" "github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel" "github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape/discovery/consul" "github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape/discovery/consul"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape/discovery/digitalocean"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape/discovery/dns" "github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape/discovery/dns"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape/discovery/dockerswarm" "github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape/discovery/dockerswarm"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape/discovery/ec2" "github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape/discovery/ec2"
@ -66,6 +67,8 @@ func (cfg *Config) mustStart() {
for i := range cfg.ScrapeConfigs { for i := range cfg.ScrapeConfigs {
cfg.ScrapeConfigs[i].mustStart(cfg.baseDir) cfg.ScrapeConfigs[i].mustStart(cfg.baseDir)
} }
jobNames := cfg.getJobNames()
tsmGlobal.registerJobNames(jobNames)
logger.Infof("started service discovery routines in %.3f seconds", time.Since(startTime).Seconds()) logger.Infof("started service discovery routines in %.3f seconds", time.Since(startTime).Seconds())
} }
@ -78,6 +81,15 @@ func (cfg *Config) mustStop() {
logger.Infof("stopped service discovery routines in %.3f seconds", time.Since(startTime).Seconds()) logger.Infof("stopped service discovery routines in %.3f seconds", time.Since(startTime).Seconds())
} }
// getJobNames returns all the scrape job names from the cfg.
func (cfg *Config) getJobNames() []string {
a := make([]string, 0, len(cfg.ScrapeConfigs))
for i := range cfg.ScrapeConfigs {
a = append(a, cfg.ScrapeConfigs[i].JobName)
}
return a
}
// GlobalConfig represents essential parts for `global` section of Prometheus config. // GlobalConfig represents essential parts for `global` section of Prometheus config.
// //
// See https://prometheus.io/docs/prometheus/latest/configuration/configuration/ // See https://prometheus.io/docs/prometheus/latest/configuration/configuration/
@ -106,16 +118,17 @@ type ScrapeConfig struct {
MetricRelabelConfigs []promrelabel.RelabelConfig `yaml:"metric_relabel_configs,omitempty"` MetricRelabelConfigs []promrelabel.RelabelConfig `yaml:"metric_relabel_configs,omitempty"`
SampleLimit int `yaml:"sample_limit,omitempty"` SampleLimit int `yaml:"sample_limit,omitempty"`
StaticConfigs []StaticConfig `yaml:"static_configs,omitempty"` StaticConfigs []StaticConfig `yaml:"static_configs,omitempty"`
FileSDConfigs []FileSDConfig `yaml:"file_sd_configs,omitempty"` FileSDConfigs []FileSDConfig `yaml:"file_sd_configs,omitempty"`
KubernetesSDConfigs []kubernetes.SDConfig `yaml:"kubernetes_sd_configs,omitempty"` KubernetesSDConfigs []kubernetes.SDConfig `yaml:"kubernetes_sd_configs,omitempty"`
OpenStackSDConfigs []openstack.SDConfig `yaml:"openstack_sd_configs,omitempty"` OpenStackSDConfigs []openstack.SDConfig `yaml:"openstack_sd_configs,omitempty"`
ConsulSDConfigs []consul.SDConfig `yaml:"consul_sd_configs,omitempty"` ConsulSDConfigs []consul.SDConfig `yaml:"consul_sd_configs,omitempty"`
EurekaSDConfigs []eureka.SDConfig `yaml:"eureka_sd_configs,omitempty"` EurekaSDConfigs []eureka.SDConfig `yaml:"eureka_sd_configs,omitempty"`
DockerSwarmSDConfigs []dockerswarm.SDConfig `yaml:"dockerswarm_sd_configs,omitempty"` DockerSwarmSDConfigs []dockerswarm.SDConfig `yaml:"dockerswarm_sd_configs,omitempty"`
DNSSDConfigs []dns.SDConfig `yaml:"dns_sd_configs,omitempty"` DNSSDConfigs []dns.SDConfig `yaml:"dns_sd_configs,omitempty"`
EC2SDConfigs []ec2.SDConfig `yaml:"ec2_sd_configs,omitempty"` EC2SDConfigs []ec2.SDConfig `yaml:"ec2_sd_configs,omitempty"`
GCESDConfigs []gce.SDConfig `yaml:"gce_sd_configs,omitempty"` GCESDConfigs []gce.SDConfig `yaml:"gce_sd_configs,omitempty"`
DigitaloceanSDConfigs []digitalocean.SDConfig `yaml:"digitalocean_sd_configs,omitempty"`
// These options are supported only by lib/promscrape. // These options are supported only by lib/promscrape.
RelabelDebug bool `yaml:"relabel_debug,omitempty"` RelabelDebug bool `yaml:"relabel_debug,omitempty"`
@ -488,6 +501,34 @@ func (cfg *Config) getGCESDScrapeWork(prev []*ScrapeWork) []*ScrapeWork {
return dst return dst
} }
// getDigitalOceanDScrapeWork returns `digitalocean_sd_configs` ScrapeWork from cfg.
func (cfg *Config) getDigitalOceanDScrapeWork(prev []*ScrapeWork) []*ScrapeWork {
swsPrevByJob := getSWSByJob(prev)
dst := make([]*ScrapeWork, 0, len(prev))
for i := range cfg.ScrapeConfigs {
sc := &cfg.ScrapeConfigs[i]
dstLen := len(dst)
ok := true
for j := range sc.DigitaloceanSDConfigs {
sdc := &sc.DigitaloceanSDConfigs[j]
var okLocal bool
dst, okLocal = appendSDScrapeWork(dst, sdc, cfg.baseDir, sc.swc, "digitalocean_sd_config")
if ok {
ok = okLocal
}
}
if ok {
continue
}
swsPrev := swsPrevByJob[sc.swc.jobName]
if len(swsPrev) > 0 {
logger.Errorf("there were errors when discovering digitalocean targets for job %q, so preserving the previous targets", sc.swc.jobName)
dst = append(dst[:dstLen], swsPrev...)
}
}
return dst
}
// getFileSDScrapeWork returns `file_sd_configs` ScrapeWork from cfg. // getFileSDScrapeWork returns `file_sd_configs` ScrapeWork from cfg.
func (cfg *Config) getFileSDScrapeWork(prev []*ScrapeWork) []*ScrapeWork { func (cfg *Config) getFileSDScrapeWork(prev []*ScrapeWork) []*ScrapeWork {
// Create a map for the previous scrape work. // Create a map for the previous scrape work.

View file

@ -0,0 +1,92 @@
package digitalocean
import (
"encoding/json"
"fmt"
"strings"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape/discoveryutils"
)
var configMap = discoveryutils.NewConfigMap()
type apiConfig struct {
client *discoveryutils.Client
port int
}
func newAPIConfig(sdc *SDConfig, baseDir string) (*apiConfig, error) {
ac, err := sdc.HTTPClientConfig.NewConfig(baseDir)
if err != nil {
return nil, fmt.Errorf("cannot parse auth config: %w", err)
}
apiServer := sdc.Server
if apiServer == "" {
apiServer = "https://api.digitalocean.com"
}
if !strings.Contains(apiServer, "://") {
scheme := "http"
if sdc.HTTPClientConfig.TLSConfig != nil {
scheme = "https"
}
apiServer = scheme + "://" + apiServer
}
proxyAC, err := sdc.ProxyClientConfig.NewConfig(baseDir)
if err != nil {
return nil, fmt.Errorf("cannot parse proxy auth config: %w", err)
}
client, err := discoveryutils.NewClient(apiServer, ac, sdc.ProxyURL, proxyAC)
if err != nil {
return nil, fmt.Errorf("cannot create HTTP client for %q: %w", apiServer, err)
}
cfg := &apiConfig{
client: client,
port: sdc.Port,
}
if cfg.port == 0 {
cfg.port = 80
}
return cfg, nil
}
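When the configured server lacks a scheme, one is inferred: `https` if a TLS config is present, `http` otherwise. The same check in isolation (a sketch; `hasTLS` stands in for the `TLSConfig != nil` test above):

```go
package main

import (
	"fmt"
	"strings"
)

// withScheme prepends a scheme to addr when none is present, choosing
// https if TLS is configured and http otherwise.
func withScheme(addr string, hasTLS bool) string {
	if strings.Contains(addr, "://") {
		return addr // already has an explicit scheme
	}
	scheme := "http"
	if hasTLS {
		scheme = "https"
	}
	return scheme + "://" + addr
}

func main() {
	fmt.Println(withScheme("api.digitalocean.com", true))
	fmt.Println(withScheme("http://localhost:8080", false))
}
```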
func getAPIConfig(sdc *SDConfig, baseDir string) (*apiConfig, error) {
v, err := configMap.Get(sdc, func() (interface{}, error) { return newAPIConfig(sdc, baseDir) })
if err != nil {
return nil, err
}
return v.(*apiConfig), nil
}
const dropletsAPIPath = "/v2/droplets"
func getDroplets(getAPIResponse func(string) ([]byte, error)) ([]droplet, error) {
var droplets []droplet
nextAPIURL := dropletsAPIPath
for nextAPIURL != "" {
data, err := getAPIResponse(nextAPIURL)
if err != nil {
return nil, fmt.Errorf("cannot fetch data from digitalocean list api: %w", err)
}
apiResp, err := parseAPIResponse(data)
if err != nil {
return nil, err
}
droplets = append(droplets, apiResp.Droplets...)
nextAPIURL, err = apiResp.nextURLPath()
if err != nil {
return nil, err
}
}
return droplets, nil
}
func parseAPIResponse(data []byte) (*listDropletResponse, error) {
var dps listDropletResponse
if err := json.Unmarshal(data, &dps); err != nil {
return nil, fmt.Errorf("cannot parse digitalocean api response: %q, err: %w", data, err)
}
return &dps, nil
}

View file

@ -0,0 +1,349 @@
package digitalocean
import (
"reflect"
"testing"
)
func Test_parseAPIResponse(t *testing.T) {
type args struct {
data []byte
}
tests := []struct {
name string
args args
want *listDropletResponse
wantErr bool
}{
{
name: "simple parse",
args: args{data: []byte(`{
"droplets": [
{
"id": 3164444,
"name": "example.com",
"memory": 1024,
"vcpus": 1,
"status": "active",
"kernel": {
"id": 2233,
"name": "Ubuntu 14.04 x64 vmlinuz-3.13.0-37-generic",
"version": "3.13.0-37-generic"
},
"features": [
"backups",
"ipv6",
"virtio"
],
"snapshot_ids": [],
"image": {
"id": 6918990,
"name": "14.04 x64",
"distribution": "Ubuntu",
"slug": "ubuntu-16-04-x64",
"public": true,
"regions": [
"nyc1"
]
},
"size_slug": "s-1vcpu-1gb",
"networks": {
"v4": [
{
"ip_address": "104.236.32.182",
"netmask": "255.255.192.0",
"gateway": "104.236.0.1",
"type": "public"
}
],
"v6": [
{
"ip_address": "2604:A880:0800:0010:0000:0000:02DD:4001",
"netmask": 64,
"gateway": "2604:A880:0800:0010:0000:0000:0000:0001",
"type": "public"
}
]
},
"region": {
"name": "New York 3",
"slug": "nyc3",
"features": [
"private_networking",
"backups",
"ipv6"
]
},
"tags": [
"tag1",
"tag2"
],
"vpc_uuid": "f9b0769c-e118-42fb-a0c4-fed15ef69662"
}
],
"links": {
"pages": {
"last": "https://api.digitalocean.com/v2/droplets?page=3&per_page=1",
"next": "https://api.digitalocean.com/v2/droplets?page=2&per_page=1"
}
}
}`)},
want: &listDropletResponse{
Droplets: []droplet{
{
Image: struct {
Name string `json:"name"`
Slug string `json:"slug"`
}(struct {
Name string
Slug string
}{Name: "14.04 x64", Slug: "ubuntu-16-04-x64"}),
Region: struct {
Slug string `json:"slug"`
}(struct{ Slug string }{Slug: "nyc3"}),
Networks: networks{
V6: []network{
{
IPAddress: "2604:A880:0800:0010:0000:0000:02DD:4001",
Type: "public",
},
},
V4: []network{
{
IPAddress: "104.236.32.182",
Type: "public",
},
},
},
SizeSlug: "s-1vcpu-1gb",
Features: []string{"backups", "ipv6", "virtio"},
Tags: []string{"tag1", "tag2"},
Status: "active",
Name: "example.com",
ID: 3164444,
VpcUUID: "f9b0769c-e118-42fb-a0c4-fed15ef69662",
},
},
Links: links{
Pages: struct {
Last string `json:"last,omitempty"`
Next string `json:"next,omitempty"`
}(struct {
Last string
Next string
}{Last: "https://api.digitalocean.com/v2/droplets?page=3&per_page=1", Next: "https://api.digitalocean.com/v2/droplets?page=2&per_page=1"}),
},
},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got, err := parseAPIResponse(tt.args.data)
if (err != nil) != tt.wantErr {
t.Errorf("parseAPIResponse() error = %v, wantErr %v", err, tt.wantErr)
return
}
if !reflect.DeepEqual(got, tt.want) {
t.Errorf("parseAPIResponse() got = \n%v\n, \nwant \n%v\n", got, tt.want)
}
})
}
}
func Test_getDroplets(t *testing.T) {
type args struct {
getAPIResponse func(string) ([]byte, error)
}
tests := []struct {
name string
args args
wantDropletCount int
wantErr bool
}{
{
name: "get 5 droplets",
args: args{
func(s string) ([]byte, error) {
var resp []byte
switch s {
case dropletsAPIPath:
// first page: links to a next page
resp = []byte(`{ "droplets": [
{
"id": 3164444,
"name": "example.com",
"status": "active",
"image": {
"id": 6918990,
"name": "14.04 x64",
"distribution": "Ubuntu",
"slug": "ubuntu-16-04-x64",
"public": true,
"regions": [
"nyc1"
]
},
"size_slug": "s-1vcpu-1gb",
"networks": {
"v4": [
{
"ip_address": "104.236.32.182",
"netmask": "255.255.192.0",
"gateway": "104.236.0.1",
"type": "public"
}
]
},
"region": {
"name": "New York 3",
"slug": "nyc3"
},
"tags": [
"tag1",
"tag2"
],
"vpc_uuid": "f9b0769c-e118-42fb-a0c4-fed15ef69662"
},
{
"id": 3164444,
"name": "example.com",
"status": "active",
"image": {
"id": 6918990,
"name": "14.04 x64",
"distribution": "Ubuntu",
"slug": "ubuntu-16-04-x64"
},
"size_slug": "s-1vcpu-1gb",
"networks": {
"v4": [
{
"ip_address": "104.236.32.183",
"netmask": "255.255.192.0",
"gateway": "104.236.0.1",
"type": "public"
}
]
},
"region": {
"name": "New York 3",
"slug": "nyc3"
},
"vpc_uuid": "f9b0769c-e118-42fb-a0c4-fed15ef69662"
},
{
"id": 3164444,
"name": "example.com",
"status": "active",
"image": {
"id": 6918990,
"name": "14.04 x64",
"distribution": "Ubuntu",
"slug": "ubuntu-16-04-x64"
},
"size_slug": "s-1vcpu-1gb",
"networks": {
"v4": [
{
"ip_address": "104.236.32.183",
"netmask": "255.255.192.0",
"gateway": "104.236.0.1",
"type": "public"
}
]
},
"region": {
"name": "New York 3",
"slug": "nyc3"
},
"vpc_uuid": "f9b0769c-e118-42fb-a0c4-fed15ef69662"
}
],
"links": {
"pages": {
"last": "https://api.digitalocean.com/v2/droplets?page=3&per_page=1",
"next": "https://api.digitalocean.com/v2/droplets?page=2&per_page=1"
}
}
}`)
default:
// last page: no next link
resp = []byte(`{ "droplets": [
{
"id": 3164444,
"name": "example.com",
"status": "active",
"image": {
"id": 6918990,
"name": "14.04 x64",
"distribution": "Ubuntu",
"slug": "ubuntu-16-04-x64"
},
"size_slug": "s-1vcpu-1gb",
"networks": {
"v4": [
{
"ip_address": "104.236.32.183",
"netmask": "255.255.192.0",
"gateway": "104.236.0.1",
"type": "public"
}
]
},
"region": {
"name": "New York 3",
"slug": "nyc3"
},
"vpc_uuid": "f9b0769c-e118-42fb-a0c4-fed15ef69662"
},
{
"id": 3164444,
"name": "example.com",
"status": "active",
"image": {
"id": 6918990,
"name": "14.04 x64",
"distribution": "Ubuntu",
"slug": "ubuntu-16-04-x64"
},
"size_slug": "s-1vcpu-1gb",
"networks": {
"v4": [
{
"ip_address": "104.236.32.183",
"netmask": "255.255.192.0",
"gateway": "104.236.0.1",
"type": "public"
}
]
},
"region": {
"name": "New York 3",
"slug": "nyc3"
},
"vpc_uuid": "f9b0769c-e118-42fb-a0c4-fed15ef69662"
}
]
}`)
}
return resp, nil
},
},
wantDropletCount: 5,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got, err := getDroplets(tt.args.getAPIResponse)
if (err != nil) != tt.wantErr {
t.Errorf("getDroplets() error = %v, wantErr %v", err, tt.wantErr)
return
}
if len(got) != tt.wantDropletCount {
t.Fatalf("unexpected droplets count: %d, want: %d, \n droplets: %v\n", len(got), tt.wantDropletCount, got)
}
})
}
}

View file

@ -0,0 +1,148 @@
package digitalocean
import (
"fmt"
"net/url"
"strings"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape/discoveryutils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/proxy"
)
// SDConfig represents service discovery config for DigitalOcean.
//
// See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#digitalocean_sd_config
type SDConfig struct {
Server string `yaml:"server,omitempty"`
HTTPClientConfig promauth.HTTPClientConfig `yaml:",inline"`
ProxyURL proxy.URL `yaml:"proxy_url,omitempty"`
ProxyClientConfig promauth.ProxyClientConfig `yaml:",inline"`
Port int `yaml:"port,omitempty"`
}
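For reference, a minimal scrape config exercising this SDConfig might look like the following. This is a sketch: the token value is a placeholder, and `server`/`port` are optional (shown here with the defaults applied by `newAPIConfig` above).

```yaml
scrape_configs:
  - job_name: digitalocean
    digitalocean_sd_configs:
      - server: "https://api.digitalocean.com"   # default
        port: 80                                 # default
        bearer_token: "<your-digitalocean-api-token>"
```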
// GetLabels returns Digital Ocean droplet labels according to sdc.
func (sdc *SDConfig) GetLabels(baseDir string) ([]map[string]string, error) {
cfg, err := getAPIConfig(sdc, baseDir)
if err != nil {
return nil, fmt.Errorf("cannot get API config: %w", err)
}
droplets, err := getDroplets(cfg.client.GetAPIResponse)
if err != nil {
return nil, err
}
return addDropletLabels(droplets, cfg.port), nil
}
// https://developers.digitalocean.com/documentation/v2/#retrieve-an-existing-droplet-by-id
type droplet struct {
ID int `json:"id"`
Name string `json:"name"`
Status string `json:"status"`
Features []string `json:"features"`
Image struct {
Name string `json:"name"`
Slug string `json:"slug"`
} `json:"image"`
SizeSlug string `json:"size_slug"`
Networks networks `json:"networks"`
Region struct {
Slug string `json:"slug"`
} `json:"region"`
Tags []string `json:"tags"`
VpcUUID string `json:"vpc_uuid"`
}
func (d *droplet) getIPByNet(netVersion, netType string) string {
var dropletNetworks []network
switch netVersion {
case "v4":
dropletNetworks = d.Networks.V4
case "v6":
dropletNetworks = d.Networks.V6
default:
logger.Fatalf("BUG: unexpected network version %q; want v4 or v6", netVersion)
}
for _, net := range dropletNetworks {
if net.Type == netType {
return net.IPAddress
}
}
return ""
}
type networks struct {
V4 []network `json:"v4"`
V6 []network `json:"v6"`
}
type network struct {
IPAddress string `json:"ip_address"`
// private | public.
Type string `json:"type"`
}
// https://developers.digitalocean.com/documentation/v2/#list-all-droplets
type listDropletResponse struct {
Droplets []droplet `json:"droplets,omitempty"`
Links links `json:"links,omitempty"`
}
type links struct {
Pages struct {
Last string `json:"last,omitempty"`
Next string `json:"next,omitempty"`
} `json:"pages,omitempty"`
}
func (r *listDropletResponse) nextURLPath() (string, error) {
if r.Links.Pages.Next == "" {
return "", nil
}
u, err := url.Parse(r.Links.Pages.Next)
if err != nil {
return "", fmt.Errorf("cannot parse digitalocean next url %q: %w", r.Links.Pages.Next, err)
}
return u.RequestURI(), nil
}
func addDropletLabels(droplets []droplet, defaultPort int) []map[string]string {
var ms []map[string]string
for _, droplet := range droplets {
if len(droplet.Networks.V4) == 0 {
continue
}
privateIPv4 := droplet.getIPByNet("v4", "private")
publicIPv4 := droplet.getIPByNet("v4", "public")
publicIPv6 := droplet.getIPByNet("v6", "public")
addr := discoveryutils.JoinHostPort(publicIPv4, defaultPort)
m := map[string]string{
"__address__": addr,
"__meta_digitalocean_droplet_id": fmt.Sprintf("%d", droplet.ID),
"__meta_digitalocean_droplet_name": droplet.Name,
"__meta_digitalocean_image": droplet.Image.Slug,
"__meta_digitalocean_image_name": droplet.Image.Name,
"__meta_digitalocean_private_ipv4": privateIPv4,
"__meta_digitalocean_public_ipv4": publicIPv4,
"__meta_digitalocean_public_ipv6": publicIPv6,
"__meta_digitalocean_region": droplet.Region.Slug,
"__meta_digitalocean_size": droplet.SizeSlug,
"__meta_digitalocean_status": droplet.Status,
"__meta_digitalocean_vpc": droplet.VpcUUID,
}
if len(droplet.Features) > 0 {
features := fmt.Sprintf(",%s,", strings.Join(droplet.Features, ","))
m["__meta_digitalocean_features"] = features
}
if len(droplet.Tags) > 0 {
tags := fmt.Sprintf(",%s,", strings.Join(droplet.Tags, ","))
m["__meta_digitalocean_tags"] = tags
}
ms = append(ms, m)
}
return ms
}

View file

@ -0,0 +1,98 @@
package digitalocean
import (
"reflect"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape/discoveryutils"
)
func Test_addDropletLabels(t *testing.T) {
type args struct {
droplets []droplet
defaultPort int
}
tests := []struct {
name string
args args
want [][]prompbmarshal.Label
}{
{
name: "base labels add test",
args: args{
droplets: []droplet{
{
ID: 15,
Tags: []string{"private", "test"},
Status: "active",
Name: "ubuntu-1",
Region: struct {
Slug string `json:"slug"`
}(struct{ Slug string }{Slug: "do"}),
Features: []string{"feature-1", "feature-2"},
SizeSlug: "base-1",
VpcUUID: "vpc-1",
Image: struct {
Name string `json:"name"`
Slug string `json:"slug"`
}(struct {
Name string
Slug string
}{Name: "ubuntu", Slug: "18"}),
Networks: networks{
V4: []network{
{
Type: "public",
IPAddress: "100.100.100.100",
},
{
Type: "private",
IPAddress: "10.10.10.10",
},
},
V6: []network{
{
Type: "public",
IPAddress: "::1",
},
},
},
},
},
defaultPort: 9100,
},
want: [][]prompbmarshal.Label{
discoveryutils.GetSortedLabels(map[string]string{
"__address__": "100.100.100.100:9100",
"__meta_digitalocean_droplet_id": "15",
"__meta_digitalocean_droplet_name": "ubuntu-1",
"__meta_digitalocean_features": ",feature-1,feature-2,",
"__meta_digitalocean_image": "18",
"__meta_digitalocean_image_name": "ubuntu",
"__meta_digitalocean_private_ipv4": "10.10.10.10",
"__meta_digitalocean_public_ipv4": "100.100.100.100",
"__meta_digitalocean_public_ipv6": "::1",
"__meta_digitalocean_region": "do",
"__meta_digitalocean_size": "base-1",
"__meta_digitalocean_status": "active",
"__meta_digitalocean_tags": ",private,test,",
"__meta_digitalocean_vpc": "vpc-1",
}),
},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := addDropletLabels(tt.args.droplets, tt.args.defaultPort)
var sortedLabelss [][]prompbmarshal.Label
for _, labels := range got {
sortedLabelss = append(sortedLabelss, discoveryutils.GetSortedLabels(labels))
}
if !reflect.DeepEqual(sortedLabelss, tt.want) {
t.Errorf("addDropletLabels() \ngot \n%v\n, \nwant \n%v\n", sortedLabelss, tt.want)
}
})
}
}

View file

@ -41,6 +41,9 @@ var (
dockerswarmSDCheckInterval = flag.Duration("promscrape.dockerswarmSDCheckInterval", 30*time.Second, "Interval for checking for changes in dockerswarm. "+ dockerswarmSDCheckInterval = flag.Duration("promscrape.dockerswarmSDCheckInterval", 30*time.Second, "Interval for checking for changes in dockerswarm. "+
"This works only if dockerswarm_sd_configs is configured in '-promscrape.config' file. "+ "This works only if dockerswarm_sd_configs is configured in '-promscrape.config' file. "+
"See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dockerswarm_sd_config for details") "See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dockerswarm_sd_config for details")
digitaloceanSDCheckInterval = flag.Duration("promscrape.digitaloceanSDCheckInterval", time.Minute, "Interval for checking for changes in digital ocean. "+
"This works only if digitalocean_sd_configs is configured in '-promscrape.config' file. "+
"See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#digitalocean_sd_config for details")
promscrapeConfigFile = flag.String("promscrape.config", "", "Optional path to Prometheus config file with 'scrape_configs' section containing targets to scrape. "+ promscrapeConfigFile = flag.String("promscrape.config", "", "Optional path to Prometheus config file with 'scrape_configs' section containing targets to scrape. "+
"See https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter for details") "See https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter for details")
suppressDuplicateScrapeTargetErrors = flag.Bool("promscrape.suppressDuplicateScrapeTargetErrors", false, "Whether to suppress 'duplicate scrape target' errors; "+ suppressDuplicateScrapeTargetErrors = flag.Bool("promscrape.suppressDuplicateScrapeTargetErrors", false, "Whether to suppress 'duplicate scrape target' errors; "+
@ -111,6 +114,7 @@ func runScraper(configFile string, pushData func(wr *prompbmarshal.WriteRequest)
scs.add("ec2_sd_configs", *ec2SDCheckInterval, func(cfg *Config, swsPrev []*ScrapeWork) []*ScrapeWork { return cfg.getEC2SDScrapeWork(swsPrev) }) scs.add("ec2_sd_configs", *ec2SDCheckInterval, func(cfg *Config, swsPrev []*ScrapeWork) []*ScrapeWork { return cfg.getEC2SDScrapeWork(swsPrev) })
scs.add("gce_sd_configs", *gceSDCheckInterval, func(cfg *Config, swsPrev []*ScrapeWork) []*ScrapeWork { return cfg.getGCESDScrapeWork(swsPrev) }) scs.add("gce_sd_configs", *gceSDCheckInterval, func(cfg *Config, swsPrev []*ScrapeWork) []*ScrapeWork { return cfg.getGCESDScrapeWork(swsPrev) })
scs.add("dockerswarm_sd_configs", *dockerswarmSDCheckInterval, func(cfg *Config, swsPrev []*ScrapeWork) []*ScrapeWork { return cfg.getDockerSwarmSDScrapeWork(swsPrev) }) scs.add("dockerswarm_sd_configs", *dockerswarmSDCheckInterval, func(cfg *Config, swsPrev []*ScrapeWork) []*ScrapeWork { return cfg.getDockerSwarmSDScrapeWork(swsPrev) })
scs.add("digitalocean_sd_configs", *digitaloceanSDCheckInterval, func(cfg *Config, swsPrev []*ScrapeWork) []*ScrapeWork { return cfg.getDigitalOceanDScrapeWork(swsPrev) })
var tickerCh <-chan time.Time var tickerCh <-chan time.Time
if *configCheckInterval > 0 { if *configCheckInterval > 0 {

View file

@ -324,7 +324,7 @@ func (sw *scrapeWork) scrapeInternal(scrapeTimestamp, realTimestamp int64) error
// body must be released only after wc is released, since wc refers to body. // body must be released only after wc is released, since wc refers to body.
sw.prevBodyLen = len(body.B) sw.prevBodyLen = len(body.B)
leveledbytebufferpool.Put(body) leveledbytebufferpool.Put(body)
tsmGlobal.Update(sw.Config, sw.ScrapeGroup, up == 1, realTimestamp, int64(duration*1000), err) tsmGlobal.Update(sw.Config, sw.ScrapeGroup, up == 1, realTimestamp, int64(duration*1000), samplesScraped, err)
return err return err
} }
@ -391,7 +391,7 @@ func (sw *scrapeWork) scrapeStream(scrapeTimestamp, realTimestamp int64) error {
sw.prevLabelsLen = len(wc.labels) sw.prevLabelsLen = len(wc.labels)
wc.reset() wc.reset()
writeRequestCtxPool.Put(wc) writeRequestCtxPool.Put(wc)
tsmGlobal.Update(sw.Config, sw.ScrapeGroup, up == 1, realTimestamp, int64(duration*1000), err) tsmGlobal.Update(sw.Config, sw.ScrapeGroup, up == 1, realTimestamp, int64(duration*1000), samplesScraped, err)
return err return err
} }

View file

@ -1,9 +1,9 @@
{% import "github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal" {% import "github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
%} %}
{% collapsespace %} {% stripspace %}
{% func TargetsResponsePlain (jts []jobTargetsStatuses, showOriginLabels bool) -%} {% func TargetsResponsePlain(jts []jobTargetsStatuses, emptyJobs []string, showOriginLabels bool) %}
{% for _, js := range jts %} {% for _, js := range jts %}
job={%q= js.job %} ({%d js.upCount %}/{%d js.targetsTotal %} up) job={%q= js.job %} ({%d js.upCount %}/{%d js.targetsTotal %} up)
@ -13,21 +13,26 @@ job={%q= js.job %} ({%d js.upCount %}/{%d js.targetsTotal %} up)
labels := promLabelsString(ts.labels) labels := promLabelsString(ts.labels)
ol := promLabelsString(ts.originalLabels) ol := promLabelsString(ts.originalLabels)
%} %}
{%s= "\t" %}state={% if ts.up %}up{% else %}down{% endif %}, {%s= "\t" %}state={% if ts.up %}up{% else %}down{% endif %},{% space %}
endpoint={%s= ts.endpoint %}, endpoint={%s= ts.endpoint %},{% space %}
labels={%s= labels %} labels={%s= labels %}
{% if showOriginLabels %}, originalLabels={%s= ol %}{% endif %}, {% if showOriginLabels %}, originalLabels={%s= ol %}{% endif %},{% space %}
last_scrape={%f.3 ts.lastScrapeTime.Seconds() %}s ago, last_scrape={%f.3 ts.lastScrapeTime.Seconds() %}s ago,{% space %}
scrape_duration={%f.3 ts.scrapeDuration.Seconds() %}s, scrape_duration={%f.3 ts.scrapeDuration.Seconds() %}s,{% space %}
error={%q= ts.error %} samples_scraped={%d ts.samplesScraped %},{% space %}
error={%q= ts.errMsg %}
{% newline %} {% newline %}
{% endfor %} {% endfor %}
{% endfor %} {% endfor %}
{% for _, jobName := range emptyJobs %}
job={%q= jobName %} (0/0 up)
{% newline %} {% newline %}
{% endfor %}
{% endfunc %} {% endfunc %}
{% func TargetsResponseHTML(jts []jobTargetsStatuses, redirectPath string, onlyUnhealthy bool) %} {% func TargetsResponseHTML(jts []jobTargetsStatuses, emptyJobs []string, redirectPath string, onlyUnhealthy bool) %}
<!DOCTYPE html> <!DOCTYPE html>
<html lang="en"> <html lang="en">
<head> <head>
@ -48,7 +53,7 @@ job={%q= js.job %} ({%d js.upCount %}/{%d js.targetsTotal %} up)
Unhealthy Unhealthy
</button> </button>
</div> </div>
{% for _,js :=range jts %} {% for _, js := range jts %}
{% if onlyUnhealthy && js.upCount == js.targetsTotal %}{% continue %}{% endif %} {% if onlyUnhealthy && js.upCount == js.targetsTotal %}{% continue %}{% endif %}
<div> <div>
<h4> <h4>
@ -62,6 +67,7 @@ job={%q= js.job %} ({%d js.upCount %}/{%d js.targetsTotal %} up)
<th scope="col">Labels</th> <th scope="col">Labels</th>
<th scope="col">Last Scrape</th> <th scope="col">Last Scrape</th>
<th scope="col">Scrape Duration</th> <th scope="col">Scrape Duration</th>
<th scope="col">Samples Scraped</th>
<th scope="col">Error</th> <th scope="col">Error</th>
</tr> </tr>
</thead> </thead>
@ -76,13 +82,35 @@ job={%q= js.job %} ({%d js.upCount %}/{%d js.targetsTotal %} up)
</td> </td>
<td>{%f.3 ts.lastScrapeTime.Seconds() %}s ago</td> <td>{%f.3 ts.lastScrapeTime.Seconds() %}s ago</td>
<td>{%f.3 ts.scrapeDuration.Seconds() %}s</td> <td>{%f.3 ts.scrapeDuration.Seconds() %}s</td>
<td>{%s ts.error %}</td> <td>{%d ts.samplesScraped %}</td>
<td>{%s ts.errMsg %}</td>
</tr> </tr>
{% endfor %} {% endfor %}
</tbody> </tbody>
</table> </table>
</div> </div>
{% endfor %} {% endfor %}
{% for _, jobName := range emptyJobs %}
<div>
<h4>
<a>{%s jobName %} (0/0 up)</a>
</h4>
<table class="table table-striped table-hover table-bordered table-sm">
<thead>
<tr>
<th scope="col">Endpoint</th>
<th scope="col">State</th>
<th scope="col">Labels</th>
<th scope="col">Last Scrape</th>
<th scope="col">Scrape Duration</th>
<th scope="col">Samples Scraped</th>
<th scope="col">Error</th>
</tr>
</thead>
</table>
</div>
{% endfor %}
</body> </body>
</html> </html>
{% endfunc %} {% endfunc %}
@ -93,4 +121,4 @@ job={%q= js.job %} ({%d js.upCount %}/{%d js.targetsTotal %} up)
{% endfor %} {% endfor %}
{% endfunc %} {% endfunc %}
{% endcollapsespace %} {% endstripspace %}

View file

@ -21,15 +21,15 @@ var (
) )
//line lib/promscrape/targets_response.qtpl:6 //line lib/promscrape/targets_response.qtpl:6
func StreamTargetsResponsePlain(qw422016 *qt422016.Writer, jts []jobTargetsStatuses, showOriginLabels bool) { func StreamTargetsResponsePlain(qw422016 *qt422016.Writer, jts []jobTargetsStatuses, emptyJobs []string, showOriginLabels bool) {
//line lib/promscrape/targets_response.qtpl:8 //line lib/promscrape/targets_response.qtpl:8
for _, js := range jts { for _, js := range jts {
//line lib/promscrape/targets_response.qtpl:8 //line lib/promscrape/targets_response.qtpl:8
qw422016.N().S(` job=`) qw422016.N().S(`job=`)
//line lib/promscrape/targets_response.qtpl:9 //line lib/promscrape/targets_response.qtpl:9
qw422016.N().Q(js.job) qw422016.N().Q(js.job)
//line lib/promscrape/targets_response.qtpl:9 //line lib/promscrape/targets_response.qtpl:9
qw422016.N().S(` (`) qw422016.N().S(`(`)
//line lib/promscrape/targets_response.qtpl:9 //line lib/promscrape/targets_response.qtpl:9
qw422016.N().D(js.upCount) qw422016.N().D(js.upCount)
//line lib/promscrape/targets_response.qtpl:9 //line lib/promscrape/targets_response.qtpl:9
@ -37,22 +37,16 @@ func StreamTargetsResponsePlain(qw422016 *qt422016.Writer, jts []jobTargetsStatu
//line lib/promscrape/targets_response.qtpl:9 //line lib/promscrape/targets_response.qtpl:9
qw422016.N().D(js.targetsTotal) qw422016.N().D(js.targetsTotal)
//line lib/promscrape/targets_response.qtpl:9 //line lib/promscrape/targets_response.qtpl:9
qw422016.N().S(` up) `) qw422016.N().S(`up)`)
//line lib/promscrape/targets_response.qtpl:10 //line lib/promscrape/targets_response.qtpl:10
qw422016.N().S(` qw422016.N().S(`
`) `)
//line lib/promscrape/targets_response.qtpl:10
qw422016.N().S(` `)
//line lib/promscrape/targets_response.qtpl:11 //line lib/promscrape/targets_response.qtpl:11
for _, ts := range js.targetsStatus { for _, ts := range js.targetsStatus {
//line lib/promscrape/targets_response.qtpl:11
qw422016.N().S(` `)
//line lib/promscrape/targets_response.qtpl:13 //line lib/promscrape/targets_response.qtpl:13
labels := promLabelsString(ts.labels) labels := promLabelsString(ts.labels)
ol := promLabelsString(ts.originalLabels) ol := promLabelsString(ts.originalLabels)
//line lib/promscrape/targets_response.qtpl:15
qw422016.N().S(` `)
//line lib/promscrape/targets_response.qtpl:16 //line lib/promscrape/targets_response.qtpl:16
qw422016.N().S("\t") qw422016.N().S("\t")
//line lib/promscrape/targets_response.qtpl:16 //line lib/promscrape/targets_response.qtpl:16
@ -68,15 +62,17 @@ func StreamTargetsResponsePlain(qw422016 *qt422016.Writer, jts []jobTargetsStatu
//line lib/promscrape/targets_response.qtpl:16 //line lib/promscrape/targets_response.qtpl:16
} }
//line lib/promscrape/targets_response.qtpl:16 //line lib/promscrape/targets_response.qtpl:16
qw422016.N().S(`, endpoint=`) qw422016.N().S(`,`)
//line lib/promscrape/targets_response.qtpl:16
qw422016.N().S(` `)
//line lib/promscrape/targets_response.qtpl:16
qw422016.N().S(`endpoint=`)
//line lib/promscrape/targets_response.qtpl:17 //line lib/promscrape/targets_response.qtpl:17
qw422016.N().S(ts.endpoint) qw422016.N().S(ts.endpoint)
//line lib/promscrape/targets_response.qtpl:17 //line lib/promscrape/targets_response.qtpl:17
qw422016.N().S(`, labels=`) qw422016.N().S(`, labels=`)
//line lib/promscrape/targets_response.qtpl:18 //line lib/promscrape/targets_response.qtpl:18
qw422016.N().S(labels) qw422016.N().S(labels)
//line lib/promscrape/targets_response.qtpl:18
qw422016.N().S(` `)
//line lib/promscrape/targets_response.qtpl:19 //line lib/promscrape/targets_response.qtpl:19
if showOriginLabels { if showOriginLabels {
//line lib/promscrape/targets_response.qtpl:19 //line lib/promscrape/targets_response.qtpl:19
@ -86,288 +82,308 @@ func StreamTargetsResponsePlain(qw422016 *qt422016.Writer, jts []jobTargetsStatu
//line lib/promscrape/targets_response.qtpl:19 //line lib/promscrape/targets_response.qtpl:19
} }
//line lib/promscrape/targets_response.qtpl:19 //line lib/promscrape/targets_response.qtpl:19
qw422016.N().S(`, last_scrape=`) qw422016.N().S(`,`)
//line lib/promscrape/targets_response.qtpl:19
qw422016.N().S(` `)
//line lib/promscrape/targets_response.qtpl:19
qw422016.N().S(`last_scrape=`)
//line lib/promscrape/targets_response.qtpl:20 //line lib/promscrape/targets_response.qtpl:20
qw422016.N().FPrec(ts.lastScrapeTime.Seconds(), 3) qw422016.N().FPrec(ts.lastScrapeTime.Seconds(), 3)
//line lib/promscrape/targets_response.qtpl:20 //line lib/promscrape/targets_response.qtpl:20
qw422016.N().S(`s ago, scrape_duration=`) qw422016.N().S(`s ago,`)
//line lib/promscrape/targets_response.qtpl:20
qw422016.N().S(` `)
//line lib/promscrape/targets_response.qtpl:20
qw422016.N().S(`scrape_duration=`)
//line lib/promscrape/targets_response.qtpl:21 //line lib/promscrape/targets_response.qtpl:21
qw422016.N().FPrec(ts.scrapeDuration.Seconds(), 3) qw422016.N().FPrec(ts.scrapeDuration.Seconds(), 3)
//line lib/promscrape/targets_response.qtpl:21 //line lib/promscrape/targets_response.qtpl:21
qw422016.N().S(`s, error=`) qw422016.N().S(`s,`)
//line lib/promscrape/targets_response.qtpl:21
qw422016.N().S(` `)
//line lib/promscrape/targets_response.qtpl:21
qw422016.N().S(`samples_scraped=`)
//line lib/promscrape/targets_response.qtpl:22 //line lib/promscrape/targets_response.qtpl:22
qw422016.N().Q(ts.error) qw422016.N().D(ts.samplesScraped)
//line lib/promscrape/targets_response.qtpl:22
qw422016.N().S(`,`)
//line lib/promscrape/targets_response.qtpl:22 //line lib/promscrape/targets_response.qtpl:22
qw422016.N().S(` `) qw422016.N().S(` `)
//line lib/promscrape/targets_response.qtpl:22
qw422016.N().S(`error=`)
//line lib/promscrape/targets_response.qtpl:23 //line lib/promscrape/targets_response.qtpl:23
qw422016.N().Q(ts.errMsg)
//line lib/promscrape/targets_response.qtpl:24
qw422016.N().S(` qw422016.N().S(`
`) `)
//line lib/promscrape/targets_response.qtpl:23 //line lib/promscrape/targets_response.qtpl:25
qw422016.N().S(` `)
//line lib/promscrape/targets_response.qtpl:24
} }
//line lib/promscrape/targets_response.qtpl:24
qw422016.N().S(` `)
//line lib/promscrape/targets_response.qtpl:25
}
//line lib/promscrape/targets_response.qtpl:25
qw422016.N().S(` `)
//line lib/promscrape/targets_response.qtpl:26 //line lib/promscrape/targets_response.qtpl:26
qw422016.N().S(` }
//line lib/promscrape/targets_response.qtpl:28
for _, jobName := range emptyJobs {
//line lib/promscrape/targets_response.qtpl:28
qw422016.N().S(`job=`)
//line lib/promscrape/targets_response.qtpl:29
qw422016.N().Q(jobName)
//line lib/promscrape/targets_response.qtpl:29
qw422016.N().S(`(0/0 up)`)
//line lib/promscrape/targets_response.qtpl:30
qw422016.N().S(`
`) `)
//line lib/promscrape/targets_response.qtpl:26 //line lib/promscrape/targets_response.qtpl:31
qw422016.N().S(` `)
//line lib/promscrape/targets_response.qtpl:28
}
//line lib/promscrape/targets_response.qtpl:28
func WriteTargetsResponsePlain(qq422016 qtio422016.Writer, jts []jobTargetsStatuses, showOriginLabels bool) {
//line lib/promscrape/targets_response.qtpl:28
qw422016 := qt422016.AcquireWriter(qq422016)
//line lib/promscrape/targets_response.qtpl:28
StreamTargetsResponsePlain(qw422016, jts, showOriginLabels)
//line lib/promscrape/targets_response.qtpl:28
qt422016.ReleaseWriter(qw422016)
//line lib/promscrape/targets_response.qtpl:28
}
//line lib/promscrape/targets_response.qtpl:28
func TargetsResponsePlain(jts []jobTargetsStatuses, showOriginLabels bool) string {
//line lib/promscrape/targets_response.qtpl:28
qb422016 := qt422016.AcquireByteBuffer()
//line lib/promscrape/targets_response.qtpl:28
WriteTargetsResponsePlain(qb422016, jts, showOriginLabels)
//line lib/promscrape/targets_response.qtpl:28
qs422016 := string(qb422016.B)
//line lib/promscrape/targets_response.qtpl:28
qt422016.ReleaseByteBuffer(qb422016)
//line lib/promscrape/targets_response.qtpl:28
return qs422016
//line lib/promscrape/targets_response.qtpl:28
}
//line lib/promscrape/targets_response.qtpl:30
func StreamTargetsResponseHTML(qw422016 *qt422016.Writer, jts []jobTargetsStatuses, redirectPath string, onlyUnhealthy bool) {
//line lib/promscrape/targets_response.qtpl:30
qw422016.N().S(` <!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8"> <meta name="viewport" content="width=device-width, initial-scale=1"> <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.0.0-beta1/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-giJF6kkoqNQ00vy+HMDP7azOuL0xtbfIcaT9wjKHr8RbDVddVHyTfAAsrekwKmP1" crossorigin="anonymous"> <title>Scrape targets</title> </head> <body class="m-3"> <h1>Scrape targets</h1> <div> <button type="button" class="btn `)
//line lib/promscrape/targets_response.qtpl:42
if !onlyUnhealthy {
//line lib/promscrape/targets_response.qtpl:42
qw422016.N().S(`btn-primary`)
//line lib/promscrape/targets_response.qtpl:42
} else {
//line lib/promscrape/targets_response.qtpl:42
qw422016.N().S(`btn-secondary`)
//line lib/promscrape/targets_response.qtpl:42
} }
//line lib/promscrape/targets_response.qtpl:42 //line lib/promscrape/targets_response.qtpl:33
qw422016.N().S(`" `) }
//line lib/promscrape/targets_response.qtpl:43
//line lib/promscrape/targets_response.qtpl:33
func WriteTargetsResponsePlain(qq422016 qtio422016.Writer, jts []jobTargetsStatuses, emptyJobs []string, showOriginLabels bool) {
//line lib/promscrape/targets_response.qtpl:33
qw422016 := qt422016.AcquireWriter(qq422016)
//line lib/promscrape/targets_response.qtpl:33
StreamTargetsResponsePlain(qw422016, jts, emptyJobs, showOriginLabels)
//line lib/promscrape/targets_response.qtpl:33
qt422016.ReleaseWriter(qw422016)
//line lib/promscrape/targets_response.qtpl:33
}
//line lib/promscrape/targets_response.qtpl:33
func TargetsResponsePlain(jts []jobTargetsStatuses, emptyJobs []string, showOriginLabels bool) string {
//line lib/promscrape/targets_response.qtpl:33
qb422016 := qt422016.AcquireByteBuffer()
//line lib/promscrape/targets_response.qtpl:33
WriteTargetsResponsePlain(qb422016, jts, emptyJobs, showOriginLabels)
//line lib/promscrape/targets_response.qtpl:33
qs422016 := string(qb422016.B)
//line lib/promscrape/targets_response.qtpl:33
qt422016.ReleaseByteBuffer(qb422016)
//line lib/promscrape/targets_response.qtpl:33
return qs422016
//line lib/promscrape/targets_response.qtpl:33
}
//line lib/promscrape/targets_response.qtpl:35
func StreamTargetsResponseHTML(qw422016 *qt422016.Writer, jts []jobTargetsStatuses, emptyJobs []string, redirectPath string, onlyUnhealthy bool) {
//line lib/promscrape/targets_response.qtpl:35
qw422016.N().S(`<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"><meta name="viewport" content="width=device-width, initial-scale=1"><link href="https://cdn.jsdelivr.net/npm/bootstrap@5.0.0-beta1/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-giJF6kkoqNQ00vy+HMDP7azOuL0xtbfIcaT9wjKHr8RbDVddVHyTfAAsrekwKmP1" crossorigin="anonymous"><title>Scrape targets</title></head><body class="m-3"><h1>Scrape targets</h1><div><button type="button" class="btn`)
//line lib/promscrape/targets_response.qtpl:47
if !onlyUnhealthy {
//line lib/promscrape/targets_response.qtpl:47
qw422016.N().S(`btn-primary`)
//line lib/promscrape/targets_response.qtpl:47
} else {
//line lib/promscrape/targets_response.qtpl:47
qw422016.N().S(`btn-secondary`)
//line lib/promscrape/targets_response.qtpl:47
}
//line lib/promscrape/targets_response.qtpl:47
qw422016.N().S(`"`)
//line lib/promscrape/targets_response.qtpl:48
if onlyUnhealthy { if onlyUnhealthy {
//line lib/promscrape/targets_response.qtpl:43 //line lib/promscrape/targets_response.qtpl:48
qw422016.N().S(`onclick="location.href='`) qw422016.N().S(`onclick="location.href='`)
//line lib/promscrape/targets_response.qtpl:43 //line lib/promscrape/targets_response.qtpl:48
qw422016.E().S(redirectPath) qw422016.E().S(redirectPath)
//line lib/promscrape/targets_response.qtpl:43 //line lib/promscrape/targets_response.qtpl:48
qw422016.N().S(`'"`) qw422016.N().S(`'"`)
//line lib/promscrape/targets_response.qtpl:43 //line lib/promscrape/targets_response.qtpl:48
} }
//line lib/promscrape/targets_response.qtpl:43 //line lib/promscrape/targets_response.qtpl:48
qw422016.N().S(`> All </button> <button type="button" class="btn `) qw422016.N().S(`>All</button><button type="button" class="btn`)
//line lib/promscrape/targets_response.qtpl:46 //line lib/promscrape/targets_response.qtpl:51
if onlyUnhealthy { if onlyUnhealthy {
//line lib/promscrape/targets_response.qtpl:46 //line lib/promscrape/targets_response.qtpl:51
qw422016.N().S(`btn-primary`) qw422016.N().S(`btn-primary`)
//line lib/promscrape/targets_response.qtpl:46 //line lib/promscrape/targets_response.qtpl:51
} else { } else {
//line lib/promscrape/targets_response.qtpl:46 //line lib/promscrape/targets_response.qtpl:51
qw422016.N().S(`btn-secondary`) qw422016.N().S(`btn-secondary`)
//line lib/promscrape/targets_response.qtpl:46 //line lib/promscrape/targets_response.qtpl:51
} }
//line lib/promscrape/targets_response.qtpl:46 //line lib/promscrape/targets_response.qtpl:51
qw422016.N().S(`" `) qw422016.N().S(`"`)
//line lib/promscrape/targets_response.qtpl:47 //line lib/promscrape/targets_response.qtpl:52
if !onlyUnhealthy { if !onlyUnhealthy {
//line lib/promscrape/targets_response.qtpl:47 //line lib/promscrape/targets_response.qtpl:52
qw422016.N().S(`onclick="location.href='`) qw422016.N().S(`onclick="location.href='`)
//line lib/promscrape/targets_response.qtpl:47 //line lib/promscrape/targets_response.qtpl:52
qw422016.N().S(redirectPath) qw422016.N().S(redirectPath)
//line lib/promscrape/targets_response.qtpl:47 //line lib/promscrape/targets_response.qtpl:52
qw422016.N().S(`?show_only_unhealthy=true'"`) qw422016.N().S(`?show_only_unhealthy=true'"`)
//line lib/promscrape/targets_response.qtpl:47 //line lib/promscrape/targets_response.qtpl:52
} }
//line lib/promscrape/targets_response.qtpl:47 //line lib/promscrape/targets_response.qtpl:52
qw422016.N().S(`> Unhealthy </button> </div> `) qw422016.N().S(`>Unhealthy</button></div>`)
//line lib/promscrape/targets_response.qtpl:51 //line lib/promscrape/targets_response.qtpl:56
for _, js := range jts { for _, js := range jts {
//line lib/promscrape/targets_response.qtpl:51 //line lib/promscrape/targets_response.qtpl:57
qw422016.N().S(` `)
//line lib/promscrape/targets_response.qtpl:52
if onlyUnhealthy && js.upCount == js.targetsTotal { if onlyUnhealthy && js.upCount == js.targetsTotal {
//line lib/promscrape/targets_response.qtpl:52 //line lib/promscrape/targets_response.qtpl:57
continue continue
//line lib/promscrape/targets_response.qtpl:52 //line lib/promscrape/targets_response.qtpl:57
} }
//line lib/promscrape/targets_response.qtpl:52 //line lib/promscrape/targets_response.qtpl:57
qw422016.N().S(` <div> <h4> <a>`) qw422016.N().S(`<div><h4><a>`)
//line lib/promscrape/targets_response.qtpl:55 //line lib/promscrape/targets_response.qtpl:60
qw422016.E().S(js.job) qw422016.E().S(js.job)
//line lib/promscrape/targets_response.qtpl:55 //line lib/promscrape/targets_response.qtpl:60
qw422016.N().S(` (`) qw422016.N().S(`(`)
//line lib/promscrape/targets_response.qtpl:55 //line lib/promscrape/targets_response.qtpl:60
qw422016.N().D(js.upCount) qw422016.N().D(js.upCount)
//line lib/promscrape/targets_response.qtpl:55 //line lib/promscrape/targets_response.qtpl:60
qw422016.N().S(`/`) qw422016.N().S(`/`)
//line lib/promscrape/targets_response.qtpl:55 //line lib/promscrape/targets_response.qtpl:60
qw422016.N().D(js.targetsTotal) qw422016.N().D(js.targetsTotal)
//line lib/promscrape/targets_response.qtpl:55 //line lib/promscrape/targets_response.qtpl:60
qw422016.N().S(` up)</a> </h4> <table class="table table-striped table-hover table-bordered table-sm"> <thead> <tr> <th scope="col">Endpoint</th> <th scope="col">State</th> <th scope="col">Labels</th> <th scope="col">Last Scrape</th> <th scope="col">Scrape Duration</th> <th scope="col">Error</th> </tr> </thead> <tbody> `) qw422016.N().S(`up)</a></h4><table class="table table-striped table-hover table-bordered table-sm"><thead><tr><th scope="col">Endpoint</th><th scope="col">State</th><th scope="col">Labels</th><th scope="col">Last Scrape</th><th scope="col">Scrape Duration</th><th scope="col">Samples Scraped</th><th scope="col">Error</th></tr></thead><tbody>`)
//line lib/promscrape/targets_response.qtpl:69 //line lib/promscrape/targets_response.qtpl:75
for _, ts := range js.targetsStatus { for _, ts := range js.targetsStatus {
//line lib/promscrape/targets_response.qtpl:69 //line lib/promscrape/targets_response.qtpl:76
qw422016.N().S(` `)
//line lib/promscrape/targets_response.qtpl:70
if onlyUnhealthy && ts.up { if onlyUnhealthy && ts.up {
//line lib/promscrape/targets_response.qtpl:70 //line lib/promscrape/targets_response.qtpl:76
continue continue
//line lib/promscrape/targets_response.qtpl:70 //line lib/promscrape/targets_response.qtpl:76
} }
//line lib/promscrape/targets_response.qtpl:70 //line lib/promscrape/targets_response.qtpl:76
qw422016.N().S(` <tr `) qw422016.N().S(`<tr`)
//line lib/promscrape/targets_response.qtpl:71 //line lib/promscrape/targets_response.qtpl:77
if !ts.up { if !ts.up {
//line lib/promscrape/targets_response.qtpl:71 //line lib/promscrape/targets_response.qtpl:77
qw422016.N().S(`class="alert alert-danger" role="alert"`) qw422016.N().S(`class="alert alert-danger" role="alert"`)
//line lib/promscrape/targets_response.qtpl:71 //line lib/promscrape/targets_response.qtpl:77
} }
//line lib/promscrape/targets_response.qtpl:71 //line lib/promscrape/targets_response.qtpl:77
qw422016.N().S(`> <td><a href="`) qw422016.N().S(`><td><a href="`)
//line lib/promscrape/targets_response.qtpl:72 //line lib/promscrape/targets_response.qtpl:78
qw422016.E().S(ts.endpoint) qw422016.E().S(ts.endpoint)
//line lib/promscrape/targets_response.qtpl:72 //line lib/promscrape/targets_response.qtpl:78
qw422016.N().S(`">`) qw422016.N().S(`">`)
//line lib/promscrape/targets_response.qtpl:72 //line lib/promscrape/targets_response.qtpl:78
qw422016.E().S(ts.endpoint) qw422016.E().S(ts.endpoint)
//line lib/promscrape/targets_response.qtpl:72 //line lib/promscrape/targets_response.qtpl:78
qw422016.N().S(`</a><br></td> <td>`) qw422016.N().S(`</a><br></td><td>`)
//line lib/promscrape/targets_response.qtpl:73 //line lib/promscrape/targets_response.qtpl:79
if ts.up { if ts.up {
//line lib/promscrape/targets_response.qtpl:73 //line lib/promscrape/targets_response.qtpl:79
qw422016.N().S(`UP`) qw422016.N().S(`UP`)
//line lib/promscrape/targets_response.qtpl:73 //line lib/promscrape/targets_response.qtpl:79
} else { } else {
//line lib/promscrape/targets_response.qtpl:73 //line lib/promscrape/targets_response.qtpl:79
qw422016.N().S(`DOWN`) qw422016.N().S(`DOWN`)
//line lib/promscrape/targets_response.qtpl:73 //line lib/promscrape/targets_response.qtpl:79
} }
//line lib/promscrape/targets_response.qtpl:73 //line lib/promscrape/targets_response.qtpl:79
qw422016.N().S(`</td> <td title="Original labels: `) qw422016.N().S(`</td><td title="Original labels:`)
//line lib/promscrape/targets_response.qtpl:74 //line lib/promscrape/targets_response.qtpl:80
streamformatLabel(qw422016, ts.originalLabels) streamformatLabel(qw422016, ts.originalLabels)
//line lib/promscrape/targets_response.qtpl:74 //line lib/promscrape/targets_response.qtpl:80
qw422016.N().S(`"> `) qw422016.N().S(`">`)
//line lib/promscrape/targets_response.qtpl:75 //line lib/promscrape/targets_response.qtpl:81
streamformatLabel(qw422016, ts.labels) streamformatLabel(qw422016, ts.labels)
//line lib/promscrape/targets_response.qtpl:75 //line lib/promscrape/targets_response.qtpl:81
qw422016.N().S(` </td> <td>`) qw422016.N().S(`</td><td>`)
//line lib/promscrape/targets_response.qtpl:77 //line lib/promscrape/targets_response.qtpl:83
qw422016.N().FPrec(ts.lastScrapeTime.Seconds(), 3) qw422016.N().FPrec(ts.lastScrapeTime.Seconds(), 3)
//line lib/promscrape/targets_response.qtpl:77 //line lib/promscrape/targets_response.qtpl:83
qw422016.N().S(`s ago</td> <td>`) qw422016.N().S(`s ago</td><td>`)
//line lib/promscrape/targets_response.qtpl:78 //line lib/promscrape/targets_response.qtpl:84
qw422016.N().FPrec(ts.scrapeDuration.Seconds(), 3) qw422016.N().FPrec(ts.scrapeDuration.Seconds(), 3)
//line lib/promscrape/targets_response.qtpl:78 //line lib/promscrape/targets_response.qtpl:84
qw422016.N().S(`s</td> <td>`) qw422016.N().S(`s</td><td>`)
//line lib/promscrape/targets_response.qtpl:79 //line lib/promscrape/targets_response.qtpl:85
qw422016.E().S(ts.error) qw422016.N().D(ts.samplesScraped)
//line lib/promscrape/targets_response.qtpl:79 //line lib/promscrape/targets_response.qtpl:85
qw422016.N().S(`</td> </tr> `) qw422016.N().S(`</td><td>`)
//line lib/promscrape/targets_response.qtpl:81 //line lib/promscrape/targets_response.qtpl:86
qw422016.E().S(ts.errMsg)
//line lib/promscrape/targets_response.qtpl:86
qw422016.N().S(`</td></tr>`)
//line lib/promscrape/targets_response.qtpl:88
} }
//line lib/promscrape/targets_response.qtpl:81 //line lib/promscrape/targets_response.qtpl:88
qw422016.N().S(` </tbody> </table> </div> `) qw422016.N().S(`</tbody></table></div>`)
//line lib/promscrape/targets_response.qtpl:85 //line lib/promscrape/targets_response.qtpl:92
} }
//line lib/promscrape/targets_response.qtpl:85 //line lib/promscrape/targets_response.qtpl:94
qw422016.N().S(` </body> </html> `) for _, jobName := range emptyJobs {
//line lib/promscrape/targets_response.qtpl:88 //line lib/promscrape/targets_response.qtpl:94
qw422016.N().S(`<div><h4><a>`)
//line lib/promscrape/targets_response.qtpl:97
qw422016.E().S(jobName)
//line lib/promscrape/targets_response.qtpl:97
qw422016.N().S(`(0/0 up)</a></h4><table class="table table-striped table-hover table-bordered table-sm"><thead><tr><th scope="col">Endpoint</th><th scope="col">State</th><th scope="col">Labels</th><th scope="col">Last Scrape</th><th scope="col">Scrape Duration</th><th scope="col">Samples Scraped</th><th scope="col">Error</th></tr></thead></table></div>`)
//line lib/promscrape/targets_response.qtpl:113
}
//line lib/promscrape/targets_response.qtpl:113
qw422016.N().S(`</body></html>`)
//line lib/promscrape/targets_response.qtpl:116
} }
//line lib/promscrape/targets_response.qtpl:88 //line lib/promscrape/targets_response.qtpl:116
func WriteTargetsResponseHTML(qq422016 qtio422016.Writer, jts []jobTargetsStatuses, redirectPath string, onlyUnhealthy bool) { func WriteTargetsResponseHTML(qq422016 qtio422016.Writer, jts []jobTargetsStatuses, emptyJobs []string, redirectPath string, onlyUnhealthy bool) {
//line lib/promscrape/targets_response.qtpl:88 //line lib/promscrape/targets_response.qtpl:116
qw422016 := qt422016.AcquireWriter(qq422016) qw422016 := qt422016.AcquireWriter(qq422016)
//line lib/promscrape/targets_response.qtpl:88 //line lib/promscrape/targets_response.qtpl:116
StreamTargetsResponseHTML(qw422016, jts, redirectPath, onlyUnhealthy) StreamTargetsResponseHTML(qw422016, jts, emptyJobs, redirectPath, onlyUnhealthy)
//line lib/promscrape/targets_response.qtpl:88 //line lib/promscrape/targets_response.qtpl:116
qt422016.ReleaseWriter(qw422016) qt422016.ReleaseWriter(qw422016)
//line lib/promscrape/targets_response.qtpl:88 //line lib/promscrape/targets_response.qtpl:116
} }
//line lib/promscrape/targets_response.qtpl:88 //line lib/promscrape/targets_response.qtpl:116
func TargetsResponseHTML(jts []jobTargetsStatuses, redirectPath string, onlyUnhealthy bool) string { func TargetsResponseHTML(jts []jobTargetsStatuses, emptyJobs []string, redirectPath string, onlyUnhealthy bool) string {
//line lib/promscrape/targets_response.qtpl:88 //line lib/promscrape/targets_response.qtpl:116
qb422016 := qt422016.AcquireByteBuffer() qb422016 := qt422016.AcquireByteBuffer()
//line lib/promscrape/targets_response.qtpl:88 //line lib/promscrape/targets_response.qtpl:116
WriteTargetsResponseHTML(qb422016, jts, redirectPath, onlyUnhealthy) WriteTargetsResponseHTML(qb422016, jts, emptyJobs, redirectPath, onlyUnhealthy)
//line lib/promscrape/targets_response.qtpl:88 //line lib/promscrape/targets_response.qtpl:116
qs422016 := string(qb422016.B) qs422016 := string(qb422016.B)
//line lib/promscrape/targets_response.qtpl:88 //line lib/promscrape/targets_response.qtpl:116
qt422016.ReleaseByteBuffer(qb422016) qt422016.ReleaseByteBuffer(qb422016)
//line lib/promscrape/targets_response.qtpl:88 //line lib/promscrape/targets_response.qtpl:116
return qs422016 return qs422016
//line lib/promscrape/targets_response.qtpl:88 //line lib/promscrape/targets_response.qtpl:116
} }
//line lib/promscrape/targets_response.qtpl:90 //line lib/promscrape/targets_response.qtpl:118
func streamformatLabel(qw422016 *qt422016.Writer, labels []prompbmarshal.Label) { func streamformatLabel(qw422016 *qt422016.Writer, labels []prompbmarshal.Label) {
//line lib/promscrape/targets_response.qtpl:90 //line lib/promscrape/targets_response.qtpl:119
qw422016.N().S(` `)
//line lib/promscrape/targets_response.qtpl:91
for _, label := range labels { for _, label := range labels {
//line lib/promscrape/targets_response.qtpl:91 //line lib/promscrape/targets_response.qtpl:120
qw422016.N().S(` `)
//line lib/promscrape/targets_response.qtpl:92
qw422016.E().S(label.Name) qw422016.E().S(label.Name)
//line lib/promscrape/targets_response.qtpl:92 //line lib/promscrape/targets_response.qtpl:120
qw422016.N().S(`=`) qw422016.N().S(`=`)
//line lib/promscrape/targets_response.qtpl:92 //line lib/promscrape/targets_response.qtpl:120
qw422016.E().Q(label.Value) qw422016.E().Q(label.Value)
//line lib/promscrape/targets_response.qtpl:92 //line lib/promscrape/targets_response.qtpl:120
qw422016.N().S(` `) qw422016.N().S(` `)
//line lib/promscrape/targets_response.qtpl:92 //line lib/promscrape/targets_response.qtpl:121
qw422016.N().S(` `)
//line lib/promscrape/targets_response.qtpl:92
qw422016.N().S(` `)
//line lib/promscrape/targets_response.qtpl:93
} }
//line lib/promscrape/targets_response.qtpl:93 //line lib/promscrape/targets_response.qtpl:122
qw422016.N().S(` `)
//line lib/promscrape/targets_response.qtpl:94
} }
//line lib/promscrape/targets_response.qtpl:94 //line lib/promscrape/targets_response.qtpl:122
func writeformatLabel(qq422016 qtio422016.Writer, labels []prompbmarshal.Label) { func writeformatLabel(qq422016 qtio422016.Writer, labels []prompbmarshal.Label) {
//line lib/promscrape/targets_response.qtpl:94 //line lib/promscrape/targets_response.qtpl:122
qw422016 := qt422016.AcquireWriter(qq422016) qw422016 := qt422016.AcquireWriter(qq422016)
//line lib/promscrape/targets_response.qtpl:94 //line lib/promscrape/targets_response.qtpl:122
streamformatLabel(qw422016, labels) streamformatLabel(qw422016, labels)
//line lib/promscrape/targets_response.qtpl:94 //line lib/promscrape/targets_response.qtpl:122
qt422016.ReleaseWriter(qw422016) qt422016.ReleaseWriter(qw422016)
//line lib/promscrape/targets_response.qtpl:94 //line lib/promscrape/targets_response.qtpl:122
} }
//line lib/promscrape/targets_response.qtpl:94 //line lib/promscrape/targets_response.qtpl:122
func formatLabel(labels []prompbmarshal.Label) string { func formatLabel(labels []prompbmarshal.Label) string {
//line lib/promscrape/targets_response.qtpl:94 //line lib/promscrape/targets_response.qtpl:122
qb422016 := qt422016.AcquireByteBuffer() qb422016 := qt422016.AcquireByteBuffer()
//line lib/promscrape/targets_response.qtpl:94 //line lib/promscrape/targets_response.qtpl:122
writeformatLabel(qb422016, labels) writeformatLabel(qb422016, labels)
//line lib/promscrape/targets_response.qtpl:94 //line lib/promscrape/targets_response.qtpl:122
qs422016 := string(qb422016.B) qs422016 := string(qb422016.B)
//line lib/promscrape/targets_response.qtpl:94 //line lib/promscrape/targets_response.qtpl:122
qt422016.ReleaseByteBuffer(qb422016) qt422016.ReleaseByteBuffer(qb422016)
//line lib/promscrape/targets_response.qtpl:94 //line lib/promscrape/targets_response.qtpl:122
return qs422016 return qs422016
//line lib/promscrape/targets_response.qtpl:94 //line lib/promscrape/targets_response.qtpl:122
} }


@@ -58,8 +58,9 @@ func WriteAPIV1Targets(w io.Writer, state string) {
 }
 
 type targetStatusMap struct {
 	mu       sync.Mutex
 	m        map[*ScrapeWork]*targetStatus
+	jobNames []string
 }
 
 func newTargetStatusMap() *targetStatusMap {
@ -74,6 +75,12 @@ func (tsm *targetStatusMap) Reset() {
tsm.mu.Unlock() tsm.mu.Unlock()
} }
func (tsm *targetStatusMap) registerJobNames(jobNames []string) {
tsm.mu.Lock()
tsm.jobNames = append(tsm.jobNames[:0], jobNames...)
tsm.mu.Unlock()
}
func (tsm *targetStatusMap) Register(sw *ScrapeWork) { func (tsm *targetStatusMap) Register(sw *ScrapeWork) {
tsm.mu.Lock() tsm.mu.Lock()
tsm.m[sw] = &targetStatus{ tsm.m[sw] = &targetStatus{
@@ -88,7 +95,7 @@ func (tsm *targetStatusMap) Unregister(sw *ScrapeWork) {
 	tsm.mu.Unlock()
 }
 
-func (tsm *targetStatusMap) Update(sw *ScrapeWork, group string, up bool, scrapeTime, scrapeDuration int64, err error) {
+func (tsm *targetStatusMap) Update(sw *ScrapeWork, group string, up bool, scrapeTime, scrapeDuration int64, samplesScraped int, err error) {
 	tsm.mu.Lock()
 	ts := tsm.m[sw]
 	if ts == nil {
@@ -101,6 +108,7 @@ func (tsm *targetStatusMap) Update(sw *ScrapeWork, group string, up bool, scrape
 	ts.scrapeGroup = group
 	ts.scrapeTime = scrapeTime
 	ts.scrapeDuration = scrapeDuration
+	ts.samplesScraped = samplesScraped
 	ts.err = err
 	tsm.mu.Unlock()
 }
@@ -156,6 +164,7 @@ func (tsm *targetStatusMap) WriteActiveTargetsJSON(w io.Writer) {
 		fmt.Fprintf(w, `,"lastError":%q`, errMsg)
 		fmt.Fprintf(w, `,"lastScrape":%q`, time.Unix(st.scrapeTime/1000, (st.scrapeTime%1000)*1e6).Format(time.RFC3339Nano))
 		fmt.Fprintf(w, `,"lastScrapeDuration":%g`, (time.Millisecond * time.Duration(st.scrapeDuration)).Seconds())
+		fmt.Fprintf(w, `,"lastSamplesScraped":%d`, st.samplesScraped)
 		state := "up"
 		if !st.up {
 			state = "down"
@@ -185,6 +194,7 @@ type targetStatus struct {
 	scrapeGroup    string
 	scrapeTime     int64
 	scrapeDuration int64
+	samplesScraped int
 	err            error
 }
@@ -270,7 +280,8 @@ type jobTargetStatus struct {
 	originalLabels []prompbmarshal.Label
 	lastScrapeTime time.Duration
 	scrapeDuration time.Duration
-	error          string
+	samplesScraped int
+	errMsg         string
 }
 
 type jobTargetsStatuses struct {
@ -280,13 +291,14 @@ type jobTargetsStatuses struct {
targetsStatus []jobTargetStatus targetsStatus []jobTargetStatus
} }
func (tsm *targetStatusMap) getTargetsStatusByJob() []jobTargetsStatuses { func (tsm *targetStatusMap) getTargetsStatusByJob() ([]jobTargetsStatuses, []string) {
byJob := make(map[string][]targetStatus) byJob := make(map[string][]targetStatus)
tsm.mu.Lock() tsm.mu.Lock()
for _, st := range tsm.m { for _, st := range tsm.m {
job := st.sw.Job() job := st.sw.Job()
byJob[job] = append(byJob[job], *st) byJob[job] = append(byJob[job], *st)
} }
jobNames := append([]string{}, tsm.jobNames...)
tsm.mu.Unlock() tsm.mu.Unlock()
var jts []jobTargetsStatuses var jts []jobTargetsStatuses
@@ -313,7 +325,8 @@ func (tsm *targetStatusMap) getTargetsStatusByJob() []jobTargetsStatuses {
 				originalLabels: st.sw.OriginalLabels,
 				lastScrapeTime: st.getDurationFromLastScrape(),
 				scrapeDuration: time.Duration(st.scrapeDuration) * time.Millisecond,
-				error:          errMsg,
+				samplesScraped: st.samplesScraped,
+				errMsg:         errMsg,
 			})
 		}
 		jts = append(jts, jobTargetsStatuses{
@ -326,20 +339,37 @@ func (tsm *targetStatusMap) getTargetsStatusByJob() []jobTargetsStatuses {
sort.Slice(jts, func(i, j int) bool { sort.Slice(jts, func(i, j int) bool {
return jts[i].job < jts[j].job return jts[i].job < jts[j].job
}) })
return jts emptyJobs := getEmptyJobs(jts, jobNames)
return jts, emptyJobs
}
func getEmptyJobs(jts []jobTargetsStatuses, jobNames []string) []string {
jobNamesMap := make(map[string]struct{}, len(jobNames))
for _, jobName := range jobNames {
jobNamesMap[jobName] = struct{}{}
}
for i := range jts {
delete(jobNamesMap, jts[i].job)
}
emptyJobs := make([]string, 0, len(jobNamesMap))
for k := range jobNamesMap {
emptyJobs = append(emptyJobs, k)
}
sort.Strings(emptyJobs)
return emptyJobs
} }
// WriteTargetsHTML writes targets status grouped by job into writer w in html table, // WriteTargetsHTML writes targets status grouped by job into writer w in html table,
// accepts filter to show only unhealthy targets. // accepts filter to show only unhealthy targets.
func (tsm *targetStatusMap) WriteTargetsHTML(w io.Writer, showOnlyUnhealthy bool) { func (tsm *targetStatusMap) WriteTargetsHTML(w io.Writer, showOnlyUnhealthy bool) {
jss := tsm.getTargetsStatusByJob() jss, emptyJobs := tsm.getTargetsStatusByJob()
targetsPath := path.Join(httpserver.GetPathPrefix(), "/targets") targetsPath := path.Join(httpserver.GetPathPrefix(), "/targets")
WriteTargetsResponseHTML(w, jss, targetsPath, showOnlyUnhealthy) WriteTargetsResponseHTML(w, jss, emptyJobs, targetsPath, showOnlyUnhealthy)
} }
// WriteTargetsPlain writes targets grouped by job into writer w in plain text, // WriteTargetsPlain writes targets grouped by job into writer w in plain text,
// accept filter to show original labels. // accept filter to show original labels.
func (tsm *targetStatusMap) WriteTargetsPlain(w io.Writer, showOriginalLabels bool) { func (tsm *targetStatusMap) WriteTargetsPlain(w io.Writer, showOriginalLabels bool) {
jss := tsm.getTargetsStatusByJob() jss, emptyJobs := tsm.getTargetsStatusByJob()
WriteTargetsResponsePlain(w, jss, showOriginalLabels) WriteTargetsResponsePlain(w, jss, emptyJobs, showOriginalLabels)
} }


@@ -4,6 +4,7 @@ import (
 	"bytes"
 	"fmt"
 	"io"
+	"time"
 
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
 )
@@ -35,6 +36,7 @@ func ReadLinesBlock(r io.Reader, dstBuf, tailBuf []byte) ([]byte, []byte, error)
 //
 // It is expected that read timeout on r exceeds 1 second.
 func ReadLinesBlockExt(r io.Reader, dstBuf, tailBuf []byte, maxLineLen int) ([]byte, []byte, error) {
+	startTime := time.Now()
 	if cap(dstBuf) < defaultBlockSize {
 		dstBuf = bytesutil.Resize(dstBuf, defaultBlockSize)
 	}
@@ -55,6 +57,9 @@ again:
 			// This fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/60 .
 			return dstBuf, tailBuf, nil
 		}
+		if err != io.EOF {
+			err = fmt.Errorf("cannot read a block of data in %.3fs: %w", time.Since(startTime).Seconds(), err)
+		}
 		return dstBuf, tailBuf, err
 	}
 	dstBuf = dstBuf[:len(dstBuf)+n]


@@ -69,7 +69,7 @@ func ParseStream(req *http.Request, callback func(rows []Row) error) error {
 func (ctx *streamContext) Read() bool {
 	readCalls.Inc()
-	if ctx.err != nil {
+	if ctx.err != nil || ctx.hasCallbackError() {
 		return false
 	}
 	ctx.reqBuf, ctx.tailBuf, ctx.err = common.ReadLinesBlock(ctx.br, ctx.reqBuf, ctx.tailBuf)
@@ -107,6 +107,13 @@ func (ctx *streamContext) Error() error {
 	return ctx.err
 }
 
+func (ctx *streamContext) hasCallbackError() bool {
+	ctx.callbackErrLock.Lock()
+	ok := ctx.callbackErr != nil
+	ctx.callbackErrLock.Unlock()
+	return ok
+}
+
 func (ctx *streamContext) reset() {
 	ctx.br.Reset(nil)
 	ctx.reqBuf = ctx.reqBuf[:0]


@@ -54,7 +54,7 @@ func ParseStream(r io.Reader, callback func(rows []Row) error) error {
 func (ctx *streamContext) Read() bool {
 	readCalls.Inc()
-	if ctx.err != nil {
+	if ctx.err != nil || ctx.hasCallbackError() {
 		return false
 	}
 	ctx.reqBuf, ctx.tailBuf, ctx.err = common.ReadLinesBlock(ctx.br, ctx.reqBuf, ctx.tailBuf)
@@ -86,6 +86,13 @@ func (ctx *streamContext) Error() error {
 	return ctx.err
 }
 
+func (ctx *streamContext) hasCallbackError() bool {
+	ctx.callbackErrLock.Lock()
+	ok := ctx.callbackErr != nil
+	ctx.callbackErrLock.Unlock()
+	return ok
+}
+
 func (ctx *streamContext) reset() {
 	ctx.br.Reset(nil)
 	ctx.reqBuf = ctx.reqBuf[:0]


@@ -82,7 +82,7 @@ func ParseStream(r io.Reader, isGzipped bool, precision, db string, callback fun
 func (ctx *streamContext) Read() bool {
 	readCalls.Inc()
-	if ctx.err != nil {
+	if ctx.err != nil || ctx.hasCallbackError() {
 		return false
 	}
 	ctx.reqBuf, ctx.tailBuf, ctx.err = common.ReadLinesBlockExt(ctx.br, ctx.reqBuf, ctx.tailBuf, maxLineSize.N)
@@ -120,6 +120,13 @@ func (ctx *streamContext) Error() error {
 	return ctx.err
 }
+
+func (ctx *streamContext) hasCallbackError() bool {
+	ctx.callbackErrLock.Lock()
+	ok := ctx.callbackErr != nil
+	ctx.callbackErrLock.Unlock()
+	return ok
+}
+
 func (ctx *streamContext) reset() {
 	ctx.br.Reset(nil)
 	ctx.reqBuf = ctx.reqBuf[:0]


@@ -53,7 +53,7 @@ func ParseStream(r io.Reader, callback func(rows []Row) error) error {
 func (ctx *streamContext) Read() bool {
 	readCalls.Inc()
-	if ctx.err != nil {
+	if ctx.err != nil || ctx.hasCallbackError() {
 		return false
 	}
 	ctx.reqBuf, ctx.tailBuf, ctx.err = common.ReadLinesBlock(ctx.br, ctx.reqBuf, ctx.tailBuf)
@@ -85,6 +85,13 @@ func (ctx *streamContext) Error() error {
 	return ctx.err
 }
+
+func (ctx *streamContext) hasCallbackError() bool {
+	ctx.callbackErrLock.Lock()
+	ok := ctx.callbackErr != nil
+	ctx.callbackErrLock.Unlock()
+	return ok
+}
+
 func (ctx *streamContext) reset() {
 	ctx.br.Reset(nil)
 	ctx.reqBuf = ctx.reqBuf[:0]


@@ -56,7 +56,7 @@ func ParseStream(r io.Reader, defaultTimestamp int64, isGzipped bool, callback f
 func (ctx *streamContext) Read() bool {
 	readCalls.Inc()
-	if ctx.err != nil {
+	if ctx.err != nil || ctx.hasCallbackError() {
 		return false
 	}
 	ctx.reqBuf, ctx.tailBuf, ctx.err = common.ReadLinesBlock(ctx.br, ctx.reqBuf, ctx.tailBuf)
@@ -88,6 +88,13 @@ func (ctx *streamContext) Error() error {
 	return ctx.err
 }
+
+func (ctx *streamContext) hasCallbackError() bool {
+	ctx.callbackErrLock.Lock()
+	ok := ctx.callbackErr != nil
+	ctx.callbackErrLock.Unlock()
+	return ok
+}
+
 func (ctx *streamContext) reset() {
 	ctx.br.Reset(nil)
 	ctx.reqBuf = ctx.reqBuf[:0]


@@ -59,7 +59,7 @@ func ParseStream(req *http.Request, callback func(rows []Row) error) error {
 func (ctx *streamContext) Read() bool {
 	readCalls.Inc()
-	if ctx.err != nil {
+	if ctx.err != nil || ctx.hasCallbackError() {
 		return false
 	}
 	ctx.reqBuf, ctx.tailBuf, ctx.err = common.ReadLinesBlockExt(ctx.br, ctx.reqBuf, ctx.tailBuf, maxLineLen.N)
@@ -97,6 +97,13 @@ func (ctx *streamContext) Error() error {
 	return ctx.err
 }
+
+func (ctx *streamContext) hasCallbackError() bool {
+	ctx.callbackErrLock.Lock()
+	ok := ctx.callbackErr != nil
+	ctx.callbackErrLock.Unlock()
+	return ok
+}
+
 func (ctx *streamContext) reset() {
 	ctx.br.Reset(nil)
 	ctx.reqBuf = ctx.reqBuf[:0]


@@ -103,15 +103,6 @@ type indexDB struct {
 	loopsPerDateTagFilterCache *workingsetcache.Cache

 	indexSearchPool sync.Pool

-	// An inmemory set of deleted metricIDs.
-	//
-	// The set holds deleted metricIDs for the current db and for the extDB.
-	//
-	// It is safe to keep the set in memory even for big number of deleted
-	// metricIDs, since it usually requires 1 bit per deleted metricID.
-	deletedMetricIDs           atomic.Value
-	deletedMetricIDsUpdateLock sync.Mutex
 }

 // openIndexDB opens index db from the given path with the given caches.
@@ -140,14 +131,6 @@ func openIndexDB(path string, s *Storage) (*indexDB, error) {
 		uselessTagFiltersCache:     workingsetcache.New(mem/128, time.Hour),
 		loopsPerDateTagFilterCache: workingsetcache.New(mem/128, time.Hour),
 	}
-
-	is := db.getIndexSearch(noDeadline)
-	dmis, err := is.loadDeletedMetricIDs()
-	db.putIndexSearch(is)
-	if err != nil {
-		return nil, fmt.Errorf("cannot load deleted metricIDs: %w", err)
-	}
-	db.setDeletedMetricIDs(dmis)
 	return db, nil
 }
@@ -214,7 +197,7 @@ func (db *indexDB) UpdateMetrics(m *IndexDBMetrics) {
 	m.UselessTagFiltersCacheRequests += cs.GetCalls
 	m.UselessTagFiltersCacheMisses += cs.Misses

-	m.DeletedMetricsCount += uint64(db.getDeletedMetricIDs().Len())
+	m.DeletedMetricsCount += uint64(db.s.getDeletedMetricIDs().Len())

 	m.IndexDBRefCount += atomic.LoadUint64(&db.refCount)
 	m.NewTimeseriesCreated += atomic.LoadUint64(&db.newTimeseriesCreated)
@@ -260,12 +243,6 @@ func (db *indexDB) doExtDB(f func(extDB *indexDB)) bool {
 //
 // It decrements refCount for the previous extDB.
 func (db *indexDB) SetExtDB(extDB *indexDB) {
-	// Add deleted metricIDs from extDB to db.
-	if extDB != nil {
-		dmisExt := extDB.getDeletedMetricIDs()
-		db.updateDeletedMetricIDs(dmisExt)
-	}
 	db.extDBLock.Lock()
 	prevExtDB := db.extDB
 	db.extDB = extDB
@@ -737,7 +714,7 @@ func (is *indexSearch) searchTagKeysOnDate(tks map[string]struct{}, date uint64,
 	kb := &is.kb
 	mp := &is.mp
 	mp.Reset()
-	dmis := is.db.getDeletedMetricIDs()
+	dmis := is.db.s.getDeletedMetricIDs()
 	loopsPaceLimiter := 0
 	kb.B = is.marshalCommonPrefix(kb.B[:0], nsPrefixDateTagToMetricIDs)
 	kb.B = encoding.MarshalUint64(kb.B, date)
@@ -817,7 +794,7 @@ func (is *indexSearch) searchTagKeys(tks map[string]struct{}, maxTagKeys int) er
 	kb := &is.kb
 	mp := &is.mp
 	mp.Reset()
-	dmis := is.db.getDeletedMetricIDs()
+	dmis := is.db.s.getDeletedMetricIDs()
 	loopsPaceLimiter := 0
 	kb.B = is.marshalCommonPrefix(kb.B[:0], nsPrefixTagToMetricIDs)
 	prefix := kb.B
@@ -935,7 +912,7 @@ func (is *indexSearch) searchTagValuesOnDate(tvs map[string]struct{}, tagKey []b
 	kb := &is.kb
 	mp := &is.mp
 	mp.Reset()
-	dmis := is.db.getDeletedMetricIDs()
+	dmis := is.db.s.getDeletedMetricIDs()
 	loopsPaceLimiter := 0
 	kb.B = is.marshalCommonPrefix(kb.B[:0], nsPrefixDateTagToMetricIDs)
 	kb.B = encoding.MarshalUint64(kb.B, date)
@@ -1021,7 +998,7 @@ func (is *indexSearch) searchTagValues(tvs map[string]struct{}, tagKey []byte, m
 	kb := &is.kb
 	mp := &is.mp
 	mp.Reset()
-	dmis := is.db.getDeletedMetricIDs()
+	dmis := is.db.s.getDeletedMetricIDs()
 	loopsPaceLimiter := 0
 	kb.B = is.marshalCommonPrefix(kb.B[:0], nsPrefixTagToMetricIDs)
 	kb.B = marshalTagValue(kb.B, tagKey)
@@ -1175,7 +1152,7 @@ func (is *indexSearch) searchTagValueSuffixesForPrefix(tvss map[string]struct{},
 	ts := &is.ts
 	mp := &is.mp
 	mp.Reset()
-	dmis := is.db.getDeletedMetricIDs()
+	dmis := is.db.s.getDeletedMetricIDs()
 	loopsPaceLimiter := 0
 	ts.Seek(prefix)
 	for len(tvss) < maxTagValueSuffixes && ts.NextItem() {
@@ -1616,7 +1593,7 @@ func (db *indexDB) deleteMetricIDs(metricIDs []uint64) error {
 	// atomically add deleted metricIDs to an inmemory map.
 	dmis := &uint64set.Set{}
 	dmis.AddMulti(metricIDs)
-	db.updateDeletedMetricIDs(dmis)
+	db.s.updateDeletedMetricIDs(dmis)

 	// Reset TagFilters -> TSIDS cache, since it may contain deleted TSIDs.
 	invalidateTagCache()
@@ -1643,21 +1620,14 @@ func (db *indexDB) deleteMetricIDs(metricIDs []uint64) error {
 	return err
 }

-func (db *indexDB) getDeletedMetricIDs() *uint64set.Set {
-	return db.deletedMetricIDs.Load().(*uint64set.Set)
-}
-
-func (db *indexDB) setDeletedMetricIDs(dmis *uint64set.Set) {
-	db.deletedMetricIDs.Store(dmis)
-}
-
-func (db *indexDB) updateDeletedMetricIDs(metricIDs *uint64set.Set) {
-	db.deletedMetricIDsUpdateLock.Lock()
-	dmisOld := db.getDeletedMetricIDs()
-	dmisNew := dmisOld.Clone()
-	dmisNew.Union(metricIDs)
-	db.setDeletedMetricIDs(dmisNew)
-	db.deletedMetricIDsUpdateLock.Unlock()
+func (db *indexDB) loadDeletedMetricIDs() (*uint64set.Set, error) {
+	is := db.getIndexSearch(noDeadline)
+	dmis, err := is.loadDeletedMetricIDs()
+	db.putIndexSearch(is)
+	if err != nil {
+		return nil, err
+	}
+	return dmis, nil
 }

 func (is *indexSearch) loadDeletedMetricIDs() (*uint64set.Set, error) {
@@ -1751,7 +1721,7 @@ func (db *indexDB) searchTSIDs(tfss []*TagFilters, tr TimeRange, maxMetrics int,
 var tagFiltersKeyBufPool bytesutil.ByteBufferPool

 func (is *indexSearch) getTSIDByMetricName(dst *TSID, metricName []byte) error {
-	dmis := is.db.getDeletedMetricIDs()
+	dmis := is.db.s.getDeletedMetricIDs()
 	ts := &is.ts
 	kb := &is.kb
 	kb.B = append(kb.B[:0], nsPrefixMetricNameToTSID)
@@ -2315,7 +2285,7 @@ func (is *indexSearch) searchMetricIDs(tfss []*TagFilters, tr TimeRange, maxMetr
 	sortedMetricIDs := metricIDs.AppendTo(nil)

 	// Filter out deleted metricIDs.
-	dmis := is.db.getDeletedMetricIDs()
+	dmis := is.db.s.getDeletedMetricIDs()
 	if dmis.Len() > 0 {
 		metricIDsFiltered := sortedMetricIDs[:0]
 		for _, metricID := range sortedMetricIDs {
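The last hunk above filters deleted metricIDs out of the sorted result by reusing the slice's backing array. A self-contained sketch of that in-place filter, with a plain map standing in for `uint64set.Set` (the `filterDeleted` name is ours):

```go
package main

import "fmt"

// filterDeleted reuses the input slice's backing array (ids[:0]) to drop
// deleted IDs without allocating a second slice, the same trick
// searchMetricIDs uses above with uint64set.Set.
func filterDeleted(ids []uint64, deleted map[uint64]bool) []uint64 {
	if len(deleted) == 0 {
		return ids // fast path: nothing was ever deleted
	}
	filtered := ids[:0]
	for _, id := range ids {
		if !deleted[id] {
			filtered = append(filtered, id)
		}
	}
	return filtered
}

func main() {
	ids := []uint64{1, 2, 3, 4, 5}
	out := filterDeleted(ids, map[uint64]bool{2: true, 4: true})
	fmt.Println(out) // [1 3 5]
}
```

Because `filtered` aliases `ids`, the caller must not keep using the original slice after the call; that is acceptable here since the IDs were just materialized by `AppendTo`.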


@@ -1711,13 +1711,15 @@ func toTFPointers(tfs []tagFilter) []*tagFilter {
 }

 func newTestStorage() *Storage {
-	return &Storage{
+	s := &Storage{
 		cachePath:       "test-storage-cache",
 		metricIDCache:   workingsetcache.New(1234, time.Hour),
 		metricNameCache: workingsetcache.New(1234, time.Hour),
 		tsidCache:       workingsetcache.New(1234, time.Hour),
 	}
+	s.setDeletedMetricIDs(&uint64set.Set{})
+	return s
 }

 func stopTestStorage(s *Storage) {


@@ -4,7 +4,6 @@ import (
 	"errors"
 	"fmt"
 	"io/ioutil"
-	"math/bits"
 	"os"
 	"path/filepath"
 	"sort"
@@ -478,81 +477,49 @@ func (rrs *rawRowsShard) Len() int {
 }

 func (rrs *rawRowsShard) addRows(pt *partition, rows []rawRow) {
-	var rrss []*rawRows
+	var rowsToFlush []rawRow

 	rrs.mu.Lock()
 	if cap(rrs.rows) == 0 {
-		rrs.rows = getRawRowsMaxSize().rows
+		n := getMaxRawRowsPerShard()
+		rrs.rows = make([]rawRow, 0, n)
 	}
 	maxRowsCount := cap(rrs.rows)
-	for {
-		capacity := maxRowsCount - len(rrs.rows)
-		if capacity >= len(rows) {
-			// Fast path - rows fit capacity.
-			rrs.rows = append(rrs.rows, rows...)
-			break
-		}
-
+	capacity := maxRowsCount - len(rrs.rows)
+	if capacity >= len(rows) {
+		// Fast path - rows fit capacity.
+		rrs.rows = append(rrs.rows, rows...)
+	} else {
 		// Slow path - rows don't fit capacity.
-		// Fill rawRows to capacity and convert it to a part.
-		rrs.rows = append(rrs.rows, rows[:capacity]...)
-		rows = rows[capacity:]
-		rr := getRawRowsMaxSize()
-		rrs.rows, rr.rows = rr.rows, rrs.rows
-		rrss = append(rrss, rr)
+		// Put rrs.rows and rows to rowsToFlush and convert it to a part.
+		rowsToFlush = append(rowsToFlush, rrs.rows...)
+		rowsToFlush = append(rowsToFlush, rows...)
+		rrs.rows = rrs.rows[:0]
 		rrs.lastFlushTime = fasttime.UnixTimestamp()
 	}
 	rrs.mu.Unlock()

-	for _, rr := range rrss {
-		pt.addRowsPart(rr.rows)
-		putRawRows(rr)
-	}
+	pt.flushRowsToParts(rowsToFlush)
 }

-type rawRows struct {
-	rows []rawRow
-}
-
-func getRawRowsMaxSize() *rawRows {
-	size := getMaxRawRowsPerShard()
-	return getRawRowsWithSize(size)
-}
-
-func getRawRowsWithSize(size int) *rawRows {
-	p, sizeRounded := getRawRowsPool(size)
-	v := p.Get()
-	if v == nil {
-		return &rawRows{
-			rows: make([]rawRow, 0, sizeRounded),
-		}
-	}
-	return v.(*rawRows)
-}
-
-func putRawRows(rr *rawRows) {
-	rr.rows = rr.rows[:0]
-	size := cap(rr.rows)
-	p, _ := getRawRowsPool(size)
-	p.Put(rr)
-}
-
-func getRawRowsPool(size int) (*sync.Pool, int) {
-	size--
-	if size < 0 {
-		size = 0
-	}
-	bucketIdx := 64 - bits.LeadingZeros64(uint64(size))
-	if bucketIdx >= len(rawRowsPools) {
-		bucketIdx = len(rawRowsPools) - 1
-	}
-	p := &rawRowsPools[bucketIdx]
-	sizeRounded := 1 << uint(bucketIdx)
-	return p, sizeRounded
-}
-
-var rawRowsPools [19]sync.Pool
+func (pt *partition) flushRowsToParts(rows []rawRow) {
+	maxRows := getMaxRawRowsPerShard()
+	var wg sync.WaitGroup
+	for len(rows) > 0 {
+		n := maxRows
+		if n > len(rows) {
+			n = len(rows)
+		}
+		wg.Add(1)
+		go func(rowsPart []rawRow) {
+			defer wg.Done()
+			pt.addRowsPart(rowsPart)
+		}(rows[:n])
+		rows = rows[n:]
+	}
+	wg.Wait()
+}

 func (pt *partition) addRowsPart(rows []rawRow) {
 	if len(rows) == 0 {
 		return
@@ -749,19 +716,14 @@ func (pt *partition) flushRawRows(isFinal bool) {
 }

 func (rrss *rawRowsShards) flush(pt *partition, isFinal bool) {
-	var wg sync.WaitGroup
-	wg.Add(len(rrss.shards))
+	var rowsToFlush []rawRow
 	for i := range rrss.shards {
-		go func(rrs *rawRowsShard) {
-			rrs.flush(pt, isFinal)
-			wg.Done()
-		}(&rrss.shards[i])
+		rowsToFlush = rrss.shards[i].appendRawRowsToFlush(rowsToFlush, pt, isFinal)
 	}
-	wg.Wait()
+	pt.flushRowsToParts(rowsToFlush)
 }

-func (rrs *rawRowsShard) flush(pt *partition, isFinal bool) {
-	var rr *rawRows
+func (rrs *rawRowsShard) appendRawRowsToFlush(dst []rawRow, pt *partition, isFinal bool) []rawRow {
 	currentTime := fasttime.UnixTimestamp()
 	flushSeconds := int64(rawRowsFlushInterval.Seconds())
 	if flushSeconds <= 0 {
@@ -770,15 +732,12 @@ func (rrs *rawRowsShard) flush(pt *partition, isFinal bool) {
 	rrs.mu.Lock()
 	if isFinal || currentTime-rrs.lastFlushTime > uint64(flushSeconds) {
-		rr = getRawRowsMaxSize()
-		rrs.rows, rr.rows = rr.rows, rrs.rows
+		dst = append(dst, rrs.rows...)
+		rrs.rows = rrs.rows[:0]
 	}
 	rrs.mu.Unlock()
-	if rr != nil {
-		pt.addRowsPart(rr.rows)
-		putRawRows(rr)
-	}
+	return dst
 }

 func (pt *partition) startInmemoryPartsFlusher() {


@@ -120,6 +120,13 @@ type Storage struct {
 	// The minimum timestamp when composite index search can be used.
 	minTimestampForCompositeIndex int64

+	// An inmemory set of deleted metricIDs.
+	//
+	// It is safe to keep the set in memory even for big number of deleted
+	// metricIDs, since it usually requires 1 bit per deleted metricID.
+	deletedMetricIDs           atomic.Value
+	deletedMetricIDsUpdateLock sync.Mutex
 }

 // OpenStorage opens storage on the given path with the given retentionMsecs.
@@ -208,6 +215,18 @@ func OpenStorage(path string, retentionMsecs int64, maxHourlySeries, maxDailySer
 	idbCurr.SetExtDB(idbPrev)
 	s.idbCurr.Store(idbCurr)

+	// Load deleted metricIDs from idbCurr and idbPrev
+	dmisCurr, err := idbCurr.loadDeletedMetricIDs()
+	if err != nil {
+		return nil, fmt.Errorf("cannot load deleted metricIDs for the current indexDB: %w", err)
+	}
+	dmisPrev, err := idbPrev.loadDeletedMetricIDs()
+	if err != nil {
+		return nil, fmt.Errorf("cannot load deleted metricIDs for the previous indexDB: %w", err)
+	}
+	s.setDeletedMetricIDs(dmisCurr)
+	s.updateDeletedMetricIDs(dmisPrev)
+
 	// Load data
 	tablePath := path + "/data"
 	tb, err := openTable(tablePath, s.getDeletedMetricIDs, retentionMsecs)
@@ -224,16 +243,29 @@ func OpenStorage(path string, retentionMsecs int64, maxHourlySeries, maxDailySer
 	return s, nil
 }

+func (s *Storage) getDeletedMetricIDs() *uint64set.Set {
+	return s.deletedMetricIDs.Load().(*uint64set.Set)
+}
+
+func (s *Storage) setDeletedMetricIDs(dmis *uint64set.Set) {
+	s.deletedMetricIDs.Store(dmis)
+}
+
+func (s *Storage) updateDeletedMetricIDs(metricIDs *uint64set.Set) {
+	s.deletedMetricIDsUpdateLock.Lock()
+	dmisOld := s.getDeletedMetricIDs()
+	dmisNew := dmisOld.Clone()
+	dmisNew.Union(metricIDs)
+	s.setDeletedMetricIDs(dmisNew)
+	s.deletedMetricIDsUpdateLock.Unlock()
+}
+
 // DebugFlush flushes recently added storage data, so it becomes visible to search.
 func (s *Storage) DebugFlush() {
 	s.tb.flushRawRows()
 	s.idb().tb.DebugFlush()
 }

-func (s *Storage) getDeletedMetricIDs() *uint64set.Set {
-	return s.idb().getDeletedMetricIDs()
-}
-
 // CreateSnapshot creates snapshot for s and returns the snapshot name.
 func (s *Storage) CreateSnapshot() (string, error) {
 	logger.Infof("creating Storage snapshot for %q...", s.path)
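The `deletedMetricIDs` state moved onto `Storage` above uses a copy-on-write set: readers do a lock-free `atomic.Value` load, while writers serialize on a mutex, clone the current set, merge into the clone, and publish it atomically. A minimal self-contained sketch of the pattern, using a map in place of `uint64set.Set` (the `cowSet` type is ours):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// cowSet sketches the Storage.deletedMetricIDs pattern above: get() is a
// lock-free atomic load on the hot read path, while update() serializes
// writers on a mutex, copies the current set, merges new IDs into the copy
// and publishes it atomically (copy-on-write).
type cowSet struct {
	v  atomic.Value // holds map[uint64]bool; never mutated after Store
	mu sync.Mutex   // serializes writers only
}

func newCowSet() *cowSet {
	cs := &cowSet{}
	cs.v.Store(map[uint64]bool{})
	return cs
}

func (cs *cowSet) get() map[uint64]bool { return cs.v.Load().(map[uint64]bool) }

func (cs *cowSet) update(ids ...uint64) {
	cs.mu.Lock()
	old := cs.get()
	next := make(map[uint64]bool, len(old)+len(ids))
	for id := range old {
		next[id] = true
	}
	for _, id := range ids {
		next[id] = true
	}
	cs.v.Store(next)
	cs.mu.Unlock()
}

func main() {
	cs := newCowSet()
	cs.update(7, 9)
	cs.update(9, 11)
	fmt.Println(len(cs.get())) // 3: {7, 9, 11}
}
```

This trade-off (copy on every write, zero locking on reads) fits deleted-metricID tracking well: deletions are rare, while every search checks the set.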


@@ -168,10 +168,10 @@ func (b *blockDec) reset(br byteBuffer, windowSize uint64) error {
 	// Read block data.
 	if cap(b.dataStorage) < cSize {
-		if b.lowMem {
+		if b.lowMem || cSize > maxCompressedBlockSize {
 			b.dataStorage = make([]byte, 0, cSize)
 		} else {
-			b.dataStorage = make([]byte, 0, maxBlockSize)
+			b.dataStorage = make([]byte, 0, maxCompressedBlockSize)
 		}
 	}
 	if cap(b.dst) <= maxSize {
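The `blockDec.reset` change above decides how big a reusable buffer to allocate: normally it is grown to a fixed cap so it can be reused across blocks, but low-memory mode and oversized blocks allocate exactly the requested size instead of pinning a large buffer. A standalone sketch of just that sizing decision (the `allocSize` helper is ours):

```go
package main

import "fmt"

// allocSize mirrors the sizing rule above: reuse-friendly fixed cap by
// default, exact-size allocation when memory is tight (lowMem) or when the
// request exceeds the fixed cap anyway.
func allocSize(cSize, maxBlockSize int, lowMem bool) int {
	if lowMem || cSize > maxBlockSize {
		return cSize
	}
	return maxBlockSize
}

func main() {
	const maxSize = 1 << 17
	fmt.Println(allocSize(1024, maxSize, false)) // small block: keep big reusable cap (131072)
	fmt.Println(allocSize(1024, maxSize, true))  // lowMem: allocate only what is needed (1024)
	fmt.Println(allocSize(maxSize+1, maxSize, false) == maxSize+1)
}
```

Adding the `cSize > max` branch matters for hostile input: without it, a frame claiming a huge block size could otherwise force an unnecessarily large up-front allocation on the non-lowMem path.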


@@ -17,14 +17,16 @@ type decoderOptions struct {
 	lowMem         bool
 	concurrent     int
 	maxDecodedSize uint64
+	maxWindowSize  uint64
 	dicts          []dict
 }

 func (o *decoderOptions) setDefault() {
 	*o = decoderOptions{
 		// use less ram: true for now, but may change.
 		lowMem:        true,
 		concurrent:    runtime.GOMAXPROCS(0),
+		maxWindowSize: MaxWindowSize,
 	}
 	o.maxDecodedSize = 1 << 63
 }
@@ -52,7 +54,6 @@ func WithDecoderConcurrency(n int) DOption {
 // WithDecoderMaxMemory allows to set a maximum decoded size for in-memory
 // non-streaming operations or maximum window size for streaming operations.
 // This can be used to control memory usage of potentially hostile content.
-// For streaming operations, the maximum window size is capped at 1<<30 bytes.
 // Maximum and default is 1 << 63 bytes.
 func WithDecoderMaxMemory(n uint64) DOption {
 	return func(o *decoderOptions) error {
@@ -81,3 +82,21 @@ func WithDecoderDicts(dicts ...[]byte) DOption {
 		return nil
 	}
 }
+
+// WithDecoderMaxWindow allows to set a maximum window size for decodes.
+// This allows rejecting packets that will cause big memory usage.
+// The Decoder will likely allocate more memory based on the WithDecoderLowmem setting.
+// If WithDecoderMaxMemory is set to a lower value, that will be used.
+// Default is 512MB, Maximum is ~3.75 TB as per zstandard spec.
+func WithDecoderMaxWindow(size uint64) DOption {
+	return func(o *decoderOptions) error {
+		if size < MinWindowSize {
+			return errors.New("WithMaxWindowSize must be at least 1KB, 1024 bytes")
+		}
+		if size > (1<<41)+7*(1<<38) {
+			return errors.New("WithMaxWindowSize must be less than (1<<41) + 7*(1<<38) ~ 3.75TB")
+		}
+		o.maxWindowSize = size
+		return nil
+	}
+}


@@ -22,10 +22,6 @@ type frameDec struct {
 	WindowSize uint64

-	// maxWindowSize is the maximum windows size to support.
-	// should never be bigger than max-int.
-	maxWindowSize uint64
-
 	// In order queue of blocks being decoded.
 	decoding chan *blockDec
@@ -50,8 +46,11 @@ type frameDec struct {
 }

 const (
-	// The minimum Window_Size is 1 KB.
+	// MinWindowSize is the minimum Window Size, which is 1 KB.
 	MinWindowSize = 1 << 10
+
+	// MaxWindowSize is the maximum encoder window size
+	// and the default decoder maximum window size.
 	MaxWindowSize = 1 << 29
 )
@@ -61,12 +60,11 @@ var (
 )

 func newFrameDec(o decoderOptions) *frameDec {
-	d := frameDec{
-		o:             o,
-		maxWindowSize: MaxWindowSize,
+	if o.maxWindowSize > o.maxDecodedSize {
+		o.maxWindowSize = o.maxDecodedSize
 	}
-	if d.maxWindowSize > o.maxDecodedSize {
-		d.maxWindowSize = o.maxDecodedSize
+	d := frameDec{
+		o: o,
 	}
 	return &d
 }
@@ -251,13 +249,17 @@ func (d *frameDec) reset(br byteBuffer) error {
 		}
 	}

-	if d.WindowSize > d.maxWindowSize {
-		printf("window size %d > max %d\n", d.WindowSize, d.maxWindowSize)
+	if d.WindowSize > uint64(d.o.maxWindowSize) {
+		if debugDecoder {
+			printf("window size %d > max %d\n", d.WindowSize, d.o.maxWindowSize)
+		}
 		return ErrWindowSizeExceeded
 	}

 	// The minimum Window_Size is 1 KB.
 	if d.WindowSize < MinWindowSize {
-		println("got window size: ", d.WindowSize)
+		if debugDecoder {
+			println("got window size: ", d.WindowSize)
+		}
 		return ErrWindowSizeTooSmall
 	}
 	d.history.windowSize = int(d.WindowSize)
@@ -352,8 +354,8 @@ func (d *frameDec) checkCRC() error {

 func (d *frameDec) initAsync() {
 	if !d.o.lowMem && !d.SingleSegment {
-		// set max extra size history to 10MB.
-		d.history.maxSize = d.history.windowSize + maxBlockSize*5
+		// set max extra size history to 2MB.
+		d.history.maxSize = d.history.windowSize + maxBlockSize
 	}
 	// re-alloc if more than one extra block size.
 	if d.o.lowMem && cap(d.history.b) > d.history.maxSize+maxBlockSize {

vendor/modules.txt

@@ -129,7 +129,7 @@ github.com/jmespath/go-jmespath
 github.com/jstemmer/go-junit-report
 github.com/jstemmer/go-junit-report/formatter
 github.com/jstemmer/go-junit-report/parser
-# github.com/klauspost/compress v1.13.0
+# github.com/klauspost/compress v1.13.1
 ## explicit
 github.com/klauspost/compress/flate
 github.com/klauspost/compress/fse