weight | title | menu | aliases
---|---|---|---
4 | Year 2022 | |
v1.85.3
Released at 2022-12-20
Update note 1: This and newer releases of VictoriaMetrics may return gaps for rate(m[d]) queries on short time ranges if the [d] lookbehind window is set explicitly, for example rate(http_requests_total[$__interval]). This reduces confusion when the user expects results calculated over exactly the explicitly set lookbehind window. See this issue. The previous gap-filling behaviour can be restored by removing the explicit lookbehind window [d] from the query, e.g. by substituting rate(m[d]) with rate(m). See these docs for details.
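The difference between an explicit and an implicit lookbehind window can be illustrated with a deliberately simplified model. The sketch below is not VictoriaMetrics code: the function name `simple_rate`, the `explicit` parameter and the sample data are assumptions made for illustration, and counter resets are ignored.

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value), sorted by timestamp


def simple_rate(samples: List[Sample], t: float, window: float, explicit: bool) -> Optional[float]:
    """Per-second rate at time t over the lookbehind window (t - window, t].

    Hypothetical model: with an explicit window a gap (None) is returned when
    fewer than two samples fall inside it; with an implicit window the window
    is extended back to the previous sample.
    """
    inside = [s for s in samples if t - window < s[0] <= t]
    if len(inside) < 2:
        if explicit or not inside:
            return None  # gap in the query result
        earlier = [s for s in samples if s[0] <= t - window]
        if not earlier:
            return None
        inside = [earlier[-1]] + inside  # extend the window to cover two samples
    (t0, v0), (t1, v1) = inside[0], inside[-1]
    return (v1 - v0) / (t1 - t0)


samples = [(0.0, 0.0), (30.0, 60.0), (60.0, 120.0)]
print(simple_rate(samples, 60.0, 20.0, explicit=True))   # None
print(simple_rate(samples, 60.0, 20.0, explicit=False))  # 2.0
```

With a 20-second window and a 30-second scrape interval only one sample falls inside the window, so the explicit form yields a gap while the implicit form still produces a value.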
- BUGFIX: fix the "error when searching for TSIDs by metricIDs in the previous indexdb: EOF" error, which can occur during queries after an unclean shutdown of VictoriaMetrics (e.g. via hardware reset, out of memory crash or kill -9). The error was introduced in v1.85.2. See this issue.
- BUGFIX: VictoriaMetrics enterprise: expose proper values for vm_downsampling_partitions_scheduled and vm_downsampling_partitions_scheduled_size_bytes metrics, which were added in v1.78.0. See this feature request.
- BUGFIX: MetricsQL: never extend an explicitly set lookbehind window for the rate() function. This reduces confusion when the user expects results calculated over exactly the lookbehind window [d] set in the query rate(m[d]). Previously VictoriaMetrics could silently extend the lookbehind window, so it covers at least two raw samples. Now the window is extended only if the lookbehind window in square brackets isn't set explicitly, e.g. in the case of rate(m). See this issue for details.
- BUGFIX: vmagent: respect the -usePromCompatibleNaming flag if no relabeling or extra labels were set. See this issue for details.
- BUGFIX: vmui: fix the wrong legend when queries are hidden. See this issue.
- BUGFIX: vmui: fix incorrect time selection after the timezone change. See this pull request.
v1.85.2
Released at 2022-12-19
- FEATURE: support overriding of the -search.latencyOffset value via the URL param latency_offset when performing requests to /api/v1/query and /api/v1/query_range. See this issue.
- FEATURE: allow changing field names in JSON logs if VictoriaMetrics components are started with the -loggerFormat=json command-line flag. The field names can be changed with the -loggerJSONFields command-line flag. For example, -loggerJSONFields=ts:timestamp,msg:message would rename the ts and msg fields in the output JSON to timestamp and message fields. See this feature request. Thanks to @michal-kralik for the pull request.
- FEATURE: vmagent: expose __meta_consul_tag_<tagname> and __meta_consul_tagpresent_<tagname> labels for targets discovered via consul_sd_configs. This simplifies converting Consul service tags to target labels with a simple relabeling rule:

      - action: labelmap
        regex: __meta_consul_tag_(.+)

  This resolves this StackOverflow question.
- BUGFIX: properly return query results for time series, which stop receiving new samples after the rotation of indexdb. Previously such time series could be missing in query results. See this issue. The issue was introduced in v1.83.0.
- BUGFIX: allow specifying values bigger than 2GiB for the following command-line flags on 32-bit architectures (386 and arm): -storage.minFreeDiskSpaceBytes and -remoteWrite.maxDiskUsagePerURL. Previously values bigger than 2GiB were incorrectly truncated on these architectures.
- BUGFIX: vmagent: stop dropping the metric name by mistake on the /metric-relabel-debug page.
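The labelmap relabeling rule mentioned in the v1.85.2 notes above can be sketched as follows. This is an illustrative model of the labelmap action as described in Prometheus relabeling docs, not vmagent code; the function name `labelmap` and the example label values are assumptions.

```python
import re
from typing import Dict


def labelmap(labels: Dict[str, str], regex: str, replacement: str = r"\1") -> Dict[str, str]:
    """Copy the value of every label whose name fully matches regex
    into a new label named by the replacement template."""
    pattern = re.compile(regex)
    result = dict(labels)  # the original labels are kept
    for name, value in labels.items():
        match = pattern.fullmatch(name)
        if match:
            result[match.expand(replacement)] = value
    return result


# Hypothetical labels of a target discovered via consul_sd_configs.
discovered = {
    "__meta_consul_tag_env": "prod",
    "__meta_consul_tagpresent_env": "true",
    "job": "consul",
}
relabeled = labelmap(discovered, r"__meta_consul_tag_(.+)")
print(relabeled["env"])  # prod
```

The `__meta_consul_tagpresent_env` label is left alone, since its name does not match the `__meta_consul_tag_(.+)` pattern.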
v1.85.1
Released at 2022-12-14
It is recommended to upgrade to VictoriaMetrics v1.85.2 because of the bug, which may result in incomplete query results for historical time series.
- FEATURE: vmalert: support $for or .For template variables in alert's annotations. See this issue.
- BUGFIX: DataDog protocol parser: do not re-use host and device fields from the previously parsed messages if these fields are missing in the currently parsed message. See this issue.
- BUGFIX: reduce CPU usage when the regex-based relabeling rules are applied to more than 100K unique Graphite metrics. See this issue. The issue was introduced in v1.82.0.
- BUGFIX: do not block merges of small parts by merges of big parts on hosts with a small number of CPU cores. This issue could result in an increasing number of storage/small parts while a big merge is in progress. This, in turn, could result in increased CPU usage and memory usage during querying, since queries need to inspect a bigger number of small parts. The issue was introduced in v1.85.0.
- BUGFIX: vmbackup: fix the "The source request body for synchronous copy is too large and exceeds the maximum permissible limit (256MB)" error when performing backups to Azure blob storage. See this issue.
v1.85.0
Released at 2022-12-11
It is recommended to upgrade to VictoriaMetrics v1.85.2 because of the bug, which may result in incomplete query results for historical time series.
Update note 1: this release drops support for direct upgrade from VictoriaMetrics versions prior to v1.28.0. Please upgrade to v1.84.0, wait until the "finished round 2 of background conversion" line is emitted to the log by single-node VictoriaMetrics or by vmstorage, and then upgrade to newer releases.
Update note 2: this release splits type="indexdb" metrics into type="indexdb/inmemory" and type="indexdb/file" metrics. This may break old dashboards and alerting rules, which contain the label filter {type="indexdb"}. Such a label filter must be substituted with {type=~"indexdb.*"}, so it matches indexdb from the previous releases and indexdb/inmemory + indexdb/file from new releases. It is recommended to upgrade to the latest available dashboards and alerting rules mentioned in these docs, since they already contain fixed label filters.
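The effect of switching from the exact filter to the regex filter can be checked with a small sketch. It assumes that regex label filters are fully anchored, as in PromQL; the helper name `matches_filter` is hypothetical, and Python's `re` module stands in for the regex engine used by VictoriaMetrics.

```python
import re


def matches_filter(value: str, regex: str) -> bool:
    """Return True if the label value fully matches the regex filter."""
    return re.fullmatch(regex, value) is not None


type_values = ["indexdb", "indexdb/inmemory", "indexdb/file", "storage/inmemory"]

# {type=~"indexdb.*"} covers the old and the new label values.
matched = [v for v in type_values if matches_filter(v, "indexdb.*")]
print(matched)  # ['indexdb', 'indexdb/inmemory', 'indexdb/file']

# {type="indexdb"} is an exact comparison, so it misses the new values.
exact = [v for v in type_values if v == "indexdb"]
print(exact)  # ['indexdb']
```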
Update note 3: this release deprecates relabel_debug and metric_relabel_debug config options in scrape_configs. The -relabelDebug, -remoteWrite.relabelDebug and -remoteWrite.urlRelabelDebug command-line options are also deprecated. Use the more powerful target-level relabel debugging and metric-level relabel debugging instead, as documented here.
-
FEATURE: vmagent: provide enhanced target-level and metric-level relabel debugging. See these docs and this issue.
- FEATURE: leave a sample with the biggest value for identical timestamps per each -dedup.minScrapeInterval discrete interval when the deduplication is enabled. See this issue.
- FEATURE: add -inmemoryDataFlushInterval command-line flag, which can be used for controlling the frequency of in-memory data flush to disk. The data flush frequency can be reduced when VictoriaMetrics stores data to low-end flash device with limited number of write cycles (for example, on Raspberry PI). See this feature request.
- FEATURE: expose additional metrics for indexdb and storage parts stored in memory and for indexdb parts stored in files (see storage docs for technical details):
  - vm_active_merges{type="storage/inmemory"} - active merges for in-memory storage parts
  - vm_active_merges{type="indexdb/inmemory"} - active merges for in-memory indexdb parts
  - vm_active_merges{type="indexdb/file"} - active merges for file-based indexdb parts
  - vm_merges_total{type="storage/inmemory"} - the total merges for in-memory storage parts
  - vm_merges_total{type="indexdb/inmemory"} - the total merges for in-memory indexdb parts
  - vm_merges_total{type="indexdb/file"} - the total merges for file-based indexdb parts
  - vm_rows_merged_total{type="storage/inmemory"} - the total rows merged for in-memory storage parts
  - vm_rows_merged_total{type="indexdb/inmemory"} - the total rows merged for in-memory indexdb parts
  - vm_rows_merged_total{type="indexdb/file"} - the total rows merged for file-based indexdb parts
  - vm_rows_deleted_total{type="storage/inmemory"} - the total rows deleted for in-memory storage parts
  - vm_assisted_merges_total{type="storage/inmemory"} - the total number of assisted merges for in-memory storage parts
  - vm_assisted_merges_total{type="indexdb/inmemory"} - the total number of assisted merges for in-memory indexdb parts
  - vm_parts{type="storage/inmemory"} - the total number of in-memory storage parts
  - vm_parts{type="indexdb/inmemory"} - the total number of in-memory indexdb parts
  - vm_parts{type="indexdb/file"} - the total number of file-based indexdb parts
  - vm_blocks{type="storage/inmemory"} - the total number of in-memory storage blocks
  - vm_blocks{type="indexdb/inmemory"} - the total number of in-memory indexdb blocks
  - vm_blocks{type="indexdb/file"} - the total number of file-based indexdb blocks
  - vm_data_size_bytes{type="storage/inmemory"} - the total size of in-memory storage blocks
  - vm_data_size_bytes{type="indexdb/inmemory"} - the total size of in-memory indexdb blocks
  - vm_data_size_bytes{type="indexdb/file"} - the total size of file-based indexdb blocks
  - vm_rows{type="storage/inmemory"} - the total number of in-memory storage rows
  - vm_rows{type="indexdb/inmemory"} - the total number of in-memory indexdb rows
  - vm_rows{type="indexdb/file"} - the total number of file-based indexdb rows
- FEATURE: DataDog parser: add the device tag when the device field is present in the series object of the input request. Thanks to @PerGon for the provided pull request.
- FEATURE: vmagent: improve service discovery performance when discovering a big number of targets (10K and more).
- FEATURE: vmagent: allow using the series_limit option for limiting the number of series a single scrape target generates in stream parsing mode. See this feature request.
- FEATURE: vmagent: allow using the sample_limit option for limiting the number of metrics a single scrape target can expose in every response sent over stream parsing mode.
- FEATURE: vmagent: add the exported_ prefix to metric names exported by scrape targets if these metric names clash with automatically generated metrics such as up, scrape_samples_scraped, etc. This prevents corruption of automatically generated metrics. See this issue.
- FEATURE: vmagent: make the host label optional in DataDog data ingestion protocol. See this issue.
- FEATURE: VictoriaMetrics cluster: improve error message when the requested path cannot be properly parsed, so users could identify the issue and properly fix the path. Now the error message links to url format docs. See this issue.
- FEATURE: VictoriaMetrics enterprise cluster: add -storageNode.discoveryInterval command-line flag to vmselect and vminsert to control load on DNS servers when automatic discovery of vmstorage nodes is enabled. See this issue.
- FEATURE: VictoriaMetrics enterprise cluster: allow reading and updating the list of vmstorage nodes at vmselect and vminsert nodes via file. See automatic discovery of vmstorage for details.
- FEATURE: vmalert: reduce memory and CPU usage by up to 50% on setups with thousands of recording/alerting groups. See this issue.
- FEATURE: vmalert: add -remoteWrite.sendTimeout command-line flag, which allows configuring timeout for sending data to -remoteWrite.url. See this issue.
- FEATURE: vmctl: add ability to migrate data between VictoriaMetrics clusters with automatic tenants discovery. See these docs and this issue.
- FEATURE: vmctl: add ability to copy data from sources via Prometheus remote_read protocol. See these docs. The related issues: one and two.
- FEATURE: vmui: allow changing timezones for the requested data. See this issue.
- FEATURE: vmui: provide fast path for hiding results for all the queries except the given one by clicking the eye icon with the ctrl key pressed. See this feature request.
- FEATURE: MetricsQL: add range_trim_spikes(phi, q) function for trimming phi percent of the largest spikes per each time series returned by q. See these docs.
- FEATURE: MetricsQL: allow passing inf arg into limitk, topk, bottomk and other functions, which accept numeric arg, which limits the number of output time series. See this feature request.
- FEATURE: vmgateway: add support for JWT token signature verification. See these docs for details.
-
FEATURE: put the version of VictoriaMetrics in the first message of query trace. This should simplify debugging.
- BUGFIX: vmagent: fix the "The request did not have a subscription or a valid tenant level resource provider" error when discovering Azure targets with azure_sd_configs. See this issue.
- BUGFIX: vmalert: properly pass HTTP headers during the alert state restore procedure. See this issue.
- BUGFIX: vmalert: properly specify rule evaluation step during the replay mode. The step value was previously overridden by the -datasource.queryStep command-line flag.
- BUGFIX: vmalert: properly return the error message from remote-write failures. Previously the error was ignored and only vmalert_remotewrite_errors_total was incremented.
- BUGFIX: vmui: fix sticky tooltip sizing, which could prevent the tooltip from closing. See this issue.
-
BUGFIX: vmui: properly put multi-line queries in the url, so it could be copy-n-pasted and opened without issues in a new browser tab. Previously the url for multi-line query couldn't be opened. See this issue.
- BUGFIX: vmui: correctly handle up and down keypresses when editing multi-line queries. See this issue.
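The deduplication change listed for v1.85.0 above (keeping the sample with the biggest value among identical timestamps) can be sketched as follows. This is a simplified model rather than the storage code: it assumes that a single sample with the largest timestamp is kept per discrete interval, and the way timestamps are assigned to intervals here (integer division) is an assumption. The function name `deduplicate` is hypothetical.

```python
from typing import Dict, List, Tuple

Sample = Tuple[int, float]  # (timestamp in milliseconds, value)


def deduplicate(samples: List[Sample], interval_ms: int) -> List[Sample]:
    """Keep one sample per discrete interval: the one with the largest
    timestamp, and the biggest value when timestamps are identical."""
    kept: Dict[int, Sample] = {}
    for ts, value in samples:
        bucket = ts // interval_ms  # assumed interval boundaries
        best = kept.get(bucket)
        # Tuple comparison: larger timestamp wins, ties go to the larger value.
        if best is None or (ts, value) > best:
            kept[bucket] = (ts, value)
    return [kept[bucket] for bucket in sorted(kept)]


raw = [(1000, 1.0), (1500, 3.0), (1500, 2.0), (2500, 5.0)]
print(deduplicate(raw, 1000))  # [(1500, 3.0), (2500, 5.0)]
```

Picking the biggest value makes the result independent of the order in which the duplicate samples arrive.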
v1.84.0
Released at 2022-11-25
It is recommended to upgrade to VictoriaMetrics v1.85.2 because of the bug, which may result in incomplete query results for historical time series.
- FEATURE: add support for Pushgateway data import format via the /api/v1/import/prometheus url. See these docs and this issue. Thanks to @PerGon for the initial implementation.
- FEATURE: VictoriaMetrics cluster: add http://<vmselect>:8481/admin/tenants API endpoint for returning a list of registered tenants. See these docs for details.
- FEATURE: VictoriaMetrics enterprise: add -storageNode.filter command-line flag for filtering the discovered vmstorage nodes with arbitrary regular expressions. See this feature request.
- FEATURE: MetricsQL: allow using numeric values with K, Ki, M, Mi, G, Gi, T and Ti suffixes inside MetricsQL queries. For example, 8Ki equals 8*1024, while 8.2M equals 8.2*1000*1000.
- FEATURE: MetricsQL: add range_normalize function for normalizing multiple time series into the [0...1] value range. This function is useful for correlation analysis of time series with distinct value ranges. See this issue.
- FEATURE: MetricsQL: add range_linear_regression function for calculating simple linear regression over the input time series on the selected time range. This function is useful for predictions and capacity planning. For example, range_linear_regression(process_resident_memory_bytes) can predict future memory usage based on the past memory usage.
- FEATURE: MetricsQL: add range_stddev and range_stdvar functions.
- FEATURE: MetricsQL: optimize expr1 op expr2 query when expr1 returns an empty result. In this case there is no sense in executing expr2 for op not equal to or, since the end result will be empty according to PromQL series matching rules. See this issue. Thanks to @jianglinjian for pointing to this case.
- FEATURE: vmui: add the ability to upload/paste JSON to investigate the trace. See this issue and this pull request.
-
FEATURE: vmui: reduce JS bundle size from 200Kb to 100Kb. See this pull request.
- FEATURE: vmui: add the ability to hide results of a particular query by clicking the eye icon. See this pull request.
- FEATURE: vmui: add copy button to row on Table view. The button copies the row in MetricsQL format. See this issue.
-
FEATURE: vmui: add compact table view. See this issue.
-
FEATURE: vmui: add the ability to "stick" a tooltip on the chart by clicking on a data point. See this issue and this pull request
-
FEATURE: vmui: add the ability to set up series custom limits. See this issue.
-
FEATURE: vmalert: add default alert list for vmalert's metrics. See alerts-vmalert.yml.
- FEATURE: vmagent: expose vmagent_relabel_config_*, vm_relabel_config_* and vm_promscrape_config_* metrics for tracking relabel and scrape configuration hot-reloads. See this issue.
- BUGFIX: MetricsQL: properly return an empty result from limit_offset if the offset arg exceeds the number of inner time series. See this issue.
- BUGFIX: vmagent: properly discover GCE zones when the filter option is set at gce_sd_configs. See this issue.
- BUGFIX: vmui: properly display the requested graph on the requested time range when navigating from Prometheus URL in Grafana.
-
BUGFIX: vmui: properly display wide tables. See this issue.
-
BUGFIX: reduce CPU usage spikes and memory usage spikes under high data ingestion rate introduced in v1.83.0. See this issue.
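The numeric suffixes added to MetricsQL in v1.84.0 can be illustrated with a small parser. This is only a sketch of the arithmetic stated in the release notes above (decimal multipliers for K, M, G, T and binary multipliers for Ki, Mi, Gi, Ti); the function name `parse_number` is hypothetical and the real MetricsQL parser may accept more forms.

```python
# Two-letter binary suffixes come first, so "8Ki" is not misread as an
# unknown suffix ending in "i".
_MULTIPLIERS = [
    ("Ki", 1024), ("Mi", 1024 ** 2), ("Gi", 1024 ** 3), ("Ti", 1024 ** 4),
    ("K", 1000), ("M", 1000 ** 2), ("G", 1000 ** 3), ("T", 1000 ** 4),
]


def parse_number(text: str) -> float:
    """Parse a number with an optional K/Ki/M/Mi/G/Gi/T/Ti suffix."""
    for suffix, multiplier in _MULTIPLIERS:
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * multiplier
    return float(text)


print(parse_number("8Ki"))  # 8192.0
print(parse_number("8.2M"))
print(parse_number("42"))   # 42.0
```

Fractional inputs such as 8.2M are subject to ordinary floating-point rounding, so comparisons against them are best done with a tolerance.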
v1.83.1
Released at 2022-11-10
It is recommended to upgrade to VictoriaMetrics v1.85.2 because of the bug, which may result in incomplete query results for historical time series.
- FEATURE: vmagent: expose the __meta_consul_partition label for targets discovered via consul_sd_configs in the same way as Prometheus 2.40 does.
- FEATURE: vmui: show the query trace in JSON view. See this issue. Thanks to @michal-kralik for the pull request.
- BUGFIX: VictoriaMetrics enterprise: fix a panic at vminsert when the discovered list of vmstorage nodes is changed during automatic vmstorage discovery. See this issue.
- BUGFIX: properly register new time series in per-day inverted index if they were ingested during the last 10 seconds of the day. See this issue. Thanks to @lmarszal for the bugreport and for the initial fix.
-
BUGFIX: reduce the increased memory usage spikes for some workloads. The issue was introduced in v1.83.0.
-
BUGFIX: properly accept OpenTSDB telnet put lines without tags without the need to specify the trailing whitespace. See this issue.
v1.83.0
Released at 2022-10-29
It is recommended to upgrade to VictoriaMetrics v1.85.2 because of the bug, which may result in incomplete query results for historical time series.
Update note 1: the indexdb/tagFilters cache type at /metrics has been renamed to indexdb/tagFiltersToMetricIDs in order to make its purpose clearer.
Update note 2: vmalert: the crlfEscape template function becomes obsolete starting from this release. It can be safely removed from alerting templates, since \n chars are properly escaped with other *Escape functions now. See this and this issue for details.
- FEATURE: VictoriaMetrics enterprise: add support for automatic vmstorage nodes discovering and updating at vmselect and vminsert. See these docs.
- FEATURE: VictoriaMetrics enterprise: allow configuring multiple retentions for distinct sets of time series. See these docs, this and this feature request.
-
FEATURE: VictoriaMetrics cluster enterprise: add support for multiple retentions for distinct tenants. See these docs and this and this feature request.
- FEATURE: allow limiting memory usage on a per-query basis with the -search.maxMemoryPerQuery command-line flag. See this feature request.
- FEATURE: allow referring environment variables inside command-line flags via %{ENV_VAR} syntax. For example, if the AUTH_KEY=top-secret environment variable is set, then the -metricsAuthKey=%{AUTH_KEY} command-line flag is automatically expanded to -metricsAuthKey=top-secret at VictoriaMetrics startup. See these docs for details.
- FEATURE: allow referring environment variables inside other environment variables via %{ENV_VAR} syntax. For example, if A=a-%{B}, B=b-%{C} and C=c env vars are set, then VictoriaMetrics components automatically expand them to A=a-b-c, B=b-c and C=c on startup.
- FEATURE: vmagent: drop all the labels with __ prefix from discovered targets in the same way as Prometheus does according to this article. Previously the following labels were available during metric-level relabeling: __address__, __scheme__, __metrics_path__, __scrape_interval__, __scrape_timeout__, __param_*. Now these labels are available only during target-level relabeling. This should reduce CPU usage and memory usage for vmagent setups, which scrape a big number of targets.
- FEATURE: vmagent: improve the performance for metric-level relabeling, which can be applied via the metric_relabel_configs section at scrape_configs, via -remoteWrite.relabelConfig or via -remoteWrite.urlRelabelConfig command-line options.
- FEATURE: vmagent: allow specifying full url in scrape target addresses (aka __address__ label). This makes valid the following -promscrape.config:

      scrape_configs:
      - job_name: abc
        metrics_path: /foo/bar
        scheme: https
        static_configs:
        - targets:
          # the following targets are scraped by the provided full urls
          - 'http://host1/metric/path1'
          - 'https://host2/metric/path2'
          - 'http://host3:1234/metric/path3?arg1=value1'
          # the following target is scraped by <scheme>://host4:1234<metrics_path>
          - host4:1234
- FEATURE: vmagent: allow controlling staleness tracking on a per-scrape_config basis by specifying no_stale_markers: true or no_stale_markers: false option in the corresponding scrape_config.
- FEATURE: vmalert: add strvalue and stripDomain template functions in order to improve compatibility with Prometheus.
- FEATURE: vmalert: add jsonEscape and htmlEscape template functions.
- FEATURE: vmui: limit the number of plotted series. This should prevent browser crashes or hangs when the query returns a big number of time series. See this feature request.
-
FEATURE: vmui: reduce memory usage when querying big number of time series. See this issue.
-
FEATURE: vmui: add responsive styles for small screens. See this issue and this pull request.
- FEATURE: log error if some environment variables referred at -promscrape.config via %{ENV_VAR} aren't found. This should prevent silently using incorrect config files.
- FEATURE: immediately shut down VictoriaMetrics apps on the second SIGINT or SIGTERM signal if they couldn't be finished gracefully for some reason after receiving the first signal.
- FEATURE: improve the performance of /api/v1/series endpoint by eliminating loading of unused TSID data during the API call.
- FEATURE: vmbackupmanager: add functionality for automated restore from backup. See these docs.
- BUGFIX: MetricsQL: properly merge buckets with identical le values, but with different string representation of these values when calculating histogram_quantile and histogram_share. For example, http_request_duration_seconds_bucket{le="5"} and http_request_duration_seconds_bucket{le="5.0"}. Such buckets may be returned from distinct targets. Thanks to @647-coder for the pull request.
- BUGFIX: vmalert: change severity level for log messages about failed attempts for sending data to remote storage from error to warn. The message about all failed send attempts remains at error severity level.
- BUGFIX: vmalert: fix panic if vmalert runs with the -clusterMode command-line flag in multitenant mode. The issue was introduced in v1.82.0.
- BUGFIX: vmalert: properly escape string passed to the quotesEscape template function, so it can be safely embedded into JSON string. This makes the crlfEscape function obsolete. See this and this issue.
- BUGFIX: vmagent: do not show invalid error message in Kubernetes service discovery: "cannot parse WatchEvent json response: EOF". The invalid error message appeared in v1.82.0.
- BUGFIX: vmagent: properly add the exported_ prefix to metric labels, which clash with scrape target labels if the honor_labels: true option isn't set in scrape_config. Previously some exported_ prefixes were missing in the resulting metric labels. See this issue. The issue was introduced in v1.82.0.
- BUGFIX: vmselect: expose missing metric vm_cache_size_max_bytes{type="promql/rollupResult"}. This metric is used for monitoring rollup cache usage with the query vm_cache_size_bytes{type="promql/rollupResult"} / vm_cache_size_max_bytes{type="promql/rollupResult"} in the same way as this is done for other cache types.
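The %{ENV_VAR} expansion described in the v1.83.0 notes above, including references nested inside other environment variables, can be sketched as a recursive substitution. This is an illustrative model only: the function name `expand`, the nesting limit and the error raised for unknown variables are assumptions, not the documented behaviour of VictoriaMetrics.

```python
import re
from typing import Dict

_REFERENCE = re.compile(r"%\{([^}]+)\}")


def expand(value: str, env: Dict[str, str], depth: int = 0) -> str:
    """Replace every %{NAME} in value with the expanded value of env[NAME]."""
    if depth > 10:  # assumed guard against cyclic references
        raise ValueError("too many nested %{...} references")

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name!r} is not set")
        return expand(env[name], env, depth + 1)

    return _REFERENCE.sub(substitute, value)


env = {"A": "a-%{B}", "B": "b-%{C}", "C": "c", "AUTH_KEY": "top-secret"}
print(expand(env["A"], env))                        # a-b-c
print(expand("-metricsAuthKey=%{AUTH_KEY}", env))   # -metricsAuthKey=top-secret
```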
v1.82.1
Released at 2022-10-14
- BUGFIX: vmui: automatically update graph, legend and url after the removal of query field. See this feature request and this comment.
- BUGFIX: vmalert: remove duplicate alertname JSON entry from generated alerts. See this issue. Thanks to @Howie59 for the fix!
- BUGFIX: vmalert: fix integration with Grafana via -vmalert.proxyURL, which has been broken in v1.82.0. See this issue.
- BUGFIX: vmbackup: set default region to us-east-1 if the AWS_REGION environment variable isn't set. The issue was introduced in vmbackup v1.82.0. See this pull request.
- BUGFIX: vmbackupmanager: fix deletion of old backups at Azure blob storage.
- BUGFIX: MetricsQL: properly apply regex filters when searching for time series. Previously unexpected time series could be returned from regex filter. See this issue. The issue was introduced in v1.82.0.
- BUGFIX: vmagent: properly apply the if section with regex filters. Previously unexpected metrics could be returned from the if section. The issue was introduced in v1.82.0.
v1.82.0
Released at 2022-10-07
It isn't recommended to use VictoriaMetrics and vmagent v1.82.0 because of the bug, which may result in incorrect query results and relabeling results. Upgrade to v1.82.1 instead.
Update note 1: this release changes data format for /api/v1/export/native in incompatible way, so it cannot be imported into older version of VictoriaMetrics via /api/v1/import/native.
Update note 2: vmalert changes the default value for the command-line flag -datasource.queryStep from 0s to 5m. The change is supposed to improve reliability of the rules evaluation when the evaluation interval is lower than the scraping interval.
Update note 3: vm_account_id and vm_project_id labels must be passed to tcp-based Graphite, InfluxDB and OpenTSDB endpoints at VictoriaMetrics cluster instead of undocumented VictoriaMetrics_AccountID and VictoriaMetrics_ProjectID labels when writing samples to the needed tenant. See these docs for details.
- FEATURE: VictoriaMetrics cluster: support specifying tenant ids via vm_account_id and vm_project_id labels. See these docs and this feature request.
- FEATURE: vmagent: improve relabeling performance by up to 3x for non-trivial regex values such as ([^:]+):.+, which can be used for extracting a host part from host:port label value.
- FEATURE: MetricsQL: improve performance by up to 4x for queries containing non-trivial regex filters such as {path=~"/foo/.+|/bar"}.
- FEATURE: improve performance scalability on systems with many CPU cores for /federate and /api/v1/export/... endpoints.
- FEATURE: sanitize metric names for data ingested via DataDog protocol according to DataDog metric naming. The behaviour can be disabled by passing the -datadog.sanitizeMetricName=false command-line flag. Thanks to @PerGon for the pull request.
- FEATURE: add -usePromCompatibleNaming command-line flag to vmagent, to single-node VictoriaMetrics and to the vminsert component of VictoriaMetrics cluster. This flag can be used for normalizing the ingested metric names and label names to Prometheus-compatible form. If this flag is set, then all the chars unsupported by Prometheus are replaced with _ chars in metric names and labels of the ingested samples. See this feature request.
- FEATURE: accept whitespace in metric names and tags ingested via Graphite plaintext protocol according to the specs. See this issue.
-
FEATURE: check the correctness of raw sample timestamps stored on disk when reading them. This reduces the probability of possible silent corruption of the data stored on disk. This should help this and this issue.
- FEATURE: atomically delete directories with snapshots, parts and partitions at storage level. Previously such directories could be left in a partially deleted state when the deletion operation was interrupted by unclean shutdown. This could result in the "cannot open file ...: no such file or directory" error on the next start. The probability of this error was quite high when NFS or EFS was used as persistent storage for VictoriaMetrics data. See this issue.
- FEATURE: set the start arg to end - 5 minutes if it isn't passed explicitly to /api/v1/labels and /api/v1/label/.../values. See this pull request.
- FEATURE: allow defining the minimum TLS version to use when accepting https requests to VictoriaMetrics components if the -tls command-line flag is set. The minimum TLS version can be set via the -tlsMinVersion command-line flag. See this feature request.
- FEATURE: vmctl: add vm-native-step-interval command line flag for vm-native mode. The new option allows splitting the import process into chunks by time interval. This helps migrating data sets with high churn rate and provides better control over the process. See feature request.
- FEATURE: vmui: add top queries tab, which shows various stats for recently executed queries. See these docs and this feature request.
- FEATURE: vmui: move the "Execute Query" and "Add Query" buttons below the query fields, change icon for remove query. See this issue.
-
FEATURE: vmui: set the maximum number of queries to 4, remove multiple Y-axes in favor of a single one for all queries, and use dotted lines to distinguish queries in the graph. See this issue.
- FEATURE: vmalert: add debug mode to the alerting rule settings for printing additional information into logs during evaluation. See the debug param in alerting rule config.
- FEATURE: vmalert: add experimental feature for displaying last 10 states of the rule (recording or alerting) evaluation. The state is available on the Rule page, which can be opened by clicking on the Details link next to Rule's name on the /groups page.
- FEATURE: vmalert: allow using extra labels in annotations. See this feature request.
- FEATURE: vmalert: allow configuring authorization params per list of targets in vmalert's notifier config for static_configs. See this issue.
- FEATURE: vmalert: allow using {{$labels}} for templating in the command-line flag -external.alert.source. The change is supposed to provide additional flexibility for generating alert's source link based on labels values.
- FEATURE: vmalert: add vm_account_id and vm_project_id labels to results of alerting and recording rules if -clusterMode is enabled. This improves multitenant support in vmalert.
- FEATURE: vmagent: minimize the time needed for reading large responses from scrape targets in stream parsing mode. This should reduce scrape durations for such targets as kube-state-metrics running in a big Kubernetes cluster.
-
FEATURE: MetricsQL: add sort_by_label_numeric and sort_by_label_numeric_desc functions for numeric sort of input time series by the specified labels. See this feature request.
-
FEATURE: vmbackup and vmrestore: retry GCS operations for up to 3 minutes on temporary failures. See this issue.
-
FEATURE: vmbackup: add support for saving / restoring backups to / from Azure blob storage. See this feature request.
- FEATURE: vmbackupmanager: expose vm_backup_in_flight metric, which can be used for determining which backup types - latest, hourly, daily, weekly or monthly - are currently executed.
- FEATURE: vmgateway: add ability to extract JWT authorization token from non-standard HTTP header by passing it via the -auth.httpHeader command-line flag. See this feature request.
- FEATURE: vmagent: expose the __meta_ec2_region label for ec2_sd_config in the same way as Prometheus 2.39 does.
- FEATURE: vmagent: accept data ingestion requests via paths starting from the /prometheus prefix in the same way as VictoriaMetrics does. For example, vmagent now accepts Prometheus remote_write data via both /api/v1/write and /prometheus/api/v1/write. This simplifies switching between single-node VictoriaMetrics and vmagent.
- FEATURE: vmagent: add external_labels from the global section at -promscrape.config after the relabeling is applied to scraped metrics. This aligns with Prometheus behaviour. Previously the external_labels were added to scrape targets, so they could be modified during relabeling. See this issue.
- FEATURE: vmagent: allow specifying limits per each -remoteWrite.url for on-disk size of pending data via the -remoteWrite.maxDiskUsagePerURL command-line flag. Thanks to @rbizos for the pull request.
- FEATURE: VictoriaMetrics cluster: log a clear error when multiple identical -storageNode command-line flags are passed to vmselect or to vminsert. Previously these components crashed with the cryptic panic "metric ... is already registered" in this case. See this issue.
-
BUGFIX: do not export stale metrics via /federate api after the staleness markers. Previously such metrics were exported with
NaN
values. this could break some setups. See this issue. -
BUGFIX: export infinity numbers as
"Infinity"
strings at /api/v1/export, so they can be parsed by standard JSON parsers. Previously infinity numbers were exported asInf
values, which couldn't be parsed by standard JSON parsers. See this issue. -
BUGFIX: vmauth: properly handle request paths ending with
/
such as/vmui/
. Previouslyvmui
was dropping the trailing/
, which could prevent from usingvmui
viavmauth
. See this issue. -
BUGFIX: vmagent: properly encode query params for aws signed requests, use
%20
instead of+
as api requires. See this issue. -
BUGFIX: vmagent: properly parse relabel config when regex ending with escaped
$
. See this issue. -
BUGFIX: MetricsQL: properly calculate
rate_over_sum(m[d])
assum_over_time(m[d])/d
. Previously thesum_over_time(m[d])
could be improperly divided by smaller thand
time range. See rate_over_sum() docs and this issue. -
BUGFIX: MetricsQL: properly calculate
increase(m[d])
over slow-changing counters with values smaller than 100. Previously increase could return unexpectedly big results in this case. See the related issue and this pull request. -
BUGFIX: MetricsQL: ignore empty series when applying limit_offset. It should improve queries with additional filters by value in expressions like
limit_offset(1,1, foo > 1)
. -
BUGFIX: MetricsQL: properly calculate quantiles_over_time when the lookbehind window contains only a single sample. Previously an empty result was incorrectly returned in this case.
- BUGFIX: vmui: fix `RangeError: Maximum call stack size exceeded` error when the query returns too many data points at `Table` view. See this pull request.
- BUGFIX: vmui: fix workaround for adding more queries via URL. See this issue.
- BUGFIX: vmalert: re-evaluate annotations per each alert evaluation. Previously, annotations were evaluated only on alert's value change. This could result in stale annotations in some cases described in this pull request.
- BUGFIX: prevent from excessive CPU usage when the storage enters read-only mode. The previous fix in v1.81.0 wasn't complete.
- BUGFIX: vmalert: change default value for command-line flag `-datasource.queryStep` from `0s` to `5m`. Param `step` is added by vmalert to every rule evaluation request sent to datasource. Before this change, `step` was equal to group's evaluation interval by default. Param `step` for instant queries defines how far VM can look back for the last written data point. The change is supposed to improve reliability of the rules evaluation when evaluation interval is lower than scraping interval.
- BUGFIX: properly calculate `vm_rows_scanned_per_query` histogram exported at `/metrics` page of `vmselect` and single-node VictoriaMetrics. Previously it could return misleadingly high numbers for rollup functions, which scan only a few samples on the provided lookbehind window in square brackets. For example, `increase(m[1d])` always scans only 2 rows (aka `raw samples`) per each returned time series.
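The `%20` versus `+` distinction from the AWS signed requests fix above can be illustrated with a small Python sketch. It uses only the standard library and is not vmagent's actual code; the `encode_query` helper and the parameter values are made up for the example.

```python
from urllib.parse import quote, quote_plus


def encode_query(params):
    """Encode query params with spaces as %20, sorted by key.

    Illustrative helper only, not vmagent's implementation.
    """
    return "&".join(
        f"{quote(k, safe='-_.~')}={quote(v, safe='-_.~')}"
        for k, v in sorted(params.items())
    )


# Hypothetical parameters used only for illustration.
params = {"Filter.1.Value.1": "my tag value", "Action": "DescribeInstances"}

# Form-style encoding turns spaces into "+".
form_style = "&".join(
    f"{quote_plus(k)}={quote_plus(v)}" for k, v in sorted(params.items())
)
# Strict percent-encoding turns spaces into "%20".
strict_style = encode_query(params)

print(form_style)
print(strict_style)
```

The two strings differ only in how the space is encoded, which is enough for a signature computed over one form to mismatch a request sent in the other.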
v1.81.2
Released at 2022-09-08
- BUGFIX: VictoriaMetrics cluster: properly calculate query results at `vmselect`. See this issue. The issue has been introduced in v1.81.0.
v1.81.1
Released at 2022-09-02
It isn't recommended to use VictoriaMetrics cluster v1.81.1 because of the bug, which may result in incorrect query results. Upgrade to v1.81.2 instead.
- FEATURE: MetricsQL: evaluate `q1`, ..., `qN` in parallel when calculating `union(q1, .., qN)`. Previously union args were evaluated sequentially. This could result in lower than expected performance.
- BUGFIX: VictoriaMetrics cluster: fix potential panic at `vmselect` under high load, which has been introduced in v1.81.0. See this issue.
v1.81.0
It isn't recommended to use VictoriaMetrics cluster v1.81.0 because of the bug, which may result in `vmselect` crashes under high load. Upgrade to v1.81.2 instead.
Released at 2022-08-31
Update note 1: vmalert by default hides values of `-remoteWrite.url`, `-remoteRead.url` and `-datasource.url` in logs and at `http://vmalert:8880/flags` for security reasons. See the corresponding SECURITY change in the Changelog below for additional info.
Update note 2: vmalert by default points alert source url to `/vmalert/alert?...` aka web UI instead of `/vmalert/api/v1/alert?...` aka JSON handler. The old behavior can be achieved by setting `-external.alert.source=vmalert/api/v1/alert?group_id={{.GroupID}}&alert_id={{.AlertID}}` command-line flag.
- SECURITY: vmalert: do not expose `-remoteWrite.url`, `-remoteRead.url` and `-datasource.url` command-line flag values in logs and at `http://vmalert:8880/flags` page by default, since they may contain sensitive data such as auth keys. This aligns `vmalert` behaviour with vmagent, which doesn't expose `-remoteWrite.url` command-line flag value in logs and at `http://vmagent:8429/flags` page by default. Specify `-remoteWrite.showURL`, `-remoteRead.showURL` and `-datasource.showURL` command-line flags for showing values for the corresponding `-*.url` flags in logs. Thanks to @mble for the pull request.
- SECURITY: upgrade base docker image (alpine) from 3.16.1 to 3.16.2. See alpine 3.16.2 release notes.
- FEATURE: return shorter error messages to Grafana and to other clients requesting /api/v1/query and /api/v1/query_range endpoints. This should simplify reading these errors by humans. The long error message with full context is still written to logs.
- FEATURE: add the ability to fine-tune the number of points, which can be generated per each matching time series during subquery evaluation. This can be done with the `-search.maxPointsSubqueryPerTimeseries` command-line flag. See this feature request.
- FEATURE: vmagent: improve the performance for relabeling rules with commonly used regular expressions in `regex` and `if` fields such as `some_string`, `prefix.*`, `prefix.+`, `foo|bar|baz`, `.*foo.*` and `.+foo.+`.
- FEATURE: vmagent: reduce CPU usage when discovering big number of Kubernetes targets with big number of labels and annotations.
- FEATURE: vmagent: add ability to accept multitenant data via OpenTSDB `/api/put` protocol at `/insert/<tenantID>/opentsdb/api/put` http endpoint if multitenant support is enabled at `vmagent`. Thanks to @chengjianyun for the pull request.
- FEATURE: monitoring: expose `vm_hourly_series_limit_max_series`, `vm_hourly_series_limit_current_series`, `vm_daily_series_limit_max_series` and `vm_daily_series_limit_current_series` metrics when `-storage.maxHourlySeries` or `-storage.maxDailySeries` limits are set. This allows alerting when the number of unique series reaches the configured limits. See these docs for details.
- FEATURE: VictoriaMetrics cluster: reduce the amounts of logging at `vmstorage` when `vmselect` connects/disconnects to `vmstorage`.
- FEATURE: VictoriaMetrics cluster: improve performance for heavy queries on systems with many CPU cores.
- FEATURE: vmagent: add ability to use `{{label_name}}` placeholders in the `replacement` option of relabeling rules. This simplifies constructing label values from multiple existing label values. See these docs for details.
- FEATURE: vmagent: generate additional per-target metrics - `scrape_series_limit`, `scrape_series_current` and `scrape_series_limit_samples_dropped` if series limit is set according to these docs. This simplifies alerting on targets with the exceeded series limit. See these docs for details on these metrics.
- FEATURE: vmagent: add support for MX record types in dns_sd_configs in the same way as Prometheus 2.38 does.
- FEATURE: vmagent: add `__meta_kubernetes_service_port_number` meta-label for `role: service` in kubernetes_sd_configs in the same way as Prometheus 2.38 does.
- FEATURE: vmagent: add `__meta_kubernetes_pod_container_image` meta-label for `role: pod` in kubernetes_sd_configs in the same way as Prometheus 2.38 does.
- FEATURE: vmagent: retry HTTP requests after some wait time during service discovery and during target scrapes if the server returns 429 HTTP status code (aka `Too many requests`). See this issue.
- FEATURE: vmui: add a legend in the top right corner for shortcut keys. See this feature request.
- FEATURE: vmalert: add `toTime()` template function in the same way as Prometheus 2.38 does. See these docs.
- FEATURE: vmalert: add `$alertID` and `$groupID` template variables. These variables may be used for templating annotations or `-external.alert.source` command-line flag. See the full list of supported variables here.
- FEATURE: vmalert: add `$activeAt` template variable. See this feature request. See the full list of supported variables here. Thanks to @laixintao for the pull request.
- FEATURE: vmalert: point alert source to vmalert's UI at `/vmalert/alert?...` instead of JSON handler at `/vmalert/api/v1/alert?...`. This improves user experience. The old behavior can be achieved by setting `-external.alert.source=vmalert/api/v1/alert?group_id={{.GroupID}}&alert_id={{.AlertID}}` command-line flag.
- BUGFIX: prevent from excess CPU usage when the storage enters read-only mode.
- BUGFIX: improve performance for requests to /api/v1/labels and /api/v1/label/.../values when the filter in the `match[]` query arg matches small number of time series. The performance for this case has been reduced in v1.78.0. See this and this issues.
- BUGFIX: increase the default limit on the number of concurrent merges for small parts from 8 to 16. This should help resolving potential issues with heavy data ingestion. See this comment from @lukepalmer.
- BUGFIX: MetricsQL: fix panic when incorrect arg is passed as `phi` into histogram_quantiles function. See this issue.
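The `{{label_name}}` placeholders added to the `replacement` option in this release can be pictured with a short Python sketch. This is a rough illustration of the idea, not vmagent's implementation: the `expand_replacement` helper is hypothetical, and expanding a missing label to an empty string is an assumption of the sketch.

```python
import re

# Matches {{label_name}} with optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*([a-zA-Z_][a-zA-Z0-9_]*)\s*\}\}")


def expand_replacement(replacement, labels):
    """Substitute {{label_name}} placeholders with label values.

    Missing labels expand to an empty string (an assumption of this sketch).
    """
    return PLACEHOLDER.sub(lambda m: labels.get(m.group(1), ""), replacement)


# Hypothetical target labels.
labels = {"instance": "host-1", "job": "node"}
result = expand_replacement("{{job}}/{{instance}}", labels)
print(result)
```

A single rule can thus build one label value from several existing labels without chaining multiple relabeling steps.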
v1.80.0
Released at 2022-08-08
- FEATURE: vmalert: allow configuring additional HTTP request headers for `-datasource.url`, `-remoteWrite.url` and `-remoteRead.url` via `-datasource.headers`, `-remoteWrite.headers` and `-remoteRead.headers` command-line flags. Additional HTTP request headers also can be set on group level via `headers` param - see these docs and this issue.
- FEATURE: MetricsQL: execute left and right sides of certain operations in parallel. For example, `q1 or q2`, `aggr_func(q1) <op> q2`, `q1 <op> aggr_func(q1)`. This may improve query performance if VictoriaMetrics has enough free resources for parallel processing of both sides of the operation. See this feature request.
- FEATURE: vmauth: allow multiple sections with duplicate `username` but with different `password` values at `-auth.config` file.
- FEATURE: add ability to push internal metrics (e.g. metrics exposed at `/metrics` page) to the configured remote storage from all the VictoriaMetrics components. See these docs.
- FEATURE: improve performance for heavy queries over big number of time series on systems with big number of CPU cores. See this issue. Thanks to @zqyzyq for the idea.
- FEATURE: improve performance for registering new time series in `indexdb` by up to 50%. Thanks to @ahfuzhang for the issue.
- FEATURE: vmagent: add ability to specify tenantID in target labels. In this case metrics from the given target are routed to the given `__tenant_id__`. See these docs and this feature request.
- FEATURE: vmagent: add service discovery for Yandex Cloud. See these docs and this feature request.
- FEATURE: vmui: zoom in the graph by selecting the needed time range in the same way Grafana does. Hold `ctrl` (or `cmd` on macOS) in order to move the graph to the left/right. Hold `ctrl` (or `cmd` on macOS) and scroll up/down in order to zoom in/out the area under the cursor. See this feature request.
- BUGFIX: VictoriaMetrics cluster: fix potential panic in multi-level cluster setup when top-level `vmselect` is configured with `-replicationFactor` bigger than 1. See this issue.
- BUGFIX: vmagent: properly handle custom `endpoint` value in ec2_sd_configs. It was ignored since v1.77.0 because of a bug in the implementation of this feature request. See this issue.
- BUGFIX: vmagent: add missing `__meta_kubernetes_ingress_class_name` meta-label for `role: ingress` service discovery in Kubernetes. See this commit from Prometheus.
- BUGFIX: vmagent: allow stale responses from Consul service discovery (aka consul_sd_configs) by default in the same way as Prometheus does. This should reduce load on Consul when discovering big number of targets. Stale responses can be disabled by specifying `allow_stale: false` option in `consul_sd_config`. See this issue.
- BUGFIX: vmagent: dockerswarm_sd_configs: properly set `__meta_dockerswarm_container_label_*` labels instead of `__meta_dockerswarm_task_label_*` labels as Prometheus does. See this issue.
- BUGFIX: vmagent: set `up` metric to `0` for partial scrapes in stream parsing mode. Previously the `up` metric was set to `1` when at least a single metric has been scraped before the error. This aligns the behaviour of `vmagent` with Prometheus.
- BUGFIX: vmagent: restart all the scrape jobs during config reload after `global` section is changed inside `-promscrape.config`. See this issue.
- BUGFIX: vmagent: properly assume role with AWS ECS credentials. See this issue. Thanks to @transacid for the fix.
- BUGFIX: vmagent: do not split regex in relabeling rules into multiple lines if it contains groups. This fixes this issue.
- BUGFIX: MetricsQL: return series from `q1` if `q2` doesn't return matching time series in the query `q1 ifnot q2`. Previously series from `q1` weren't returned in this case.
- BUGFIX: vmui: properly show date picker at `Table` tab. See this issue.
- BUGFIX: properly generate http redirects if `-http.pathPrefix` command-line flag is set. See this issue.
v1.79.14
Released at 2023-08-12
v1.79.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.79.x line will be supported for at least 12 months since v1.79.0 release
- SECURITY: upgrade Go builder from Go1.20.4 to Go1.21.0.
- SECURITY: upgrade base docker image (Alpine) from 3.18.2 to 3.18.3. See alpine 3.18.3 release notes.
- BUGFIX: vmagent: properly apply `if` filters during relabeling. Previously the `if` filter could work improperly. See this issue and this pull request.
- BUGFIX: vmalert: properly form path to static assets in web UI if `http.pathPrefix` is set. See this issue.
- BUGFIX: vmalert: properly set datasource query params. See this issue. Thanks to @gsakun for the pull request.
- BUGFIX: vmalert: properly return empty slices instead of nil for `/api/v1/rules` and `/api/v1/alerts` API handlers. See this issue.
- BUGFIX: vmalert: properly return empty slices instead of nil for `/api/v1/rules` for groups with present name but absent `rules`. See this issue.
v1.79.13
Released at 2023-05-18
v1.79.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.79.x line will be supported for at least 12 months since v1.79.0 release
- SECURITY: upgrade Go builder from Go1.20.3 to Go1.20.4. See the list of issues addressed in Go1.20.4.
- SECURITY: upgrade base docker image (alpine) from 3.17.3 to 3.18.0. See alpine 3.18.0 release notes.
- SECURITY: serve `/robots.txt` content to disallow indexing of the exposed instances by search engines. See this issue for details.
v1.79.12
Released at 2023-04-06
v1.79.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.79.x line will be supported for at least 12 months since v1.79.0 release
- SECURITY: upgrade base docker image (alpine) from 3.17.2 to 3.17.3. See alpine 3.17.3 release notes.
- SECURITY: upgrade Go builder from Go1.20.2 to Go1.20.3. See the list of issues addressed in Go1.20.3.
- BUGFIX: vmagent: fix CPU and memory usage spikes when files pointed by file_sd_config cannot be re-read. See this issue.
- BUGFIX: prevent unexpected merges on start-up when `-storage.minFreeDiskSpaceBytes` is set. See the issue.
- BUGFIX: verify response code when fetching configuration files via HTTP. See this issue.
v1.79.11
Released at 2023-03-12
v1.79.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.79.x line will be supported for at least 12 months since v1.79.0 release
- SECURITY: upgrade Go builder from Go1.20.1 to Go1.20.2. See the list of issues addressed in Go1.20.2.
- BUGFIX: fix a bug, which could lead to incomplete or empty results for heavy queries selecting tens of thousands of time series. See this pull request.
- BUGFIX: VictoriaMetrics cluster: properly take into account `-rpc.disableCompression` command-line flag at `vmstorage`. It was ignored since v1.78.0. See this pull request.
- BUGFIX: prevent from possible `SIGBUS` crash on ARM architectures (Raspberry Pi), which deny unaligned access to 8-byte words. Thanks to @oliverpool for narrowing down the issue and for the initial attempt to fix it.
v1.79.10
Released at 2023-02-27
v1.79.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.79.x line will be supported for at least 12 months since v1.79.0 release
- BUGFIX: prevent from high CPU usage on the first UTC hour of the data. The issue has been introduced in v1.79.5 when fixing this issue.
v1.79.9
Released at 2023-02-24
v1.79.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.79.x line will be supported for at least 12 months since v1.79.0 release
- SECURITY: upgrade base docker image (alpine) from 3.17.1 to 3.17.2. See alpine 3.17.2 release notes.
- SECURITY: upgrade Go builder from Go1.20.0 to Go1.20.1. See the list of issues addressed in Go1.20.1.
- BUGFIX: properly parse timestamps in milliseconds when ingesting data via OpenTSDB telnet put protocol. Previously timestamps in milliseconds were mistakenly multiplied by 1000. Thanks to @Droxenator for the pull request.
v1.79.8
Released at 2023-02-03
v1.79.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.79.x line will be supported for at least 12 months since v1.79.0 release
- BUGFIX: fix a bug, which could prevent background merges for the previous partitions until restart if the storage didn't have enough disk space for final deduplication and down-sampling.
- BUGFIX: vmagent: update API version for ec2_sd_configs to fix the issue with missing `__meta_ec2_availability_zone_id` attribute.
- BUGFIX: VictoriaMetrics cluster: fix panic on top-level vmselect nodes of multi-level setup when the `-replicationFactor` flag is set and request contains `trace` query parameter. See this issue.
- BUGFIX: vmagent: dockerswarm_sd_configs: apply `filters` only to objects of the specified `role`. Previously filters were applied to all the objects, which could cause errors when different types of objects were used with filters that were not compatible with them. See this issue.
- BUGFIX: vmagent: suppress all the scrape errors when `-promscrape.suppressScrapeErrors` is enabled. Previously some scrape errors were logged even if `-promscrape.suppressScrapeErrors` flag was set.
- BUGFIX: vmagent: consistently put the scrape url with scrape target labels to all error logs for failed scrapes. Previously some failed scrapes were logged without this information.
- BUGFIX: MetricsQL: properly parse `M` and `Mi` suffixes as `1e6` multipliers in `1M` and `1Mi` numeric constants. See this issue. The issue has been introduced in v1.79.7.
v1.79.7
Released at 2023-01-10
v1.79.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.79.x line will be supported for at least 12 months since v1.79.0 release
- BUGFIX: properly parse floating-point numbers without integer or fractional parts such as `.123` and `20.` during data import. See this issue.
- BUGFIX: MetricsQL: properly parse durations with uppercase suffixes such as `10S`, `5MS`, `1W`, etc. See this issue.
- BUGFIX: vmagent: dockerswarm_sd_configs: properly encode `filters` field. See this issue.
- BUGFIX: allow specifying values bigger than 2GiB to the following command-line flag values on 32-bit architectures (`386` and `arm`): `-storage.minFreeDiskSpaceBytes` and `-remoteWrite.maxDiskUsagePerURL`. Previously values bigger than 2GiB were incorrectly truncated on these architectures.
- BUGFIX: VictoriaMetrics enterprise: expose proper values for `vm_downsampling_partitions_scheduled` and `vm_downsampling_partitions_scheduled_size_bytes` metrics, which were added at v1.78.0. See this feature request.
- BUGFIX: DataDog protocol parser: do not re-use `host` and `device` fields from the previously parsed messages if these fields are missing in the currently parsed message. See this issue.
v1.79.6
Released at 2022-12-11
v1.79.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.79.x line will be supported for at least 12 months since v1.79.0 release
- SECURITY: update Go builder from v1.19.3 to v1.19.4. See the changelog.
- SECURITY: update base Docker image for VictoriaMetrics components from Alpine 3.16.2 to Alpine v3.17.0. See the changelog.
- BUGFIX: vmagent: fix the `The request did not have a subscription or a valid tenant level resource provider` error when discovering Azure targets with azure_sd_configs. See this issue.
- BUGFIX: vmagent: properly discover GCE zones when `filter` option is set at gce_sd_configs. See this issue.
- BUGFIX: vmalert: properly specify rule evaluation step during the replay mode. The `step` value was previously overridden by `-datasource.queryStep` command-line flag.
- BUGFIX: vmalert: properly return the error message from remote-write failures. Before, the error was ignored and only `vmalert_remotewrite_errors_total` was incremented.
- BUGFIX: MetricsQL: properly return an empty result from limit_offset if the `offset` arg exceeds the number of inner time series. See this issue.
v1.79.5
Released at 2022-11-10
v1.79.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.79.x line will be supported for at least 12 months since v1.79.0 release
Update note 1: vmalert: the `crlfEscape` template function becomes obsolete starting from this release. It can be safely removed from alerting templates, since `\n` chars are properly escaped with other `*Escape` functions now. See this and this issue for details.
- SECURITY: update Go builder to v1.19.3. This fixes CVE-2022 security issue. See the changelog.
- BUGFIX: properly register new time series in per-day inverted index if they were ingested during the last 10 seconds of the day. See this issue. Thanks to @lmarszal for the bugreport and for the initial fix.
- BUGFIX: properly accept OpenTSDB telnet put lines without tags without the need to specify the trailing whitespace. See this issue.
- BUGFIX: MetricsQL: properly merge buckets with identical `le` values, but with different string representation of these values when calculating histogram_quantile and histogram_share. For example, `http_request_duration_seconds_bucket{le="5"}` and `http_request_duration_seconds_bucket{le="5.0"}`. Such buckets may be returned from distinct targets. Thanks to @647-coder for the pull request.
- BUGFIX: vmalert: change severity level for log messages about failed attempts for sending data to remote storage from `error` to `warn`. The message about all failed send attempts remains at `error` severity level.
- BUGFIX: vmalert: properly escape string passed to `quotesEscape` template function, so it can be safely embedded into JSON string. This makes the `crlfEscape` function obsolete. See this and this issue.
- BUGFIX: `vmselect`: expose missing metric `vm_cache_size_max_bytes{type="promql/rollupResult"}`. This metric is used for monitoring rollup cache usage with the query `vm_cache_size_bytes{type="promql/rollupResult"} / vm_cache_size_max_bytes{type="promql/rollupResult"}` in the same way as this is done for other cache types.
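The `le` merging fix above comes down to comparing bucket boundaries as numbers instead of strings. Below is a minimal Python sketch of that idea. It is not the MetricsQL implementation: `merge_buckets` is a hypothetical helper, and summing the counts is an assumption that mirrors aggregating buckets from several targets.

```python
from collections import defaultdict


def merge_buckets(buckets):
    """Sum bucket counts whose `le` strings parse to the same number."""
    merged = defaultdict(float)
    for le, count in buckets:
        # "5" and "5.0" both parse to 5.0; "+Inf" parses to infinity.
        merged[float(le)] += count
    return dict(sorted(merged.items()))


# Hypothetical (le, count) pairs as they could arrive from distinct targets.
buckets = [("5", 10.0), ("5.0", 7.0), ("+Inf", 20.0)]
merged = merge_buckets(buckets)
print(merged)
```

Keying on the raw strings would leave `"5"` and `"5.0"` as two separate buckets, which skews any quantile estimated from them.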
v1.79.4
Released at 2022-10-07
v1.79.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.79.x line will be supported for at least 12 months since v1.79.0 release
Update note 1: vmalert changes default value for command-line flag `-datasource.queryStep` from `0s` to `5m`. The change is supposed to improve reliability of the rules evaluation when evaluation interval is lower than scraping interval.
- FEATURE: expose `vmagent_remotewrite_queues` metric and use it in alerting rules in order to improve the detection of remote storage connection saturation. See this pull request.
- BUGFIX: do not export stale metrics via /federate api after the staleness markers. Previously such metrics were exported with `NaN` values. This could break some setups. See this issue.
- BUGFIX: vmauth: properly handle request paths ending with `/` such as `/vmui/`. Previously `vmui` was dropping the trailing `/`, which could prevent using `vmui` via `vmauth`. See this issue.
- BUGFIX: vmagent: properly encode query params for AWS signed requests, use `%20` instead of `+` as the API requires. See this issue.
- BUGFIX: MetricsQL: properly calculate `rate_over_sum(m[d])` as `sum_over_time(m[d])/d`. Previously the `sum_over_time(m[d])` could be improperly divided by a time range smaller than `d`. See rate_over_sum() docs and this issue.
- BUGFIX: MetricsQL: properly calculate `increase(m[d])` over slow-changing counters with values smaller than 100. Previously increase could return unexpectedly big results in this case. See the related issue and this pull request.
- BUGFIX: MetricsQL: ignore empty series when applying limit_offset. It should improve queries with additional filters by value in expressions like `limit_offset(1,1, foo > 1)`.
- BUGFIX: MetricsQL: properly calculate quantiles_over_time when the lookbehind window contains only a single sample. Previously an empty result was incorrectly returned in this case.
- BUGFIX: vmui: fix `RangeError: Maximum call stack size exceeded` error when the query returns too many data points at `Table` view. See this pull request.
- BUGFIX: vmalert: re-evaluate annotations per each alert evaluation. Previously, annotations were evaluated only on alert's value change. This could result in stale annotations in some cases described in this pull request.
- BUGFIX: prevent from excessive CPU usage when the storage enters read-only mode. The previous fix in v1.79.3 wasn't complete.
- BUGFIX: vmalert: change default value for command-line flag `-datasource.queryStep` from `0s` to `5m`. Param `step` is added by vmalert to every rule evaluation request sent to datasource. Before this change, `step` was equal to group's evaluation interval by default. Param `step` for instant queries defines how far VM can look back for the last written data point. The change is supposed to improve reliability of the rules evaluation when evaluation interval is lower than scraping interval.
- BUGFIX: properly calculate `vm_rows_scanned_per_query` histogram exported at `/metrics` page of `vmselect` and single-node VictoriaMetrics. Previously it could return misleadingly high numbers for rollup functions, which scan only a few samples on the provided lookbehind window in square brackets. For example, `increase(m[1d])` always scans only 2 rows (aka `raw samples`) per each returned time series.
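The `rate_over_sum(m[d])` fix in this release is easiest to see with numbers. The Python sketch below follows the definition given in the changelog entry, `sum_over_time(m[d])/d`. The function name mirrors MetricsQL, but the code and the sample values are illustrative assumptions, not the actual implementation.

```python
def rate_over_sum(samples, window_seconds):
    """Return the sum of sample values divided by the full lookbehind window d."""
    return sum(value for _, value in samples) / window_seconds


# Hypothetical (timestamp_seconds, value) samples that cover only the
# last 30 seconds of a 60-second lookbehind window.
samples = [(30, 4.0), (45, 6.0), (60, 2.0)]

# Divide by the requested window d = 60s.
result = rate_over_sum(samples, 60)

# Dividing by the 30s actually spanned by the samples, as in the old
# behaviour described above, doubles the result.
too_high = sum(v for _, v in samples) / (samples[-1][0] - samples[0][0])

print(result, too_high)
```

The gap between the two values grows as the samples cover a smaller share of the window.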
v1.79.3
Released at 2022-08-30
v1.79.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.79.x line will be supported for at least 12 months since v1.79.0 release
- SECURITY: vmalert: do not expose `-remoteWrite.url`, `-remoteRead.url` and `-datasource.url` command-line flag values in logs and at `http://vmalert:8880/flags` page by default, since they may contain sensitive data such as auth keys. This aligns `vmalert` behaviour with vmagent, which doesn't expose `-remoteWrite.url` command-line flag value in logs and at `http://vmagent:8429/flags` page by default. Specify `-remoteWrite.showURL`, `-remoteRead.showURL` and `-datasource.showURL` command-line flags for showing values for the corresponding `-*.url` flags in logs. Thanks to @mble for the pull request.
- SECURITY: upgrade base docker image (alpine) from 3.16.1 to 3.16.2. See alpine 3.16.2 release notes.
- BUGFIX: prevent from excess CPU usage when the storage enters read-only mode.
- BUGFIX: improve performance for requests to /api/v1/labels and /api/v1/label/.../values when the filter in the `match[]` query arg matches small number of time series. The performance for this case has been reduced in v1.78.0. See this and this issues.
- BUGFIX: increase the default limit on the number of concurrent merges for small parts from 8 to 16. This should help resolving potential issues with heavy data ingestion. See this comment from @lukepalmer.
- BUGFIX: MetricsQL: fix panic when incorrect arg is passed as `phi` into histogram_quantiles function. See this issue.
v1.79.2
Released at 2022-08-08
v1.79.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.79.x line will be supported for at least 12 months since v1.79.0 release
- BUGFIX: VictoriaMetrics cluster: fix potential panic in multi-level cluster setup when top-level `vmselect` is configured with `-replicationFactor` bigger than 1. See this issue.
- BUGFIX: vmagent: properly handle custom `endpoint` value in ec2_sd_configs. It was ignored since v1.77.0 because of a bug in the implementation of this feature request.
- BUGFIX: vmagent: add missing `__meta_kubernetes_ingress_class_name` meta-label for `role: ingress` service discovery in Kubernetes. See this commit from Prometheus.
- BUGFIX: vmagent: allow stale responses from Consul service discovery (aka consul_sd_configs) by default in the same way as Prometheus does. This should reduce load on Consul when discovering big number of targets. Stale responses can be disabled by specifying `allow_stale: false` option in `consul_sd_config`. See this issue.
- BUGFIX: vmagent: dockerswarm_sd_configs: properly set `__meta_dockerswarm_container_label_*` labels instead of `__meta_dockerswarm_task_label_*` labels as Prometheus does. See this issue.
- BUGFIX: vmagent: set `up` metric to `0` for partial scrapes in stream parsing mode. Previously the `up` metric was set to `1` when at least a single metric has been scraped before the error. This aligns the behaviour of `vmagent` with Prometheus.
- BUGFIX: vmagent: restart all the scrape jobs during config reload after `global` section is changed inside `-promscrape.config`. See this issue.
- BUGFIX: vmagent: properly assume role with AWS ECS credentials. See this issue. Thanks to @transacid for the fix.
- BUGFIX: vmagent: do not split regex in relabeling rules into multiple lines if it contains groups. This fixes this issue.
- BUGFIX: MetricsQL: return series from `q1` if `q2` doesn't return matching time series in the query `q1 ifnot q2`. Previously series from `q1` weren't returned in this case.
- BUGFIX: vmui: properly show date picker at `Table` tab. See this issue.
- BUGFIX: properly generate http redirects if `-http.pathPrefix` command-line flag is set. See this issue.
v1.79.1
Released at 2022-08-02
v1.79.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.79.x line will be supported for at least 12 months since v1.79.0 release
- SECURITY: upgrade base docker image (alpine) from 3.16.0 to 3.16.1 . See alpine 3.16.1 release notes.
v1.79.0
Released at 2022-07-14
v1.79.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.79.x line will be supported for at least 12 months since v1.79.0 release
Update note 1: this release introduces backwards-incompatible changes to `vm_partial_results_total` metric by changing its labels to be consistent with `vm_requests_total` metric. If you use alerting rules or Grafana dashboards, which rely on this metric, then they must be updated. The official dashboards for VictoriaMetrics don't use this metric.
Update note 2: vmalert adds `/vmalert/` prefix to web urls according to this issue. This may affect `vmalert` instances with non-empty `-http.pathPrefix` command-line flag. After the update, configuring this flag is no longer needed. Here's why.
Update note 3: this release introduces backwards-incompatible changes to communication protocol between `vmselect` and `vmstorage` nodes in cluster version of VictoriaMetrics because of added ability to query `vmselect` data from other `vmselect` nodes - see these docs, so read requests to `vmselect` will fail until the upgrade is complete. These errors will stop after all the `vmselect` and `vmstorage` nodes are updated to the new release. It is safe to downgrade to previous releases at any time.
Update note 4: this release removes support for the `extra_filter_labels` param, deprecated in 1.70.0, from vmalert's groups definition. This deprecated param was replaced with `params`.
Update note 5: this release changes naming for published linux binaries at releases. Now names for binaries for all the supported platforms match the following template - `$(APP_NAME)-$(GOOS)-$(GOARCH)-$(VERSION).tar.gz`. For example, `victoria-metrics-linux-amd64-v1.79.0.tar.gz`. Previously linux binaries didn't have `$(GOOS)` part, e.g. they had the name `victoria-metrics-amd64-v1.79.0.tar.gz`. Please update automation scripts for upgrading VictoriaMetrics releases according to this change.
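For automation scripts affected by update note 5, the naming template can be sketched as follows. This is only an illustration of the template quoted above; the function names are hypothetical and are not part of any VictoriaMetrics tooling.

```python
def release_archive_name(app_name: str, goos: str, goarch: str, version: str) -> str:
    """Build a name matching $(APP_NAME)-$(GOOS)-$(GOARCH)-$(VERSION).tar.gz."""
    return f"{app_name}-{goos}-{goarch}-{version}.tar.gz"


def legacy_linux_archive_name(app_name: str, goarch: str, version: str) -> str:
    """Build the older Linux name, which lacked the $(GOOS) part."""
    return f"{app_name}-{goarch}-{version}.tar.gz"


# New-style name for the example given in the update note.
print(release_archive_name("victoria-metrics", "linux", "amd64", "v1.79.0"))
# prints victoria-metrics-linux-amd64-v1.79.0.tar.gz
```

A script that must download both older and newer releases could try the new-style name first and fall back to the legacy one for Linux archives.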
- FEATURE: vmagent: add azure_sd_configs service discovery mechanism. It allows discovering Virtual Machines at Azure Cloud. See this issue.
- FEATURE: vmalert: deprecate alert's status link `/api/v1/<groupID>/<alertID>/status` in favour of `api/v1/alert?group_id=<group_id>&alert_id=<alert_id>`. The old alert's status link is still supported, but will be removed in future releases. See this issue.
- FEATURE: cluster version of VictoriaMetrics: add support for querying lower-level `vmselect` nodes from upper-level `vmselect` nodes. This makes it possible to build multi-level cluster setups for global querying view and HA purposes without the need to use Promxy. See these docs and this issue.
- FEATURE: add `-search.setLookbackToStep` command-line flag, which enables InfluxDB-like gap filling during querying. See these docs for details.
- FEATURE: vmui: add a UI for query tracing. It can be enabled by clicking `trace query` checkbox and re-running the query. See this feature request.
- FEATURE: vmagent: add `-remoteWrite.headers` command-line option for specifying optional HTTP headers to send to the configured `-remoteWrite.url`. For example, `-remoteWrite.headers='Foo:Bar^^Baz:x'` would send `Foo: Bar` and `Baz: x` HTTP headers with every request to `-remoteWrite.url`. See this feature request.
- FEATURE: vmagent: push per-target `scrape_samples_limit` metric to the configured `-remoteWrite.url` if `sample_limit` option is set for this target in scrape_configs. See this feature request.
- FEATURE: vmagent: attach node-level labels to kubernetes_sd_config targets if `attach_metadata: {"node": true}` is set for `role: endpoints` and `role: endpointslice`. This is a feature backport from Prometheus 2.37 - see this pull request.
- FEATURE: vmagent: add ability to specify additional HTTP headers to send to scrape targets via `headers` section in `scrape_configs`. This can be used when the scrape target requires custom authorization and authentication like in this stackoverflow question. For example, the following config instructs sending `My-Auth: top-secret` and `TenantID: FooBar` headers with each request to `http://host123:8080/metrics`:
scrape_configs:
- job_name: foo
  headers:
  - "My-Auth: top-secret"
  - "TenantID: FooBar"
  static_configs:
  - targets: ["host123:8080"]
- FEATURE: add ability to pass `limit` query arg to `api/v1/series` endpoint. This can be used if only a sample of up to `limit` series must be returned from the endpoint. See this feature request and these docs.
- FEATURE: query tracing: show timestamps in query traces in human-readable format (aka `RFC3339` in UTC timezone) instead of milliseconds since Unix epoch. For example, `2022-06-27T10:32:54.506Z` instead of `1656325974506`. This improves traces' readability.
- FEATURE: improve performance of /api/v1/series requests, which return big number of time series.
- FEATURE: VictoriaMetrics cluster: improve query performance when replication is enabled.
- FEATURE: MetricsQL: properly handle partial counter resets in remove_resets function. Now `remove_resets(sum(m))` should return the expected increasing line when some time series matching `m` disappear on the selected time range. Previously such a query would return horizontal line after the disappeared series.
- FEATURE: expose `vm_next_retention_seconds` metric at `http://victoriametrics:8428/metrics`, which shows the number of seconds left until the next `indexdb` rotation. Thanks to @guidao for the pull request.
- FEATURE: expose additional histogram metrics at `http://victoriametrics:8428/metrics`, which may help understanding query workload:
  - `vm_rows_read_per_query` - the number of raw samples read per query.
  - `vm_rows_scanned_per_query` - the number of raw samples scanned per query. This number can exceed `vm_rows_read_per_query` if `step` query arg passed to /api/v1/query_range is smaller than the lookbehind window set in square brackets of rollup function. For example, if `increase(some_metric[1h])` is executed with the `step=5m`, then the same raw samples on an hour time range are scanned `1h/5m=12` times. See this article for details.
  - `vm_rows_read_per_series` - the number of raw samples read per queried series.
  - `vm_series_read_per_query` - the number of series read per query.
- FEATURE: publish binaries for FreeBSD and OpenBSD at releases page.
- FEATURE: vmui: allow selecting the needed columns at table view. This functionality may help when the selected time series contain many different labels. See this feature request and this pull request.
- BUGFIX: consistently name binaries at releases page in the form `$(APP_NAME)-$(GOOS)-$(GOARCH)-$(VERSION).tar.gz`. For example, `victoria-metrics-linux-amd64-v1.79.0.tar.gz`. Previously the `$(GOOS)` part was missing in binaries for Linux.
- BUGFIX: vmalert: allow using `__name__` label (aka metric name) in alerting annotations. For example:
{{ $labels.__name__ }}: Too high connection number for "{{ $labels.instance }}
- BUGFIX: limit max memory occupied by the cache, which stores parsed regular expressions. Previously too long regular expressions passed in MetricsQL queries could result in big amounts of used memory (e.g. multiple gigabytes). Now the max cache size for parsed regexps is limited to a few megabytes.
- BUGFIX: MetricsQL: properly handle partial counter resets when calculating rate, irate and increase functions. Previously these functions could return zero values after partial counter resets until the counter increases to the last value before partial counter reset. See this issue.
- BUGFIX: MetricsQL: properly calculate histogram_quantile over Prometheus buckets with unexpected values. See this issue.
- BUGFIX: MetricsQL: properly evaluate timezone_offset function over time range covering time zone offset switches. See this issue.
- BUGFIX: vmagent: properly add service-level labels (`__meta_kubernetes_service_*`) to discovered targets for `role: endpointslice` in kubernetes_sd_config. Previously these labels were missing. See this issue.
- BUGFIX: vmagent: make sure that stale markers are generated with the actual timestamp when unsuccessful scrape occurs. This should prevent possible time series overlap on scrape target restart in dynamic environments such as Kubernetes.
- BUGFIX: vmagent: properly reload changed `-promscrape.config` file when `-promscrape.configCheckInterval` option is set. The changed config file wasn't reloaded in this case since v1.69.0. See this pull request. Thanks to @ttyv for the fix.
- BUGFIX: vmagent: properly set `Host` header during target scraping when `proxy_url` is set to http proxy. Previously the `Host` header was set to the proxy hostname instead of the target hostname. See this issue.
- BUGFIX: VictoriaMetrics cluster: assume that the response is complete if `-search.denyPartialResponse` is enabled and up to `-replicationFactor - 1` `vmstorage` nodes are unavailable. See this issue.
- BUGFIX: vmselect: update `vm_partial_results_total` metric labels to be consistent with `vm_requests_total` labels.
- BUGFIX: accept tags without values when reading data in DataDog format. Thanks to @PerGon for the pull request.
- BUGFIX: vmui: properly pass the end of the selected time range to `time` query arg to /api/v1/query when displaying the requested data in JSON and table views. Previously the `time` query arg wasn't set, so `/api/v1/query` was always returning query results for the current time regardless of the selected time range. See this issue.
- BUGFIX: vmui: allow clicking on the suggestion from autocomplete list. See this issue.
- BUGFIX: vmui: apply the selected time range in date picker only after clicking the `Apply` button. See this issue.
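The query tracing entry in the v1.79.0 list above replaces millisecond Unix timestamps with RFC3339 timestamps in UTC. For readers who still need to convert older traces, the mapping can be sketched like this. It is a minimal illustration, not the code VictoriaMetrics uses, and the function name is hypothetical.

```python
from datetime import datetime, timezone


def millis_to_rfc3339(ms: int) -> str:
    """Convert milliseconds since Unix epoch to an RFC3339 UTC timestamp."""
    # Split with integer arithmetic to avoid floating-point rounding of millis.
    seconds, millis = divmod(ms, 1000)
    dt = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%S") + f".{millis:03d}Z"


# The example pair quoted in the changelog entry.
print(millis_to_rfc3339(1656325974506))  # prints 2022-06-27T10:32:54.506Z
```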
v1.78.1
Released at 2022-07-08
Update notes: it is recommended clearing caches after the upgrade from v1.78.0 in order to immediately fix the issue for newly ingested data. Otherwise the issue may exist for newly ingested data for up to a day after the upgrade.
- BUGFIX: properly register time series in per-day inverted index. Previously some series could miss registration in the per-day inverted index. This could result in missing time series during querying. The issue has been introduced in v1.78.0. See this and this issues.
v1.78.0
Released at 2022-06-20
Warning (2022-07-03): VictoriaMetrics v1.78.0 contains a bug, which may result in missing time series during queries. It is recommended upgrading to v1.78.1, which fixes the bug.
Update notes: this release introduces backwards-incompatible changes to communication protocol between `vmselect` and `vmstorage` nodes in cluster version of VictoriaMetrics because of added query tracing, so read requests to `vmselect` will fail until the upgrade is complete. These errors will stop after all the `vmselect` and `vmstorage` nodes are updated to the new release. It is safe to downgrade to previous releases.
- SECURITY: add `-flagsAuthKey` command-line flag for protecting `/flags` endpoint from unauthorized access. Though this endpoint already hides values for command-line flags with `key` and `password` substrings in their names, other sensitive information could be exposed there. See this issue.
- FEATURE: support query tracing, which allows determining bottlenecks during query processing. See these docs and this feature request.
- FEATURE: vmui: add `cardinality` tab, which can help identifying the source of high cardinality and high churn rate issues. See this and this feature requests and these docs.
- FEATURE: vmui: small UX enhancements according to this feature request.
- FEATURE: allow overriding default limits for in-memory cache `indexdb/tagFilters` via flag `-storage.cacheSizeIndexDBTagFilters`. See this issue.
- FEATURE: add support of `lowercase` and `uppercase` relabeling actions in the same way as Prometheus 2.36.0 does. See this issue.
- FEATURE: add ability to change the `indexdb` rotation timezone offset via `-retentionTimezoneOffset` command-line flag. Previously it was performed at 4am UTC time. This could lead to performance degradation in the middle of the day when VictoriaMetrics runs in time zones located too far from UTC. Thanks to @cnych for the pull request.
- FEATURE: limit the number of background merge threads on systems with big number of CPU cores by default. This increases the max size of parts, which can be created during background merge when `-storageDataPath` directory has limited free disk space. This may improve on-disk data compression efficiency and query performance. The limits can be tuned if needed with `-smallMergeConcurrency` and `-bigMergeConcurrency` command-line flags. See this pull request.
- FEATURE: accept optional `limit` query arg at /api/v1/labels and /api/v1/label/.../values for limiting the number of sample entries returned from these endpoints. See these docs.
- FEATURE: optimize performance for /api/v1/labels and /api/v1/label/.../values endpoints when `match[]`, `extra_label` or `extra_filters[]` query args are passed to these endpoints. This should help with this issue.
- FEATURE: vmalert: support `limit` param per-group for limiting number of produced samples per each rule. Thanks to @Howie59 for implementation.
- FEATURE: vmalert: remove dependency on Internet access at web API pages. Previously the functionality and the layout of these pages was broken without Internet access. See this issue.
- FEATURE: vmalert: send alerts to the configured notifiers in parallel. Previously alerts were sent to notifiers sequentially. This could delay sending pending alerts when notifier blocks on the currently sent alert.
- FEATURE: vmagent: implement the `http://vmagent:8429/service-discovery` page in the same way as Prometheus does. This page shows the original labels for all the discovered targets alongside the resulting labels after the relabeling. This simplifies service discovery debugging.
- FEATURE: vmagent: remove dependency on Internet access at `http://vmagent:8429/targets` page. Previously the page layout was broken without Internet access. See this issue.
- FEATURE: vmagent: add support for `kubeconfig_file` option at kubernetes_sd_configs. It may be useful for Kubernetes monitoring by `vmagent` outside Kubernetes cluster. See this issue.
- FEATURE: vmagent: expose `/api/v1/status/config` endpoint in the same way as Prometheus does. See these docs.
- FEATURE: vmagent: add `-promscrape.suppressScrapeErrorsDelay` command-line flag, which can be used for delaying and aggregating the logging of per-target scrape errors. This may reduce the amounts of logs when `vmagent` scrapes many unreliable targets. See this feature request. Thanks to @jelmd for the initial implementation.
- FEATURE: vmagent: add `-promscrape.cluster.name` command-line flag, which allows proper data de-duplication when the same target is scraped from multiple vmagent clusters. See this issue.
- FEATURE: vmagent: add `action: graphite` relabeling rules optimized for extracting labels from Graphite-style metric names. See these docs and this feature request.
- FEATURE: VictoriaMetrics enterprise: expose `vm_downsampling_partitions_scheduled` and `vm_downsampling_partitions_scheduled_size_bytes` metrics, which can be used for tracking the progress of initial downsampling for historical data. See this feature request.
- FEATURE: VictoriaMetrics cluster: do not spend up to 5 seconds when trying to connect to unavailable `vmstorage` nodes. This should improve query latency when some of `vmstorage` nodes aren't available. Expose `vm_tcpdialer_addr_available{addr="..."}` metric at `http://vmselect:8481/metrics` for determining whether the given `addr` is available for establishing new connections. See this comment.
- FEATURE: VictoriaMetrics cluster: add `-vmstorageDialTimeout` command-line flags to `vmselect` and `vminsert` for tuning the maximum duration for connection establishing to `vmstorage` nodes. This should help resolving this issue.
- BUGFIX: support for data ingestion in DataDog format from legacy clients / agents. See this pull request. Thanks to @elProxy for the fix.
- BUGFIX: vmagent: do not expose `vm_promscrape_service_discovery_duration_seconds_bucket` metric for unused service discovery types. This reduces the number of metrics exported at `http://vmagent:8429/metrics`. See this issue.
- BUGFIX: vmalert: properly apply `alert_relabel_configs` relabeling rules to `-notifier.config` according to these docs. Thanks to @spectvtor for the bugfix.
- BUGFIX: vmalert: properly add `Content-Encoding: snappy`, `Content-Type: application/x-protobuf` and `X-Prometheus-Remote-Write-Version: 0.1.0` request headers when `vmalert` sends evaluated recording rules' data to `-remoteWrite.url`. These headers are needed by some remote storage systems in order to properly decode snappy-encoded request body. See this and this pull requests. Thanks to @manji-0 for the fix.
- BUGFIX: deny background merge when the storage enters read-only mode, e.g. when free disk space becomes lower than `-storage.minFreeDiskSpaceBytes`. Background merge needs additional disk space, so it could result in `no space left on device` errors. See this issue.
- BUGFIX: vmui: properly apply the selected time range when auto-refresh is enabled. See this issue.
- BUGFIX: vmui: properly update the url with vmui state when new query is entered. See this issue.
- BUGFIX: Graphite render API: properly calculate sample timestamps when `moving*()` functions such as movingAverage() are applied over summarize().
- BUGFIX: limit the `end` query arg value to `+2 days` in the future at `/api/v1/*` endpoints, because VictoriaMetrics doesn't allow storing samples with timestamps bigger than +2 days in the future. This should help resolving this issue.
- BUGFIX: properly register time series in per-day inverted index during the first hour after `indexdb` rotation. Previously this could lead to missing time series during querying if these time series stopped receiving new samples during the first hour after `indexdb` rotation. See this issue.
- BUGFIX: do not register new series when `-storage.maxHourlySeries` or `-storage.maxDailySeries` limits were reached. Previously samples for new series weren't added to the database when the cardinality limit was reached, but series were still registered in the inverted index (aka `indexdb`). This could lead to unbounded `indexdb` growth during high churn rate.
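The v1.78.0 bugfix that limits the `end` query arg to `+2 days` in the future amounts to a simple clamp. The sketch below illustrates the idea only; the function and constant names are hypothetical, and the actual server-side handling may differ in details such as units and rounding.

```python
import time

# Assumption: timestamps are in seconds; +2 days is the limit named in the changelog.
MAX_FUTURE_SECONDS = 2 * 24 * 3600


def clamp_end(end: float, now: float) -> float:
    """Return `end` limited to at most two days after `now`."""
    return min(end, now + MAX_FUTURE_SECONDS)


now = time.time()
# A request asking for data ten days ahead is cut back to now + 2 days.
print(clamp_end(now + 10 * 24 * 3600, now) - now)
```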
v1.77.2
Released at 2022-05-21
- FEATURE: vmalert: support reusable templates for rules annotations. The path to the template files can be specified via `-rule.templates` flag. See more about this feature here. Thanks to @AndrewChubatiuk for the pull request. See this feature request.
- FEATURE: vmalert: expose `vmalert_iteration_interval_seconds` metric at `http://vmalert:8880/metrics`. This metric shows the configured per-group evaluation interval. See this feature request.
- FEATURE: vmctl: add `influx-prometheus-mode` command-line flag, which allows restoring the original time series written from Prometheus into InfluxDB during data migration from InfluxDB to VictoriaMetrics. See this feature request. Thanks to @mback2k for the pull request.
- FEATURE: vmagent: add ability to specify AWS service name when issuing requests to AWS api. See this feature request. Thanks to @transacid for the pull request.
- BUGFIX: vmagent: fix a bug, which could lead to incomplete discovery of scrape targets in Kubernetes (aka `kubernetes_sd_config`). The bug has been introduced in v1.77.0.
- BUGFIX: vmalert: support `scalar` result type in response. See this issue.
- BUGFIX: vmalert: support strings in `humanize.*` template function in the same way as Prometheus does. See this issue.
- BUGFIX: vmalert: proxy `/rules` requests to vmalert from Grafana's alerting UI. This removes errors in Grafana's UI for Grafana versions older than `8.5.*`. See this issue.
- BUGFIX: vmalert: do not add `/api/v1/query` suffix to `-datasource.url` if `-remoteRead.disablePathAppend` command-line flag is set. Previously this flag was applied only to `-remoteRead.url`, which could confuse users.
- BUGFIX: vmalert: prevent possible resource leak on config update, which could lead to the slowdown of `vmalert` over time. See this pull request.
- BUGFIX: MetricsQL: do not return values from label_value() function if the original time series has no values at the selected timestamps.
- BUGFIX: VictoriaMetrics cluster: limit the number of concurrently established connections from vmselect to vmstorage. This should prevent potentially high spikes in the number of established connections after temporary slowdown in connection handshake procedure between vmselect and vmstorage because of spikes in workload. See this issue.
- BUGFIX: vmctl: fix build for Solaris / SmartOS. See this issue.
v1.77.1
Released at 2022-05-07
- FEATURE: vmagent: add ability to specify filters for Availability Zones in ec2_sd_config via `az_filters` section. This section can contain AZ-specific set of filters in the same way as the existing `filters` section, which is used for filtering EC2 instances. The list of supported AZ-specific filters is available here.
- FEATURE: vmagent: expose `vmagent_remotewrite_global_rows_pushed_before_relabel_total` and `vmagent_remotewrite_rows_pushed_after_relabel_total` metrics at `http://vmagent:8429/metrics`, which can be used for monitoring the rate of rows (aka samples) pushed to remote storage before and after the relabeling via `-remoteWrite.relabelConfig` and `-remoteWrite.urlRelabelConfig`. See relabeling docs for details.
- FEATURE: vmctl: add ability to skip `db` label during InfluxDB data import when `influx-skip-database-label` option is used. See this pull request. Thanks to @mback2k.
- BUGFIX: vmagent: properly process passwords and secrets specified in the file pointed by `-promscrape.config` command-line flag. All the passwords and secrets were mistakenly replaced with `<secret>` string in `v1.77.0`. See this and this issue.
- BUGFIX: vmagent: rename `vmagent_remote_write_rate_limit_reached_total` metric to `vmagent_remotewrite_rate_limit_reached_total`, so its name is consistent with the rest of `vmagent_remotewrite_` metrics.
- BUGFIX: vmagent: rename `promscrape_stale_samples_created_total` metric to `vm_promscrape_stale_samples_created_total`, so its name is consistent with the rest of `vm_promscrape_` metrics.
- BUGFIX: vmctl: properly import InfluxDB measurements if they contain `db` tag. Previously this could result in incomplete import of measurement tags. See this pull request. Thanks to @mback2k for the bugfix.
- BUGFIX: vmui: do not reset the selected relative time range when entering new query. See this issue.
- BUGFIX: vmbackup: disallow writing backups to `-storageDataPath` directory, since this directory is managed solely by VictoriaMetrics or `vmstorage`. Other apps shouldn't write into this directory. See this issue.
- BUGFIX: do not allow setting `-retentionPeriod` smaller than one day, since VictoriaMetrics doesn't properly support such small retention periods. See this issue.
- BUGFIX: VictoriaMetrics cluster: do not drop samples routed to readonly `vmstorage` nodes if `-dropSamplesOnOverload` command-line flag is set. Try re-routing them to healthy `vmstorage` nodes instead. See this issue.
v1.77.0
Released at 2022-05-05
- FEATURE: vmagent: add support for sending data to remote storage with AWS sigv4 authorization. See this feature request.
- FEATURE: vmagent: allow filtering targets by target url and by target labels with time series selector on `http://vmagent:8429/targets` page. This may be useful when `vmagent` scrapes big number of targets. See this feature request.
- FEATURE: vmagent: reduce `-promscrape.config` reload duration when the config contains big number of jobs (aka scrape_configs sections) and only a few of them are changed. Previously all the jobs were restarted. Now only the jobs with changed configs are restarted. This should reduce the probability of data miss because of slow config reload. See this issue.
- FEATURE: vmagent: improve service discovery speed for big number of scrape targets. This should help when `vmagent` discovers big number of targets (e.g. thousands) in Kubernetes cluster. The service discovery speed now should scale with the number of CPU cores available to `vmagent`.
- FEATURE: vmagent: add ability to attach node-level labels and annotations to discovered Kubernetes pod targets in the same way as Prometheus 2.35 does. See this feature request and this pull request.
- FEATURE: vmagent: add support for `tls_config` and `proxy_url` options at `oauth2` section in the same way as Prometheus does. See oauth2 docs.
- FEATURE: vmagent: add support for `min_version` option at `tls_config` section in the same way as Prometheus does. See tls_config docs.
- FEATURE: vmagent: expose `vmagent_remotewrite_rate_limit` metric at `http://vmagent:8429/metrics`, which can be used for alerting rules such as `rate(vmagent_remotewrite_conn_bytes_written_total) / vmagent_remotewrite_rate_limit > 0.8` when `-remoteWrite.rateLimit` command-line flag is set. See this pull request.
- FEATURE: vmalert: add support for DNS-based discovery for notifiers in the same way as Prometheus does (aka `dns_sd_configs`). See these docs and this feature request.
- FEATURE: vmalert: add `-replay.disableProgressBar` command-line flag, which allows disabling progressbar in rules' backfilling mode. See this issue.
- FEATURE: allow specifying TLS cipher suites for incoming https requests via `-tlsCipherSuites` command-line flag. See this feature request.
- FEATURE: allow specifying TLS cipher suites for mTLS connections between cluster components via `-cluster.tlsCipherSuites` command-line flag. See these docs.
- FEATURE: vmstorage: add `-snapshotsMaxAge` command-line flag for automatic removal of snapshots older than the given age.
- FEATURE: vmui: show an empty graph on the selected time range when there is no data on it. Previously `No data to show` placeholder was shown instead of the graph in this case. This prevented zooming and scrolling of such a graph.
- FEATURE: vmui: show the selected `last N minutes/hours/days` in the top right corner. Previously the `start - end` duration was shown instead, which could be hard to interpret. See this feature request.
- FEATURE: vmui: execute the query when `enter` button is pressed in the same way as Prometheus does. Multi-line query can be entered by pressing `shift-enter` in the query input field.
- FEATURE: expose `vm_indexdb_items_added_total` and `vm_indexdb_items_added_size_bytes_total` counters at `/metrics` page, which can be used for monitoring the rate for addition of new entries in `indexdb` (aka `inverted index`) alongside the total size in bytes for the added entries. See this feature request.
- FEATURE: vmctl: show data processing speed during data migration.
- FEATURE: MetricsQL: add `drop_common_labels()` function, which drops common `label="name"` pairs from the passed time series. See these docs.
- FEATURE: MetricsQL: add `tlast_change_over_time(m[d])` function, which returns the timestamp of the last change of `m` on the given lookbehind window `d`. See these docs.
- FEATURE: leave the last raw sample per each `-dedup.minScrapeInterval` discrete interval when the deduplication is enabled. This aligns better with the staleness rules in Prometheus compared to the previous behaviour when the first sample per each `-dedup.minScrapeInterval` was left.
- FEATURE: VictoriaMetrics cluster: add ability to disable peer TLS certificate verification with `-cluster.tlsInsecureSkipVerify` command-line flag. See mTLS docs for details. See this feature request.
- FEATURE: add a handler for `/api/v1/status/buildinfo` endpoint, which is used by Grafana starting from v8.5.0. See this pull request.
- FEATURE: add ability to proxy alerting API requests from Grafana to vmalert by passing `-vmalert.proxyURL` command-line flag to single-node VictoriaMetrics or to `vmselect` at cluster version of VictoriaMetrics. See this issue.
- BUGFIX: export staleness markers as `null` values from JSON export API. Previously they were exported as `NaN` values. This could break the exported JSON parsing, since `NaN` values aren't supported by JSON specification.
- BUGFIX: VictoriaMetrics cluster: close `vmselect->vmstorage` connections if they were idle for more than 30 seconds. Expose `vm_tcpdialer_conns_idle` metric at `http://vmselect:8481/metrics` with the number of idle connections to `vmstorage`. See this issue.
- BUGFIX: vmctl: return non-zero exit code on error. This allows handling `vmctl` errors in shell scripts. Previously `vmctl` was returning 0 exit code on error. See this issue.
- BUGFIX: vmctl: prevent indefinite hang on `Ctrl+C`. See this issue.
- BUGFIX: vmagent: properly show `scrape_timeout` and `scrape_interval` options at `http://vmagent:8429/config` page. Previously these options weren't displayed even if they were set in `-promscrape.config`.
- BUGFIX: vmagent: handle non-standard http redirect status codes, which may be returned by scrape targets, in the same way as Prometheus does. See this issue.
- BUGFIX: vmalert: skip template execution during rules' validation. This should prevent `error evaluating annotation template` errors when some template functions expect non-empty args. See this issue.
- BUGFIX: vmalert: fix truncation of alert expressions in the table and update the table cell layout. See this issue.
- BUGFIX: MetricsQL: properly handle joins on time series filtered by values. For example, `kube_pod_container_resource_requests{resource="cpu"} * on (namespace,pod) group_left() (kube_pod_status_phase{phase=~"Pending|Running"}==1)`. This query could result in `duplicate time series on the right side` error even if `==1` filter leaves only a single time series per `(namespace,pod)` labels. Now such query is properly executed.
- BUGFIX: MetricsQL: properly handle `scalar default vector`, `scalar if vector` and `scalar ifnot vector` queries. Previously such queries could return unexpected results from the `vector` part.
- BUGFIX: Official Grafana dashboards for VictoriaMetrics: take into account `indexdb` when calculating disk space usage. See this issue.
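The v1.77.0 deduplication change listed above (keep the last raw sample per `-dedup.minScrapeInterval` discrete interval instead of the first) can be pictured with the sketch below. It assumes that intervals are aligned to multiples of the interval length, which is an illustrative assumption; the function name is hypothetical and the real implementation may align and handle ties differently.

```python
def deduplicate(samples, interval_ms):
    """Keep the last sample in each discrete interval.

    `samples` is a list of (timestamp_ms, value) pairs sorted by timestamp.
    """
    last_per_bucket = {}
    for ts, value in samples:
        # Assumption: bucket boundaries sit at multiples of interval_ms.
        bucket = ts // interval_ms
        # Later samples overwrite earlier ones, so the last one wins.
        last_per_bucket[bucket] = (ts, value)
    return [last_per_bucket[b] for b in sorted(last_per_bucket)]


samples = [(0, 1.0), (400, 2.0), (999, 3.0), (1000, 4.0), (1500, 5.0)]
print(deduplicate(samples, 1000))  # prints [(999, 3.0), (1500, 5.0)]
```

Keeping the first sample instead would return `(0, 1.0)` and `(1000, 4.0)` for the same input, which is the pre-v1.77.0 behaviour described in the entry.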
v1.76.1
Released at 2022-04-12
Update notes: this release introduces backwards-incompatible changes to communication protocol between `vmselect` and `vmstorage` nodes in cluster version of VictoriaMetrics, so read requests to `vmselect` will fail until the upgrade is complete. These errors will stop after all the `vmselect` and `vmstorage` nodes are updated to the new release. It is safe to downgrade to previous releases.
- FEATURE: vmalert: add support for `alert_relabel_configs` option at `-notifier.config`. This option allows configuring relabeling rules for alerts before sending them to configured notifiers. See these docs for details.
- FEATURE: vmagent: allow passing StatefulSet pod names to `-promscrape.cluster.memberNum` command-line flag. In this case the member number is automatically extracted from the pod name, which must end with the number in the range `0 ... promscrape.cluster.membersCount-1`. For example, `vmagent-0`, `vmagent-1`, etc. See this feature request and these docs.
- BUGFIX: VictoriaMetrics cluster: properly propagate limits at `-search.max*` command-line flags from `vminsert` to `vmstorage`. The limits are `-search.maxUniqueTimeseries`, `-search.maxSeries`, `-search.maxFederateSeries`, `-search.maxExportSeries`, `-search.maxGraphiteSeries` and `-search.maxTSDBStatusSeries`. They weren't propagated to `vmstorage` because of the bug. These limits were introduced in v1.76.0. See this bug.
- BUGFIX: fix goroutine leak and possible deadlock when importing invalid data via native binary format. See this pull request.
- BUGFIX: Graphite Render API: properly calculate hitCount function. Previously it could return empty results if there were no original samples in some parts of the selected time range.
- BUGFIX: MetricsQL: allow overriding built-in function names inside WITH templates. For example, `WITH (sum(a,b) = a + b + 1) sum(x,y)` now expands into `x + y + 1`. Previously such a query would fail with `cannot use reserved name` error. See this bugreport.
- BUGFIX: vmui: properly display values greater than 1000 on Y axis. See this issue.
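The v1.76.1 entry about passing StatefulSet pod names to `-promscrape.cluster.memberNum` describes extracting a trailing number that must fall in the range `0 ... promscrape.cluster.membersCount-1`. A minimal sketch of that rule follows; the function name and error messages are hypothetical, and this is not the vmagent implementation.

```python
import re


def member_num_from_pod_name(pod_name: str, members_count: int) -> int:
    """Extract the trailing number from a pod name such as `vmagent-1`."""
    match = re.search(r"(\d+)$", pod_name)
    if match is None:
        raise ValueError(f"pod name {pod_name!r} must end with a number")
    num = int(match.group(1))
    # The number must identify one of the cluster members: 0 .. members_count-1.
    if not 0 <= num < members_count:
        raise ValueError(f"member number {num} is outside 0..{members_count - 1}")
    return num


print(member_num_from_pod_name("vmagent-0", 3))  # prints 0
print(member_num_from_pod_name("vmagent-1", 3))  # prints 1
```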
v1.76.0
Released at 2022-04-07
Update notes: this release introduces backwards-incompatible changes to communication protocol between `vmselect` and `vmstorage` nodes in cluster version of VictoriaMetrics, so read requests to `vmselect` will fail until the upgrade is complete. These errors will stop after all the `vmselect` and `vmstorage` nodes are updated to the new release. It is safe to downgrade to previous releases.
- FEATURE: vmctl: add ability to verify files obtained via native export. See these docs and this feature request.
- FEATURE: vmui: add pre-defined dashboards for per-job CPU usage, memory usage and disk IO usage. See this pull request for details.
- FEATURE: vmalert: improve compatibility with Prometheus Alert Generator specification. See this pull request.
- FEATURE: vmalert: add `-datasource.disableKeepAlive` command-line flag, which can be used for disabling HTTP keep-alive connections to datasources. This option can be useful for distributing load among multiple datasources behind TCP proxy such as HAProxy.
- FEATURE: Cluster version of VictoriaMetrics: reduce memory usage by up to 50% for `vminsert` and `vmstorage` under high ingestion rate.
- FEATURE: vmgateway: Allow to read `-ratelimit.config` file from URL. Also add `-ratelimit.configCheckInterval` command-line option. See this issue.
- FEATURE: add the following command-line flags, which can be used for fine-grained limiting of CPU and memory usage during various API calls:
  - `-search.maxFederateSeries` for limiting the number of time series, which can be returned from /federate.
  - `-search.maxExportSeries` for limiting the number of time series, which can be returned from /api/v1/export.
  - `-search.maxSeries` for limiting the number of time series, which can be returned from /api/v1/series.
  - `-search.maxTSDBStatusSeries` for limiting the number of time series, which can be scanned during the request to /api/v1/status/tsdb.
  - `-search.maxGraphiteSeries` for limiting the number of time series, which can be scanned during the request to Graphite Render API.

  Previously the `-search.maxUniqueTimeseries` command-line flag was used as a global limit for all these APIs. Now the `-search.maxUniqueTimeseries` is used only for limiting the number of time series, which can be scanned during requests to /api/v1/query and /api/v1/query_range.

  When using cluster version of VictoriaMetrics, these command-line flags (including `-search.maxUniqueTimeseries`) must be passed to `vmselect` instead of `vmstorage`.
- BUGFIX: vmagent and vmauth: reduce the probability of `TLS handshake error from XX.XX.XX.XX: EOF` errors when `-remoteWrite.url` points to HTTPS URL at `vmauth`. See this issue.
- BUGFIX: return `Content-Type: text/html` response header when requesting `/` HTTP path at VictoriaMetrics components. Previously `text/plain` response header was returned, which could lead to broken page formatting. See this issue.
- BUGFIX: Graphite Render API: accept floating-point values for maxDataPoints query arg, since some clients send floating-point values instead of integer values for this arg.
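The Graphite Render API fix above can be illustrated with a small sketch. It is not the VictoriaMetrics implementation: the function name `parse_max_data_points` is hypothetical, and truncating the fractional part is an assumption, since the changelog does not say how fractional values are rounded.

```python
from urllib.parse import parse_qs


def parse_max_data_points(query_string: str, default: int = 0) -> int:
    """Return maxDataPoints as an int, accepting both '500' and '500.0'.

    Hypothetical helper for illustration only.
    """
    values = parse_qs(query_string).get("maxDataPoints")
    if not values:
        return default
    # Parse as float first, so integer and floating-point spellings both work,
    # then truncate to a whole number of points (truncation is an assumption).
    return int(float(values[0]))
```

A client that sends `maxDataPoints=500.0` is then handled the same way as one that sends `maxDataPoints=500`.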
v1.75.1
Released at 2022-03-28
- BUGFIX: update base image for VictoriaMetrics from `alpine-3.15.0` to `alpine-3.15.2`. This fixes CVE-2022-0778. See alpine 3.15.2 release docs.
v1.75.0
Released at 2022-03-18
Update notes: this release contains a breaking change to vmalert's API introduced in ee396b5. It replaces the `api/v1/groups` API handler with the `api/v1/rules` handler in order to become compatible with the alerts generator specification. See other changes introduced to vmalert here.
- FEATURE: VictoriaMetrics cluster: add support for mTLS communications between cluster components. See these docs and this feature request.
- FEATURE: vmalert: add ability to use OAuth2 for `-datasource.url`, `-notifier.url` and `-remoteRead.url`. See the corresponding command-line flags containing `oauth2` in their names here.
- FEATURE: vmalert: add ability to use Bearer Token for `-notifier.url` via `-notifier.bearerToken` and `-notifier.bearerTokenFile` command-line flags. See this issue.
- FEATURE: vmalert: add `sortByLabel` template function in order to be consistent with Prometheus. See these docs for more details.
- FEATURE: vmalert: improve compliance with Prometheus Alert Generator Specification.
- FEATURE: vmalert: add `-rule.resendDelay` command-line flag, which specifies the minimum amount of time to wait before resending an alert to Alertmanager. This is equivalent to the `-rules.alert.resend-delay` option from Prometheus. See this feature request.
- FEATURE: vmauth: transparently treat `Authorization: Token ...` request headers as `Authorization: Bearer ...` request headers. This allows sending requests to `vmauth` from InfluxDB clients. See this issue. Thanks to @dcircelli for the pull request.
- FEATURE: do not log trivial network errors such as `broken pipe` and `connection reset by peer`. These errors could occur when writing data to a client, which closes the connection to VictoriaMetrics due to request timeout or similar reason. See this issue.
- BUGFIX: Graphite Render API: return an additional point after `until` timestamp in the same way as Graphite does. Previously VictoriaMetrics didn't return this point, which could result in a missing last point on the graph.
- BUGFIX: properly locate series with the given `name` and without the given `label` when using the `name{label=~"foo|"}` series selector. Previously such series could be skipped. See this issue. Thanks to @jduncan0000 for discovering and fixing the issue.
- BUGFIX: properly free up memory occupied by deleted cache entries for the following caches: `indexdb/dataBlocks`, `indexdb/indexBlocks`, `storage/indexBlocks`. This should reduce the increased memory usage starting from v1.73.0. See this and this issue.
- BUGFIX: reduce the interval for checking for free disk space from 30 seconds to 1 second. This should reduce the probability of `no space left on device` panics when `-storage.minFreeDiskSpaceBytes` is set to too low values. See this issue.
- BUGFIX: vmagent: prevent vmagent from panicking when importing a time series with a big number of samples. See this issue. Thanks to @bleedfish for discovering and fixing the issue.
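The `name{label=~"foo|"}` fix in this release relies on two Prometheus selector rules: regex matchers are fully anchored, and a missing label is treated as an empty value. The following sketch shows why a series without `label` must match. The `matches` helper is hypothetical, and Python's `re` module stands in for the RE2 syntax used by the real engine, so it is only reliable for simple patterns like this one.

```python
import re


def matches(series_labels: dict, name: str, label: str, regex: str) -> bool:
    """Check a series against name{label=~"regex"} (illustrative helper)."""
    if series_labels.get("__name__") != name:
        return False
    # A label that is absent from the series is compared as an empty string.
    value = series_labels.get(label, "")
    # The regex must match the whole value, not just a part of it.
    return re.fullmatch(regex, value) is not None


with_label = {"__name__": "name", "label": "foo"}
without_label = {"__name__": "name"}
other_value = {"__name__": "name", "label": "bar"}
```

Both `with_label` and `without_label` satisfy the selector, because the empty alternative in `foo|` matches the empty value; `other_value` does not.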
v1.74.0
Released at 2022-03-03
Update notes: In this release VictoriaMetrics may use some extra memory due to issues #2242 and #2007. These issues were addressed in v1.75.0, so we recommend updating straight to it.
- FEATURE: add support for conditional relabeling via `if` filter. The `if` filter can contain arbitrary series selector. For example, the following rule drops targets matching `foo{bar="baz"}` series selector:

      - action: drop
        if: 'foo{bar="baz"}'

  This rule is equivalent to the less clear traditional one:

      - action: drop
        source_labels: [__name__, bar]
        regex: 'foo;baz'

  See relabeling docs and this issue for more details.
- FEATURE: reduce memory usage for various caches under high churn rate.
- FEATURE: vmagent: re-use Kafka client when pushing data from many tenants to Kafka. Previously a separate Kafka client was created per each tenant. This could lead to increased load on Kafka. See how to push data from vmagent to Kafka.
- FEATURE: improve performance when registering new time series. See this issue. Thanks to @ahfuzhang.
- BUGFIX: return the proper number of datapoints from `moving*()` functions such as `movingAverage()` in Graphite Render API. Previously these functions could return too many samples if maxDataPoints query arg was explicitly passed to `/render` API.
- BUGFIX: properly handle series selector containing a filter for multiple metric names plus a negative filter. For example, `{__name__=~"foo|bar",job!="baz"}`. Previously VictoriaMetrics could return series with `foo` or `bar` names and with `job="baz"`. See this issue.
- BUGFIX: vmgateway: properly parse JWT tokens if they are encoded with URL-safe base64 encoding.
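As background for the vmgateway fix above: JWT segments use the URL-safe base64 alphabet (`-` and `_` instead of `+` and `/`) and omit the `=` padding. A minimal sketch of decoding the claims follows. The helper names are hypothetical, the claims in the usage note are made up, and the code does not verify the signature, so it is not a substitute for a real JWT library.

```python
import base64
import json


def decode_jwt_segment(segment: str) -> dict:
    """Decode one base64url-encoded JSON segment of a JWT."""
    # JWT strips '=' padding, so restore it to a multiple of 4 characters.
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def jwt_claims(token: str) -> dict:
    """Return the payload claims of a JWT without verifying its signature."""
    _header_b64, payload_b64, _signature_b64 = token.split(".")
    return decode_jwt_segment(payload_b64)
```

For example, `jwt_claims(token)` on a token whose payload encodes `{"sub": "user-1"}` returns that dictionary regardless of whether the payload segment contains `-` or `_` characters.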
v1.73.1
Released at 2022-02-22
Update notes: In this release VictoriaMetrics may use some extra memory due to issues #2242 and #2007. These issues were addressed in v1.75.0, so we recommend updating straight to it.
- FEATURE: allow overriding default limits for the following in-memory caches, which usually occupy the most memory:
  - `storage/tsid` - the cache speeds up lookups of internal metric ids by `metric_name{labels...}` during data ingestion. The size for this cache can be tuned with `-storage.cacheSizeStorageTSID` command-line flag.
  - `indexdb/dataBlocks` - the cache speeds up data lookups in `<-storageDataPath>/indexdb` files. The size for this cache can be tuned with `-storage.cacheSizeIndexDBDataBlocks` command-line flag.
  - `indexdb/indexBlocks` - the cache speeds up index lookups in `<-storageDataPath>/indexdb` files. The size for this cache can be tuned with `-storage.cacheSizeIndexDBIndexBlocks` command-line flag.

  See also cache tuning docs. See this issue.
- FEATURE: add `-influxDBLabel` command-line flag for overriding `db` label name for the data imported into VictoriaMetrics via InfluxDB line protocol. Thanks to @johnatannvmd for the pull request.
- FEATURE: return `X-Influxdb-Version` HTTP header in responses to InfluxDB write requests. This is needed for some InfluxDB clients. See this comment and this issue.
- BUGFIX: reduce memory usage during the first three hours after the upgrade from versions older than v1.73.0. The memory usage spike was caused by the need to re-populate in-memory caches after the upgrade because of the fix for this issue. Now cache size limits are reduced in order to occupy less memory during the upgrade.
- BUGFIX: fix a bug, which could significantly slow down requests to `/api/v1/labels` and `/api/v1/label/<label_name>/values`. These APIs are used by Grafana for auto-completion of label names and label values. See this issue.
- BUGFIX: vmalert: add support for `$externalLabels` and `$externalURL` template vars in the same way as Prometheus does. See this issue.
- BUGFIX: vmalert: make sure notifiers are discovered during initialization if they are configured via `consul_sd_configs`. Previously they could be discovered only 30 seconds (the default value for `-promscrape.consulSDCheckInterval` command-line flag) after the initialization. See this pull request.
- BUGFIX: update default value for `-promscrape.fileSDCheckInterval`, so it matches default duration used by Prometheus for checking for updates in `file_sd_configs`. See this issue. Thanks to @corporate-gadfly for the fix.
- BUGFIX: VictoriaMetrics cluster: do not return partial responses from `vmselect` if at least a single `vmstorage` node was reachable and returned an app-level error. Such errors are usually related to cluster mis-configuration, so they must be returned to the caller instead of being masked by partial responses. Partial responses can be returned only if some of `vmstorage` nodes are unreachable during the query. This may help the following issues: one, two.
v1.73.0
Released at 2022-02-14
Update notes: In this release VictoriaMetrics may use some extra memory described in issues #2242 and #2007. These issues were addressed in v1.75.0, so we recommend updating straight to it.
- FEATURE: publish VictoriaMetrics binaries for MacOS amd64 and MacOS arm64 (aka MacBook M1) at releases page. See this issue and this issue.
- FEATURE: reduce CPU and disk IO usage during `indexdb` rotation once per `-retentionPeriod`. See this issue.
- FEATURE: VictoriaMetrics cluster: add `-dropSamplesOnOverload` command-line flag for `vminsert`. If this flag is set, then `vminsert` drops incoming data if the destination `vmstorage` is temporarily unavailable or cannot keep up with the ingestion rate. The number of dropped rows can be monitored via `vm_rpc_rows_dropped_on_overload_total` metric at `vminsert`.
- FEATURE: VictoriaMetrics cluster: improve re-routing logic, so it re-routes incoming data more evenly if some of `vmstorage` nodes are temporarily unavailable and/or accept data at slower rate than other `vmstorage` nodes. Also significantly reduce possible re-routing storm when `vminsert` runs with `-disableRerouting=false` command-line flag. This should help the following issues: one, two, three, four, five.
- FEATURE: MetricsQL: cover more cases with the label filters' propagation optimization. This should improve the average performance for practical queries. The following cases are additionally covered:
  - Multi-level transform functions. For example, `abs(round(foo{a="b"})) + bar{x="y"}` is now optimized to `abs(round(foo{a="b",x="y"})) + bar{a="b",x="y"}`
  - Binary operations with `on()`, `without()`, `group_left()` and `group_right()` modifiers. For example, `foo{a="b"} on (a) + bar` is now optimized to `foo{a="b"} on (a) + bar{a="b"}`
  - Multi-level binary operations. For example, `foo{a="b"} + bar{x="y"} + baz{z="q"}` is now optimized to `foo{a="b",x="y",z="q"} + bar{a="b",x="y",z="q"} + baz{a="b",x="y",z="q"}`
  - Aggregate functions. For example, `sum(foo{a="b"}) by (c) + bar{c="d"}` is now optimized to `sum(foo{a="b",c="d"}) by (c) + bar{c="d"}`
- FEATURE: MetricsQL: optimize joining with `*_info` labels. For example: `kube_pod_created{namespace="prod"} * on (uid) group_left(node) kube_pod_info` now automatically adds the needed filters on `uid` label to `kube_pod_info` before selecting series for the right side of `*` operation. This may save CPU, RAM and disk IO resources. See this article for details on `*_info` labels. See this issue.
- FEATURE: all: improve performance for arm64 builds of VictoriaMetrics components by up to 15%. See this pull request.
- FEATURE: all: expose `process_cpu_cores_available` metric, which shows the number of CPU cores available to the app. The number can be fractional if the corresponding cgroup limit is set to a fractional value. This metric is useful for alerting on CPU saturation. For example, the following query alerts when the app uses more than 90% of CPU during the last 5 minutes: `rate(process_cpu_seconds_total[5m]) / process_cpu_cores_available > 0.9`. See this issue.
- FEATURE: vmalert: add ability to configure notifiers (e.g. alertmanager) via a file in the way similar to Prometheus. See these docs and this pull request.
- FEATURE: vmalert: add support for Consul service discovery for notifiers. See this issue.
- FEATURE: vmalert: add support for specifying Basic Auth password for notifiers via a file. See this issue.
- FEATURE: vmagent: provide the ability to fetch target responses on behalf of `vmagent` by clicking the `response` link for the needed target at `/targets` page. This feature may be useful for debugging responses from targets located in isolated environments.
- FEATURE: vmagent: show the total number of scrapes and the total number of scrape errors per target at `/targets` page. This information may be useful when debugging unreliable scrape targets.
- FEATURE: vmagent and single-node VictoriaMetrics: disallow unknown fields at `-promscrape.config` file. Previously unknown fields were allowed. This could lead to long-living silent config errors. The previous behaviour can be restored by passing `-promscrape.config.strictParse=false` command-line flag.
- FEATURE: vmagent: add `__meta_kubernetes_endpointslice_label*` and `__meta_kubernetes_endpointslice_annotation*` labels for `role: endpointslice` targets in kubernetes_sd_config to be consistent with other `role` values. See this issue.
- FEATURE: vmagent: add `collapse all` and `expand all` buttons to `http://vmagent:8429/targets` page. See this issue.
- FEATURE: vmagent: support Prometheus-like durations in `-promscrape.config`. See this comment.
- FEATURE: automatically re-read `-tlsCertFile` and `-tlsKeyFile` files, so their contents can be updated without the need to restart VictoriaMetrics apps. See this issue.
- BUGFIX: calculate absent_over_time() in the same way as Prometheus does. Previously it could return multiple time series instead of at most one time series like Prometheus does. See this issue.
- BUGFIX: return proper results from `highestMax()` function at Graphite render API. Previously it was incorrectly returning timeseries with min peaks instead of max peaks.
- BUGFIX: properly limit indexdb cache sizes. Previously they could exceed values set via `-memory.allowedPercent` and/or `-memory.allowedBytes` when `indexdb` contained many data parts. See this issue.
- BUGFIX: vmui: fix a bug, which could break time range picker when editing `From` or `To` input fields. See this issue.
- BUGFIX: vmui: fix a bug, which could break switching between `graph`, `json` and `table` views. See this issue.
- BUGFIX: vmui: fix possible UI freeze after querying `node_uname_info` time series. See this issue.
- BUGFIX: show the original location of the warning or error message when logging throttled messages. Previously the location inside `lib/logger/throttler.go` was shown. This could increase the complexity of debugging.
- BUGFIX: vmalert: fix links at web UI. See this issue.
- BUGFIX: vmagent: properly discover pods without exposed ports for the given service for `role: endpoints` and `role: endpointslice` in kubernetes_sd_config. See this issue.
- BUGFIX: vmagent: properly display `zone` contents for `gce_sd_configs` section at `http://vmagent:8429/config` page. See this issue. Thanks to @artifactori for the bugfix.
- BUGFIX: vmagent: properly handle `all_tenants: true` config option at `openstack_sd_config`. See this issue.
v1.72.0
Released at 2022-01-18
- FEATURE: MetricsQL: add support for `@` modifier, which is enabled by default in Prometheus starting from Prometheus v2.33.0. See these docs and this feature request. VictoriaMetrics extends `@` modifier with the following additional features:
  - It can contain arbitrary expression. For example, `foo @ (end() - 1h)` would return `foo` value at `end - 1 hour` timestamp on the selected time range `[start ... end]`. Another example: `foo @ (now() - 10m)` would return `foo` value 10 minutes ago from the current time.
  - It can be put everywhere in the query. For example, `sum(foo) @ start()` would calculate `sum(foo)` at `start` timestamp on the selected time range `[start ... end]`.
- FEATURE: MetricsQL: add support for optional `keep_metric_names` modifier, which can be applied to all the rollup functions and transform functions. This modifier prevents metric names from being dropped from function results. For example, `rate({__name__=~"foo|bar"}[5m]) keep_metric_names` leaves `foo` and `bar` metric names in `rate()` results. This feature provides an additional workaround for this issue.
- FEATURE: vmagent: add support for Kubernetes service discovery in the current namespace in the same way as Prometheus does. For example, the following config limits pod discovery to the namespace where vmagent runs:

      scrape_configs:
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
        - role: pod
          namespaces:
            own_namespace: true
- FEATURE: vmagent: add `__meta_kubernetes_node_provider_id` label for discovered Kubernetes nodes in the same way as Prometheus does.
- FEATURE: vmagent: log error message when remote storage returns 400 or 409 http errors. This should simplify detection and debugging of this case. See this issue.
- FEATURE: vmagent: expose `promscrape_stale_samples_created_total` metric for monitoring the total number of created stale samples when scraping Prometheus targets. See these docs for the information on when stale samples (aka staleness markers) can be created.
- FEATURE: vmrestore: store `restore-in-progress` file in `-dst` directory while `vmrestore` is running. This file is automatically deleted when `vmrestore` is successfully finished. This helps detect incompletely restored data on VictoriaMetrics start. See this issue.
- FEATURE: vmctl: print the last sample timestamp when the data migration is interrupted either by user or by error. This helps continue the data migration from the interruption moment. See this issue.
- FEATURE: vmalert: expose `vmalert_remotewrite_total` metric at `/metrics` page. This makes it possible to calculate SLOs for error rate during writing recording rules and alert state to `-remoteWrite.url` with the query `vmalert_remotewrite_errors_total / vmalert_remotewrite_total`. See this issue. Thanks to @afoninsky.
- FEATURE: vmalert: add `stripPort` template function in the same way as Prometheus does.
- FEATURE: vmalert: add `parseDuration` template function in the same way as Prometheus does.
- FEATURE: MetricsQL: add `stale_samples_over_time(m[d])` function for calculating the number of staleness marks for time series `m` over the duration `d`. This function may be useful for detecting flapping metrics at scrape targets, which periodically disappear and then appear again.
- FEATURE: vmgateway: add support for `extra_filters` option. See this issue.
- FEATURE: vmui: improve UX according to this feature request. Thanks to @Loori-R.
- FEATURE: vmui: limit the number of requests sent to VictoriaMetrics during zooming / scrolling. See this issue.
- BUGFIX: vmagent: make sure that `vmagent` replicas scrape the same targets at different time offsets when replication is enabled in vmagent clustering mode. This guarantees that the deduplication consistently leaves samples from the same `vmagent` replica.
- BUGFIX: return the proper response stub from `/api/v1/query_exemplars` handler, which is needed for Grafana v8+. See this issue.
- BUGFIX: vmctl: fix a few edge cases and improve migration speed for OpenTSDB importer. See this pull request.
- BUGFIX: fix possible data race when searching for time series matching `{key=~"value|"}` filter over time range covering multiple days. See this pull request. Thanks to @waldoweng for the provided fix.
- BUGFIX: vmagent: do not send staleness markers on graceful shutdown. This follows Prometheus behavior. See this comment.
- BUGFIX: vmagent: properly set `__address__` label in `dockerswarm_sd_config`. See this issue. Thanks to @ashtuchkin for the fix.
- BUGFIX: vmui: fix incorrect calculations for graph limits on y axis. This could result in incorrect graph rendering in some cases. See this issue.
- BUGFIX: vmui: fix handling for multi-line queries. See this issue.
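To make the `stale_samples_over_time(m[d])` item above more concrete, here is a rough sketch of what counting staleness markers means. Prometheus encodes a staleness marker as a NaN with a specific bit pattern, so the sketch compares raw 64-bit values instead of floats. The function names, the sample data and the `(end - d, end]` window boundaries are assumptions for illustration, not the MetricsQL implementation.

```python
import struct

# Bit pattern Prometheus uses for staleness markers (a special NaN).
STALE_NAN_BITS = 0x7FF0000000000002


def float_bits(value: float) -> int:
    """Return the IEEE 754 bit pattern of a float as an unsigned 64-bit int."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def stale_samples_over_time(samples, end_ts: float, window: float) -> int:
    """Count staleness markers among (timestamp, value_bits) pairs.

    Only samples with end_ts - window < timestamp <= end_ts are considered
    (the exact boundary handling is an assumption).
    """
    start_ts = end_ts - window
    return sum(
        1
        for ts, bits in samples
        if start_ts < ts <= end_ts and bits == STALE_NAN_BITS
    )


# A hypothetical flapping series: it disappears twice within 50 seconds.
samples = [
    (10, float_bits(1.0)),
    (20, STALE_NAN_BITS),
    (30, float_bits(2.0)),
    (40, STALE_NAN_BITS),
    (50, float_bits(float("nan"))),  # an ordinary NaN is not a staleness marker
]
```

A result that keeps growing across consecutive windows suggests a target whose series periodically disappear and reappear.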
Previous releases
See changes for older releases here.