Commit graph

2668 commits

Author SHA1 Message Date
Zakhar Bessarab
2fe33b3d97
app/vmalert/datasource/graphite: allow overriding "from" parameter for datasource queries (#4687)
* app/vmalert/datasource/graphite: allow overriding "from" parameter for datasource queries

Fixes construction of URL parameters for graphite render to allow overriding "from" parameter.

See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4685
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* app/vmalert/datasource/graphite: update flow for building URL parameters

Makes flow of building URL parameters same as Prometheus datasource has:
1) Setting all default values
2) Merging those values with provided `extraParams`

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* Update docs/CHANGELOG.md

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2023-07-22 14:18:52 -07:00
Aliaksandr Valialkin
8c59813c17
app/vlinsert/loki: fix build for architectures where int is 32-bit 2023-07-20 21:55:28 -07:00
Aliaksandr Valialkin
324a3c5288
lib/promscrape: follow-up after 6aa50ca954
- Improve docs
- Hide `debug relabeling` column when -promscrape.dropOriginalLabels command-line flag is set
- Inline the code from the added template functions, since the code is harder to follow
  with the template functions, especially when these functions have misleading names.
  Also, these functions are used only in one place, e.g. they do not reduce the amounts of code.
- Hide `click to show original labels` title at `labels` column when original labels aren't available.
- Show the reason on whey original labels aren't available at /service-discovery page.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4597
2023-07-20 21:54:09 -07:00
Aliaksandr Valialkin
c921bc0833
app/{vmselect,vlselect}: run make vmui-update vmui-logs-update after recent changes to VMUI
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4604
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4676
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4294
2023-07-20 21:53:51 -07:00
Yury Molodov
7757a31a73
fix: change getting serverUrl for vmui-logs (#4604) (#4677) 2023-07-20 21:53:30 -07:00
Yury Molodov
02a182e2a1
feat: change columns for active queries (#4676) 2023-07-20 21:53:12 -07:00
Yury Molodov
15e1d16afc
vmui: enhancements multiline field editing (#4294)
* fix: change textarea for relabel page

* feat: add comment for monaco theme

* fix: change behavior of multiline fields

* vmui: merge master
2023-07-20 21:52:53 -07:00
Aliaksandr Valialkin
30098ac8bd
app/vlinsert/loki: follow-up after 09df5b66fd
- Parse protobuf if Content-Type isn't set to `application/json` - this behavior is documented at https://grafana.com/docs/loki/latest/api/#push-log-entries-to-loki

- Properly handle gzip'ped JSON requests. The `gzip` header must be read from `Content-Encoding` instead of `Content-Type` header

- Properly flush all the parsed logs with the explicit call to vlstorage.MustAddRows() at the end of query handler

- Check JSON field types more strictly.

- Allow parsing Loki timestamp as floating-point number. Such a timestamp can be generated by some clients,
  which store timestamps in float64 instead of int64.

- Optimize parsing of Loki labels in Prometheus text exposition format.

- Simplify tests.

- Remove lib/slicesutil, since there are no more users for it.

- Update docs with missing info and fix various typos. For example, it should be enough to have `instance` and `job` labels
  as stream fields in most Loki setups.

- Allow empty of missing timestamps in the ingested logs.
  The current timestamp at VictoriaLogs side is then used for the ingested logs.
  This simplifies debugging and testing of the provided HTTP-based data ingestion APIs.

The remaining MAJOR issue, which needs to be addressed: victoria-logs binary size increased from 13MB to 22MB
after adding support for Loki data ingestion protocol at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4482 .
This is because of shitty protobuf dependencies. They must be replaced with another protobuf implementation
similar to the one used at lib/prompb or lib/prompbmarshal .
2023-07-20 21:52:11 -07:00
hagen1778
1d2a0e0e10
docs: fix the next release version for vmalert
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-07-20 21:49:34 -07:00
hagen1778
f7d60613a9
docs: specify min version and limitations for vmalert's unit tests
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-07-20 21:27:50 -07:00
Aliaksandr Valialkin
a0b7def89d
app/vmselect/promql: fix tests after 781947a7e2 2023-07-20 21:25:30 -07:00
Haleygo
939c8b8372
vmalert: init unit test (#4596)
vmalert: support unit tests

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2945
---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2023-07-20 21:19:45 -07:00
hagen1778
a0cc6eb7a1
docs: mention the simplest way to migrate data in vmctl docs
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-07-20 21:12:18 -07:00
Dmytro Kozlov
f0d8f77e6d
app/vmagent: fix creating target id if --promscrape.dropOriginalLabels flag was used (#4616)
* app/vmagent: fix creating target id if `--promscrape.dropOriginalLabels` flag was used

* app/vmagent: hide links if OriginalLabels was dropped

* app/vmagent: update CHANGELOG.md and added information to the docs

* app/vmagent: fix comments
2023-07-20 19:21:41 -07:00
Zakhar Bessarab
7e24edd506
app/vlinsert/loki: fix compatibility with latest MetricsQL lib (#4675)
Loki uses default labels format without "or" operator. This format can't create a list of LabelFilters, so only first set of LabelFilters should be used.

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-07-20 17:06:09 -07:00
Zakhar Bessarab
5b3cbd4db1
app/vlinsert: add support of loki push protocol (#4482)
* app/vlinsert: add support of loki push protocol

- implemented loki push protocol for both Protobuf and JSON formats
- added examples in documentation
- added example docker-compose

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* app/vlinsert: move protobuf metric into its own file

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* deployment/docker/victorialogs/promtail: update reference to docker image

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* deployment/docker/victorialogs/promtail: make volume name unique

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* app/vlinsert/loki: add license reference

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* deployment/docker/victorialogs/promtail: fix volume name

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* docs/VictoriaLogs/data-ingestion: add stream fields for loki JSON ingestion example

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* app/vlinsert/loki: move entities to places where those are used

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* app/vlinsert/loki: refactor to use common components

- use CommonParameters from insertutils
- stop ingestion after first error similar to elasticsearch and jsonline

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* app/vlinsert/loki: address review feedback

- add missing logstorage.PutLogRows calls
- refactor tenant ID parsing to use common function
- reduce number of allocations for parsing by reusing  logfields slices
- add tests and benchmarks for requests processing funcs

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-07-20 16:49:43 -07:00
Aliaksandr Valialkin
0cbe5ccb4a
app/vmselect: rename promql.WriteActiveQueries() to promql.ActiveQueriesHandler()
This makes it more consistent with the rest of handlers inside app/vmselect/main.go

This is a follow-up for 6a96fd8ed5

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4598
2023-07-20 11:30:40 -07:00
Aliaksandr Valialkin
992c300ce9
all: replace atomic.Value with atomic.Pointer[T]
This eliminates the need in .(*T) casting for results obtained from Load()

Leave atomic.Value for map, since atomic.Pointer[map[...]...] makes double pointer to map,
because map is already a pointer type.
2023-07-19 17:48:26 -07:00
Yury Molodov
44b4963ac6
feat: optimize vmui-log bundle size (#4602)
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-07-19 16:52:18 -07:00
Aliaksandr Valialkin
443c266406
app/vmalert/README.md: sync with docs/vmalert.md after 54b7bd4564 2023-07-19 16:31:30 -07:00
Yury Molodov
3ad80e281f
vmui: add Active Queries page (#4653)
* feat: add page to display a list of active queries (#4598)

* app/vmagent: code formatting

* fix: remove console

---------

Co-authored-by: dmitryk-dk <kozlovdmitriyy@gmail.com>
2023-07-19 16:02:58 -07:00
dependabot[bot]
9d55da5d26
build(deps-dev): bump word-wrap in /app/vmui/packages/vmui (#4664)
Bumps [word-wrap](https://github.com/jonschlinkert/word-wrap) from 1.2.3 to 1.2.4.
- [Release notes](https://github.com/jonschlinkert/word-wrap/releases)
- [Commits](https://github.com/jonschlinkert/word-wrap/compare/1.2.3...1.2.4)

---
updated-dependencies:
- dependency-name: word-wrap
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-19 15:43:59 -07:00
Roman Khavronenko
80768d53dd
docs: follow-up after aec4b5db81 (#4638)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-07-19 14:48:17 -07:00
Roman Khavronenko
debe1793bb
vmalert: follow-up after d4ac4b7813 (#4659)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-07-18 16:03:28 -07:00
venkatbvc
bd2a37429c
vmalert: allow to blackhole alerting notifications (#4639)
vmalert: support option to blackhole alerting notifications

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4122

---------

Co-authored-by: Rao, B V Chalapathi <b_v_chalapathi.rao@nokia.com>
2023-07-18 16:02:48 -07:00
Yury Molodov
cde32da04f
vmui: add tip to Explore Metrics page (#4615)
* feat: add tip to Explore Metrics page (#4248)

* fix: update description page
2023-07-18 16:01:24 -07:00
Aliaksandr Valialkin
5ace0701d3
app/vmselect/promql: add the ability to copy all the labels from one side of group_left()/group_right() operation
This is performed by specifying `*` inside group_left()/group_right().
Also allow specifying prefix for the copied labels via `group_left(...) prefix "..."` and `group_right(...) prefix "..."` syntax.
For example, the following query adds all the namespace-related labels to pod info, and prefixes all the copied label names with "ns_" prefix:

  kube_pod_info * on(namespace) group_left(*) prefix "ns_" kube_namespace_labels

This resolves the following StackOverflow questions:

- https://stackoverflow.com/questions/76661818/how-to-add-namespace-labels-to-pod-labels-in-prometheus
- https://stackoverflow.com/questions/76653997/how-can-i-make-a-new-copy-of-kube-namespace-labels-metric-with-a-different-name
2023-07-17 16:58:30 -07:00
Aliaksandr Valialkin
cc54fa2a56
app/vmselect/promql: recommend to use (a op b) keep_metric_names instead of a op b keep_metric_names
The `a op b keep_metric_names` is ambigouos to `a op (b keep_metric_names)` when `b` is a transform or rollup function.
For example, `a + rate(b) keep_metric_names`. So it is better to use more clear syntax: `(a op b) keep_metric_names`

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3710
2023-07-16 23:47:15 -07:00
Zakhar Bessarab
781947a7e2
metricsql: add support of using keep_metric_names for binary operations (#4109)
* metricsql: add support of using keep_metric_names for binary operations

This should help to avoid confusion with queries like one in the issue #3710.

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* wip

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-07-16 03:01:27 -07:00
Aliaksandr Valialkin
a7fdc3fcc7
all: add support for or filters in series selectors
This commit adds ability to select series matching distinct filters via a single series selector.
For example, the following selector selects series with either {env="prod",job="a"}
or {env="dev",job="b"} labels:

  {env="prod",job="a" or env="dev",job="b"}

The `or` filter is supported in all the VictoriaMetrics tools now.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3997
Uses https://github.com/VictoriaMetrics/metricsql/pull/14
2023-07-15 23:56:18 -07:00
Aliaksandr Valialkin
6a65af5112
all: replace ElasticSearch -> Elasticsearch for the sake of consistency
This is a follow-up for 7f6b5dc47b
2023-07-14 10:52:43 -07:00
Haleygo
5e5c805599
vmalert: fix evalTS after modify group interval (#4629) 2023-07-14 10:47:29 -07:00
Roman Khavronenko
89b5a6a4d5
vmctl: mention replicationFactor during migration (#4633)
Addresses https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4624

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-07-14 10:36:46 -07:00
dependabot[bot]
a15a66ee89
build(deps): bump tough-cookie in /app/vmui/packages/vmui (#4603)
Bumps [tough-cookie](https://github.com/salesforce/tough-cookie) from 4.1.2 to 4.1.3.
- [Release notes](https://github.com/salesforce/tough-cookie/releases)
- [Changelog](https://github.com/salesforce/tough-cookie/blob/master/CHANGELOG.md)
- [Commits](https://github.com/salesforce/tough-cookie/compare/v4.1.2...v4.1.3)

---
updated-dependencies:
- dependency-name: tough-cookie
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-13 22:22:53 -07:00
Aliaksandr Valialkin
e1cf962bad
lib/storage: switch from global to per-day index for MetricName -> TSID mapping
Previously all the newly ingested time series were registered in global `MetricName -> TSID` index.
This index was used during data ingestion for locating the TSID (internal series id)
for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names).

The `MetricName -> TSID` index is stored on disk in order to make sure that the data
isn't lost on VictoriaMetrics restart or unclean shutdown.

The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding
data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache,
and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics
uses in-memory cache for speeding up the lookup for active time series.
This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested
active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk.

VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases:

- If `storage/tsid` cache capacity isn't enough for active time series.
  Then just increase available memory for VictoriaMetrics or reduce the number of active time series
  ingested into VictoriaMetrics.

- If new time series is ingested into VictoriaMetrics. In this case it cannot find
  the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index,
  since it doesn't know that the index has no the corresponding entry too.
  This is a typical event under high churn rate, when old time series are constantly substituted
  with new time series.

Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index,
are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics.

Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName`
for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod.
This index can become very large under high churn rate and long retention. VictoriaMetrics
caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups.
The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing
recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series.

This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly
reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics
consults only the index for the current day when new time series is ingested into it.

The downside of this change is increased indexdb size on disk for workloads without high churn rate,
e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store
identical `MetricName -> TSID` entries for static time series for every day.

This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation,
since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 .

At the same time the change fixes the issue, which could result in lost access to time series,
which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698

The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed
in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685

This is a follow-up for 1f28b46ae9
2023-07-13 17:03:50 -07:00
Aliaksandr Valialkin
650af7c5ca
app/vmalert: silence golagci-lint at TestAlertingRule_Template
Add a break if gotAlert is nil

This removes the following golangci-lint warning:

app/vmalert/alerting_test.go:868:8: SA5011(related information): this check suggests that the pointer can be nil (staticcheck)
				if gotAlert == nil {
				   ^
2023-07-13 12:16:00 -07:00
Dmytro Kozlov
f31ac064f9
app/vmctl: fix panic --remote-read-filter-time-start flag not defined (#4605)
* app/vmctl: fix panic `--remote-read-filter-time-start` flag not defined

* app/vmctl: update CHANGELOG.md

---------

Co-authored-by: Nikolay <nik@victoriametrics.com>
2023-07-13 12:13:21 -07:00
Roman Khavronenko
fdccb56620
vmalert: check for negative offset for missed rounds (#4628)
It could happen for low evaluation intervals and irregular
delays during execution that evaluation time would get
a negative offset. This could result into cumulative
discrepancy between the actual time and evaluation time for rules.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-07-13 12:05:52 -07:00
Aliaksandr Valialkin
b07a1c85b9
all: update Go builder from 1.20.5 to 1.20.6
See https://github.com/golang/go/issues?q=milestone%3AGo1.20.6+label%3ACherryPickApproved
2023-07-12 01:00:24 -07:00
Dmytro Kozlov
5c4ca4aea8
app/vmctl: remove undefined flag from the documentation. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4552. (#4606) 2023-07-10 15:01:54 -07:00
Aliaksandr Valialkin
f65153018b
app/{vmselect,vlselect}: run make vmui-update vmui-logs-update 2023-07-09 12:44:04 -07:00
Zakhar Bessarab
ddd918b93c
docs: make httpAuth.* flags description less ambiguous (#4588)
* docs: make `httpAuth.*` flags description less ambiguous

Currently, it may confuse users whether `httpAuth.*` flags are used by HTTP client or server configuration(see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4586 for example).

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* docs: fix a typo

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-07-09 12:36:14 -07:00
Haleygo
ef8e3eb9b3
vmselect: fix result in Prometheus query when time is small (#4578)
vmselect: fix result in Prometheus query when time is small

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2023-07-09 12:33:29 -07:00
Aliaksandr Valialkin
e1a2404db5
app/vmselect/netstorage: follow-up after 173ccf4333
- Clarify docs about -replicationFactor command-line flag at vmselect
- Clarify description for -replicationFactor and -search.skipSlowReplicas command-line flags
- Fix the logic for returning responses if -search.skipSlowReplicas command-line flag
  is enabled. The logic was broken in the 173ccf4333,
  so it could return responses only if some of vmstorage nodes return error,
  while it should return when query results are successfully collected from more than
  (len(storageNodes) - replicationFactor) vmstorage nodes.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1207
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711
2023-07-09 11:58:22 -07:00
Haleygo
3c2308fd52
vmalert:fix query request using rfc3339 format (#4577)
vmalert: consistently use time.RFC3339 format for time in queries

Co-authored-by: hagen1778 <roman@victoriametrics.com>
2023-07-09 11:03:10 -07:00
Haleygo
14e242d0b9
vmselect: fix result collect count (#4599) 2023-07-08 08:21:27 +02:00
Roman Khavronenko
173ccf4333
vmselect: introduce search.skipSlowReplicas cmd-line flag (#4538)
* vmselect: introduce `search.skipSlowReplicas` cmd-line flag

vmselect has two logical conditions during request processing when
`-replicationFactor` cmd-line flag is set:
1. If at least `len(storageNodes) - replicationFactor` responded, it could skip
waiting for the rest of nodes to respond. This could lead to problems described
here https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1207.
2. Mark response as partial if less than `len(storageNodes) - replicationFactor` responded
without an error.

The P1 showed itself error-prone and became the main reason why
`-replicationFactor` wasn't recommended to use at vmselect level.
However, this optimization could be still very useful in situations
when there are slow and fast replicas in cluster.

But P2 remains viable and important conditionless.
Hiding P1 behind the feature-flag `search.skipSlowReplicas`
should make `-replicationFactor` flag usable again. And let users
choose whether they want P1 to be respected.

Related issues
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1207
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* docs: update changelog

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-07-07 11:50:26 +02:00
Roman Khavronenko
109e55f865
vmalert: allow disabling of step param attached to instant queries (#4574)
vmalert: allow disabling of `step` param attached to instant queries

This might be useful for using vmalert with datasources that to not support this param,
unlike VictoriaMetrics.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4573

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-07-06 23:13:56 -07:00
Aliaksandr Valialkin
0107d78639
docs/vmgateway.md: update -help output 2023-07-06 23:07:47 -07:00
Aliaksandr Valialkin
ee4280d132
docs/vmbackupmanager.md: update -help output 2023-07-06 22:57:31 -07:00