Commit graph

9075 commits

Author SHA1 Message Date
Aliaksandr Valialkin
c6b2cac892
docs/VictoriaLogs/CHANGELOG.md: typo fix after 255d1d4e13: returns -> return 2024-09-26 09:00:55 +02:00
Aliaksandr Valialkin
255d1d4e13
app/vlselect/logsql: clone the query with the current timestamp when performing live tailing requests in the loop
Previously the original timestamp was used in the copied query, so _time:duration filters
were applied to the original time range: (timestamp-duration ... timestamp]. This resulted
in stopped live tailing, since new logs have timestamps bigger than the original time range.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7028
2024-09-26 08:57:23 +02:00
Github Actions
c89a7a0b62
Automatic update operator docs from VictoriaMetrics/operator@5271a59 (#7099)
Automated changes by
[create-pull-request](https://github.com/peter-evans/create-pull-request)
GitHub action

Signed-off-by: Github Actions <133988544+victoriametrics-bot@users.noreply.github.com>
Co-authored-by: f41gh7 <18450869+f41gh7@users.noreply.github.com>
2024-09-25 23:23:08 +02:00
Aliaksandr Valialkin
c66da8b0ba
docs/LTS-releases.md: consistently use v prefix in front of VictoriaMetrics releases 2024-09-25 19:29:30 +02:00
Aliaksandr Valialkin
e9950f6307
lib/logstorage: add blocks_count pipe
This pipe is useful for debugging purposes when the number of processed blocks must be calculated for the given query:

    <query> | blocks_count

This helps detecting the root cause of query performance slowdown in cases like https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7070
2024-09-25 19:17:48 +02:00
Aliaksandr Valialkin
65b93b17b1
lib/logstorage: lazily read column headers metadata during queries
This improves performance for analytical queries, which do not need column headers metadata.
For example, the following query doesn't need column headers metadata, since _stream and min(_time)
are stored in block header, which is read separately from colum headers metadata:

  _time:1w | stats by (_stream) min(_time) min_time

This commit significantly improves the performance for this query.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7070
2024-09-25 19:17:48 +02:00
Aliaksandr Valialkin
4599429f51
lib/logstorage: read timestamps column when it is really needed during query execution
Previously timestamps column was read unconditionally on every query.
This could significantly slow down queries, which do not need reading this column
like in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7070 .
2024-09-25 19:17:47 +02:00
Andrii Chubatiuk
f934f71708
docs/victorialogs/data-ingestion: removed FluentBit Elasticsearch from examples (#7093)
removed FluentBit Elasticsearch example from docs as custom headers are
not supported by elasticsearch output till
https://github.com/fluent/fluent-bit/pull/9416 is merged and released

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6985

### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
2024-09-25 18:43:26 +02:00
Andrii Chubatiuk
e75ae1b274
deployment: restructure victorialogs examples (#6971)
### Describe Your Changes

- Use common compose.yaml file for all victorialogs setups to set
version in a single place and override it on demand for each agent and
protocol
- Replaced multiple victorialogs instances in HA setup with single setup
with `deploy.replica` parameter set
- Added fluentd setup

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
2024-09-25 18:33:26 +02:00
Github Actions
612be0954c
Automatic update operator docs from VictoriaMetrics/operator@1feab7d (#7092)
Automated changes by
[create-pull-request](https://github.com/peter-evans/create-pull-request)
GitHub action

Signed-off-by: Github Actions <133988544+victoriametrics-bot@users.noreply.github.com>
Co-authored-by: f41gh7 <18450869+f41gh7@users.noreply.github.com>
2024-09-25 15:02:24 +02:00
Roman Khavronenko
6b1b47df54
app/vmalert: bump default values for sending data to remoteWrite.url (#7084)
* `remoteWrite.maxQueueSize` from `100_000` to `1_000_000`, this should
improve resiliency of recording rules that produce many series;
* `remoteWrite.maxBatchSize` from `1_000` to `10_000`, this should be
more efficient to send from netwroking perspective;
* `remoteWrite.concurrency` from `1` to `4`, this should imrpove speed
of sending the generated series.

The new settings should improve remote write performance of vmalert with
default settings.

### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Hui Wang <haley@victoriametrics.com>
2024-09-25 15:01:39 +02:00
Aliaksandr Valialkin
7f1ba18719
lib/logstorage: improve the performance of obtaining _stream column value
Substitute global streamTagsCache with per-blockSearch cache for ((stream.id) -> (_stream value)) entries.
This improves scalability of obtaining _stream values on a machine with many CPU cores, since every CPU
has its own blockSearch instance.

This also should reduce memory usage when querying logs over big number of streams, since per-blockSearch
cache of ((stream.id) -> (_stream value)) entries is limited in size, and its lifetime is bounded by a single query.
2024-09-24 20:57:00 +02:00
Aliaksandr Valialkin
cf2e7d0d92
lib/logstorage/consts.go: document that it isn't recommended setting maxColumnsPerBlock constant to too big values
This should help avoiding cases like this one - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6425#issuecomment-2337446083
2024-09-24 18:51:46 +02:00
Aliaksandr Valialkin
f86e093b20
lib/logstorage: improve performance for streamID.marshalString() by more than 2x
The streamID.marshalString() is executed in hot path if the query selects _stream_id field.

Command to run the benchmark:

go test ./lib/logstorage/ -run=NONE -bench=BenchmarkStreamIDMarshalString -benchtime=5s

Results before the commit:

BenchmarkStreamIDMarshalString-16    	438480714	        14.04 ns/op	  71.23 MB/s	       0 B/op	       0 allocs/op

Results after the commit:

BenchmarkStreamIDMarshalString-16    	982459660	         6.049 ns/op	 165.30 MB/s	       0 B/op	       0 allocs/op
2024-09-24 18:35:04 +02:00
Aliaksandr Valialkin
919d2dc90e
lib/logstorage: add benchmark for streamID.marshalString 2024-09-24 18:31:38 +02:00
Roman Khavronenko
9a0f697622
docs: update CONTRIBUTING.md with practical requirements (#7087)
The change supposed to have more practical recommendations and reflect
the real processes for maintaining the project.


Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-09-24 18:22:18 +02:00
Github Actions
e28265fa39
Automatic update operator docs from VictoriaMetrics/operator@27ad7e1 (#7086)
Automated changes by
[create-pull-request](https://github.com/peter-evans/create-pull-request)
GitHub action

Signed-off-by: Github Actions <133988544+victoriametrics-bot@users.noreply.github.com>
Co-authored-by: f41gh7 <18450869+f41gh7@users.noreply.github.com>
2024-09-24 15:47:30 +02:00
hagen1778
8bb3f2fd43
lib/promscrape: make linter happy
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-09-24 15:12:55 +02:00
hagen1778
c7569dac50
lib/promscrape: temporary disable TestClientProxyReadOk
This test is very flaky and prevents other tests from running in CI.
Disabling this test should improve tests quality, since it isn't reliable anyway.

There is a ticket to fix this test - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7062

Once fixed, this test should be uncommented.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-09-24 14:59:25 +02:00
Zhu Jiekun
5319acb8ed
vmagent: remote write respect Retry-After in header (#6124)
### Describe Your Changes
related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6097

#### Changed
- Remote write retry policy in `vmagent` is changed into:
  1. Respect `Retry-After` duration if exists.
2. Otherwise, calculate next retry duration by backoff policy (x2) and
max retry duration limit.
 
#### Docs
- `CHANGELOG.md`.

---
### Checklist
The following checks are mandatory:

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Co-authored-by: Zakhar Bessarab <me@zekker-dev.tk>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2024-09-24 12:44:03 +02:00
Dmytro Kozlov
cbeb7d50e8
lib/promscrape: show only unhealthy targets if show_only_unhealthy filter is enabled (#6960)
### Describe Your Changes

It is better to show only unhealthy targets instead of all of them when
`show_only_unhealthy` filter is enabled.
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3536

### Checklist

The following checks are **mandatory**:

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2024-09-24 12:18:24 +02:00
Phuong Le
df665a13c9
docs: update logos files and usage rules (#6980)
### Describe Your Changes

New logos and usage guideline

### Checklist

The following checks are **mandatory**:

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
2024-09-24 11:53:58 +02:00
Zhu Jiekun
fea4433362
docs: [VictoriaLogs] OTel Collector elasticsearchexporter header note (#7074)
### Describe Your Changes

By default, the `elasticsearchexporter` in OTel Collector puts the log
message under a field other than `_msg` (e.g., `Body`). Without
specifying via an HTTP header, those logs may not be queried correctly.
See also:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6785.

This pull request updates the example configuration and notes for the
`elasticsearchexporter`.

### Checklist

The following checks are **mandatory**:

- [X] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
2024-09-24 11:52:09 +02:00
Dmytro Kozlov
91b28d0527
deployment/docker: update grafana datasources to the latest version (#7083)
### Describe Your Changes

Updated grafana plugins to the latest releases 

### Checklist

The following checks are **mandatory**:

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
2024-09-24 11:51:18 +02:00
Github Actions
524579d9bd
Automatic update operator docs from VictoriaMetrics/operator@75bc1b4 (#7080)
Automated changes by
[create-pull-request](https://github.com/peter-evans/create-pull-request)
GitHub action

Signed-off-by: Github Actions <133988544+victoriametrics-bot@users.noreply.github.com>
Co-authored-by: AndrewChubatiuk <3162380+AndrewChubatiuk@users.noreply.github.com>
2024-09-24 11:50:50 +02:00
hagen1778
a5c002edef
deployment/alerts: fix copy&paste typo in TooHighGoroutineSchedulingLatency
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-09-24 11:48:19 +02:00
Roman Khavronenko
4d0b41e63b
deployment: add panel and alerts for displying go scheduler latency (#7078)
The panel and alerting rule should help to understand whether VM
component doesn't have enough CPU resources or gets throttled. The alert
is applicable for all VM components.
The panel was added to vmalert, vmagent, vmsingle, vm clusert and
victorialogs dashes.

-------------------

This alerting rule should have help us identify resource shortage for
sandbox vmagent - see [this
link](https://play.victoriametrics.com/select/accounting/1/6a716b0f-38bc-4856-90ce-448fd713e3fe/prometheus/graph/#/?g0.range_input=23d13h25m25s424ms&g0.end_input=2024-09-23T14%3A11%3A00&g0.relative_time=none&g0.tab=0&g0.expr=histogram_quantile%280.99%2C+sum%28rate%28go_sched_latencies_seconds_bucket%7Bjob%3D%22vmagent-monitoring-vmagent%22%7D%5B5m%5D%29%29+by+%28le%2C+job%2C+instance%29%29+%3E+0.1)
for example. We weren't aware of resource shortage, because VM metrics
assumed this vmagent had 1vCPU while in fact its limit was 0.2vCPU.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-09-23 16:54:42 +02:00
Aliaksandr Valialkin
109772bdc4
lib/cgroup: round GOMAXPROCS to the lower integer value of cpuQuota
Rounding GOMAXPROCS to the upper interger value of cpuQuota increases chances of CPU starvation,
non-optimimal goroutine scheduling and additional CPU overhead related to context switching.

So it is better to round GOMAXPROCS to the lower integer value of cpuQuota.
2024-09-23 16:09:12 +02:00
Aliaksandr Valialkin
3964889705
app/vmselect/promql: consistently replace NaN data points with non-NaN values for range_first and range_last functions
It is expected that range_first and range_last functions return non-nan const value across all the points
if the original series contains at least a single non-NaN value. Previously this rule was violated for NaN data points
in the original series. This could confuse users.

While at it, add tests for series with NaN values across all the range_* and running_* functions, in order to maintain
consistent handling of NaN values across these functions.
2024-09-23 14:59:29 +02:00
hagen1778
3ed172eeeb
docs: add note about testing new releases on testing env
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-09-23 11:58:43 +02:00
Zakhar Bessarab
78ecfede06
license: add ability to reload keys (#775)
* lib/license: add support of license key hot-reload

* docs: add info about license key hot reload

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2024-09-23 09:04:17 +02:00
Aliaksandr Valialkin
57183c9b61
docs/changelog/CHANGELOG.md: moved the description of the fix for proper usage of -streamAggr.dedupInterval and -remoteWrite.streamAggr.dedupInterval from FEATURE to BUGFIX section
The previous behaviour was incorrect, since it is unexpected that the -streamAggr.dedupInterval
and -remoteWrite.streamAggr.dedupInterval is applied to processed samples only if -streamAggr.config isn't set.

This is a follow-up for d523015f27
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6711
2024-09-23 08:56:33 +02:00
Github Actions
4dc85613a2
Automatic update helm docs from VictoriaMetrics/helm-charts@428cb36 (#7072)
Automated changes by
[create-pull-request](https://github.com/peter-evans/create-pull-request)
GitHub action

Signed-off-by: Github Actions <133988544+victoriametrics-bot@users.noreply.github.com>
Co-authored-by: AndrewChubatiuk <3162380+AndrewChubatiuk@users.noreply.github.com>
2024-09-23 13:17:34 +08:00
Aliaksandr Valialkin
0ada781cf2
docs/changelog/CHANGELOG.md: document bugfix for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7009
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7064
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7009

This is a follow-up for 55febc0920
2024-09-22 21:57:48 +02:00
Github Actions
8ed9978591
Automatic update operator docs from VictoriaMetrics/operator@d661554 (#7069)
Automated changes by
[create-pull-request](https://github.com/peter-evans/create-pull-request)
GitHub action

Signed-off-by: Github Actions <133988544+victoriametrics-bot@users.noreply.github.com>
Co-authored-by: AndrewChubatiuk <3162380+AndrewChubatiuk@users.noreply.github.com>
2024-09-22 18:39:52 +02:00
Artem Fetishev
55febc0920
lib/storage: restore ability to put empty metric ID list into tagFiltersToMetricIDsCache (#7064)
### Describe Your Changes

Currently it the metricID list is empty it won't be mashalled and as the
result won't be put into the tagFiltersToMetricIDsCache which causes the
cache misses for the corresponding tagFilters. In some setups this
causes severe search speed detradation (see #7009).

The empty metric IDs was covered before but then was accidentally
removed in 6c21439.

This PR restores the coverage of this case.

A new unit test can be used as a proof that empty metricID lists are not
added to the cache (just remove the fix in index_db.go and run the test
to see the result)

Also a benchmark has been added to see the implications of the
compression.

```
user@laptop:~/p/github.com/rtm0/VictoriaMetrics/01/src$ go test ./lib/storage/ -run=NONE -bench BenchmarkMarshalUnmarshalMetricIDs --loggerLevel=ERROR
goos: linux
goarch: amd64
pkg: github.com/VictoriaMetrics/VictoriaMetrics/lib/storage
cpu: 13th Gen Intel(R) Core(TM) i7-1355U
BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-0-12             3237240               363.5 ns/op               0 compression-rate
BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-1-12             2831049               451.8 ns/op               0.4706 compression-rate
BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-10-12            1152764              1009 ns/op                 1.667 compression-rate
BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-100-12            297055              3998 ns/op                 5.755 compression-rate
BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-1000-12            31172             34566 ns/op                 8.484 compression-rate
BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-10000-12            4900            289659 ns/op                 9.416 compression-rate
BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-100000-12            447           2341173 ns/op                 9.456 compression-rate
BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-1000000-12            42          24926928 ns/op                 9.468 compression-rate
BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-10000000-12            5         204098872 ns/op                 9.467 compression-rate
PASS
ok      github.com/VictoriaMetrics/VictoriaMetrics/lib/storage  15.018s
```

### Checklist

The following checks are **mandatory**:

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: Artem Fetishev <wwctrsrx@gmail.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-09-20 17:21:53 +02:00
Artem Navoiev
a2ecba4154
docs: cloud siplify menu items names for notification, user managemenr and notifs
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-09-20 16:16:25 +02:00
Github Actions
8645438a79
Automatic update helm docs from VictoriaMetrics/helm-charts@f0e007f (#7059)
Automated changes by
[create-pull-request](https://github.com/peter-evans/create-pull-request)
GitHub action

Signed-off-by: Github Actions <133988544+victoriametrics-bot@users.noreply.github.com>
Co-authored-by: AndrewChubatiuk <3162380+AndrewChubatiuk@users.noreply.github.com>
2024-09-20 16:11:23 +02:00
Artem Navoiev
9e9583357d
docs add root menu to _index.md as homepage
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-09-20 16:06:15 +02:00
Github Actions
75a29655f5
Automatic update operator docs from VictoriaMetrics/operator@fdd3f9f (#7065)
Automated changes by
[create-pull-request](https://github.com/peter-evans/create-pull-request)
GitHub action

Signed-off-by: Github Actions <133988544+victoriametrics-bot@users.noreply.github.com>
Co-authored-by: tenmozes <1381404+tenmozes@users.noreply.github.com>
2024-09-20 16:03:52 +02:00
Andrii Chubatiuk
5708a85499
docs: updated root menu items (#7061)
### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
Co-authored-by: Artem Navoiev <tenmozes@gmail.com>
2024-09-20 06:14:29 -07:00
Aliaksandr Valialkin
787b9cd9a0
lib/storage: improve performance for indexSearch.containsTimeRange()
The indexSearch.containsTimeRange() function is called for the current indexDB and the previous indexDB
every time when searching for metricIDs by label filters. This function consumes a lot of additional CPU time
for cases when queries with lightweight label filters are sent to VictoriaMetrics at high rate (e.g. thousands of RPS),
like in the issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7009 .

Optimize indexSearch.containsTimeRange() function in the following ways:

- Unconditionally return true if this function is called for the current indexDB, since there are very high
  chances that the current indexDB contains the data with timestamps in the requested time range.

- Cache the minimum timestamp, which is missing in the indexed data for the previous indexDB.
  This is safe to do, since the previous indexDB is readonly.
  This optimization eliminates potentially slow lookup in the previous indexDB for typical
  use cases when the requested time range is close to the current time.
2024-09-20 13:07:20 +02:00
Aliaksandr Valialkin
6f61e9d49d
lib/storage: simplify indexDB.doExtDB() usage by removing the returned value
Previously indexDB.doExtDB() was returning boolean value, which was indicating whether f callback was called.
There is no need in returning this boolean value, since the f callback can determine on itself whether it was called.

This simplifies the code a bit.

While at it, document indexDB.doExtDB().
2024-09-20 11:59:57 +02:00
Roman Khavronenko
218c533874
lib/storage: follow-up after d8f8822fa5 (#7036)
Make function name and comments more clear.

d8f8822fa5

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
2024-09-20 11:50:47 +02:00
Artem Navoiev
7596e239eb
docs: set operator menu item weight to 30
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-09-20 11:22:03 +02:00
Artem Navoiev
36d58588a7
docs: clarify operator name.2
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-09-20 11:20:16 +02:00
Artem Navoiev
cd384c7547
docs: clarify operator name
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-09-20 11:11:38 +02:00
Hui Wang
d6d02d7aeb
vmalert: fix variable $activeAt value when templating rule annotation in replay mode
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2024-09-20 11:07:40 +02:00
hagen1778
6167bccc5a
docs: fix more typos in the changelog
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-09-20 10:54:43 +02:00
hagen1778
59281d5358
docs: rm update node about loggerMaxArgLen as it doesn't have incompatibility effect
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-09-20 10:42:07 +02:00