Commit graph

10325 commits

Author SHA1 Message Date
Zakhar Bessarab
eb672560cf
docs/CHANGELOG.md: cut v1.114.0
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
(cherry picked from commit e950846534)
2025-03-21 16:43:22 +04:00
Zakhar Bessarab
690959c8e3
app/{vmselect,vlselect}: run make vmui-update vmui-logs-update
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-03-21 16:37:57 +04:00
Zhu Jiekun
c0932566ab
doc: [guide] update k8s vmsingle vmcluster helm and operator guide
This commit updates some wording in the following k8s guides:
- [Kubernetes monitoring via VictoriaMetrics
Single](https://docs.victoriametrics.com/guides/k8s-monitoring-via-vm-single)
- [Kubernetes monitoring with VictoriaMetrics
Cluster](https://docs.victoriametrics.com/guides/k8s-monitoring-via-vm-cluster)
- [Getting started with VM
Operator](https://docs.victoriametrics.com/guides/getting-started-with-vm-operator)
2025-03-21 16:29:15 +04:00
Fred Navruzov
6280fe6cd1
docs/vmanomaly: release v1.21.0 + HC/HA docs ()
### Describe Your Changes

This PR updates the docs for release v1.21.0. In particular, it adjusts the
docs and their structure to cover High Availability (HA) and horizontal
scalability (HS) capabilities.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
2025-03-21 16:29:15 +04:00
Max Kotliar
0778c90901
lib/promscrape: improve streamParse performance
Previously, the performance of stream.Parse could be limited by mutex.Lock on the callback function, because it used a shared writeContext. With complicated relabeling rules or any slowness in the pushData function, this could significantly decrease the processing performance of parsed rows.

This commit removes the locks and makes the processing of parsed rows lock-free, in the same manner as `stream.Parse` processing is implemented for push ingestion.

Implementation details:
- Removing global lock around stream.Parse callback.
- Using atomic operations for counters
- Creating write contexts per callback instead of sharing
- Improving series limit checking with sync.Once
- Optimizing labels hash calculation with buffer pooling
- Adding comprehensive tests for concurrency correctness
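
The first three items above can be illustrated with a minimal sketch. It is not the actual promscrape code: `writeContext` and `processBlocks` are hypothetical names, and the sketch only assumes that every callback invocation may own its scratch state while counters are shared through atomics.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// writeContext is a hypothetical per-callback scratch buffer;
// it is not the real promscrape type.
type writeContext struct {
	rows []string
}

// processBlocks handles every block of parsed rows in its own goroutine.
// Each goroutine owns a private writeContext, so no mutex is needed around it.
// The only shared state is the atomic counter.
func processBlocks(blocks [][]string) int64 {
	var rowsTotal atomic.Int64
	var wg sync.WaitGroup
	for _, block := range blocks {
		wg.Add(1)
		go func(rows []string) {
			defer wg.Done()
			wctx := &writeContext{} // private to this callback invocation
			wctx.rows = append(wctx.rows, rows...)
			rowsTotal.Add(int64(len(wctx.rows)))
		}(block)
	}
	wg.Wait()
	return rowsTotal.Load()
}

func main() {
	blocks := [][]string{{"a 1", "b 2"}, {"c 3"}, {"d 4", "e 5", "f 6"}}
	fmt.Println(processBlocks(blocks))
}
```

The trade-off in this sketch is a small allocation per callback in exchange for removing lock contention between callbacks.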

Benchmark performance:
```
# before
BenchmarkScrapeWorkScrapeInternalStreamBigData-10             13          81973945 ns/op          37.68 MB/s    18947868 B/op        197 allocs/op

# after
goos: darwin
goarch: arm64
pkg: github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape
cpu: Apple M1 Pro
BenchmarkScrapeWorkScrapeInternalStreamBigData-10             74          15761331 ns/op         195.98 MB/s    15487399 B/op        148 allocs/op
PASS
ok      github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape       1.806s
```

Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8159
---------
Signed-off-by: Maksim Kotlyar <kotlyar.maksim@gmail.com>
Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
2025-03-20 16:56:05 +01:00
Zakhar Bessarab
8cdcaa798a
app/vmbackupmanager: properly set vm_backup_last_run_failed metric
Previously, `getBackupsList` was appending `latest` backup in all cases without checking if it actually exists.
This led to the `vm_backup_last_run_failed` metric being set to `1`, since the folder did not contain a successful completion marker.

This commit adds a check to handle a case when remote storage does not contain any backups yet.

Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8490

---
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-03-20 15:50:19 +01:00
Roman Khavronenko
e3aeefdf01
docs: fix a few typos
2025-03-20 15:38:59 +01:00
Zakhar Bessarab
2ee91f6c5a
lib/backup/s3remote: add retries for "IncompleteBody" errors
These errors could be caused by intermittent network issues, especially
when proxies are used for accessing S3 storage. Previously, such an
error would abort the backup/restore process and require manual intervention
to ensure backup consistency.

This commit adds automatic retries to handle this to improve backups
reliability and resilience to network issues.
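
A retry loop of this kind could look roughly like the sketch below. It is an illustration only: `withRetries` and `isRetryable` are hypothetical helpers, matching on the error text is an assumption, and a real implementation would likely add a backoff between attempts.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errIncomplete imitates a transient failure for the demo below.
var errIncomplete = errors.New("IncompleteBody: connection reset")

// isRetryable reports whether err looks like a transient "IncompleteBody" failure.
// Matching on the error text is an assumption made for this sketch.
func isRetryable(err error) bool {
	return err != nil && strings.Contains(err.Error(), "IncompleteBody")
}

// withRetries calls op up to attempts times. It stops at the first success
// or at the first error that is not considered retryable.
func withRetries(attempts int, op func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		err = op()
		if err == nil || !isRetryable(err) {
			return err
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	err := withRetries(3, func() error {
		calls++
		if calls < 3 {
			return errIncomplete
		}
		return nil
	})
	fmt.Println(calls, err)
}
```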
2025-03-20 15:36:50 +01:00
f41gh7
f3921192dd
app/vmgateway: remove vmalert dependency
1. remove "vmalert" word from vmgateway doc and exposed metrics;
2. remove unrelated flags like -datasource.roundDigits, -remoteRead.disablePathAppend, -datasource.disableStepParam
2025-03-20 12:39:50 +01:00
Zakhar Bessarab
2201522ff9
app/vmbackupmanager: properly close run channel when stopping
vmbackupmanager uses `runC` channel for inter-goroutine communication between `scheduler` and `execute` goroutines.

Previously, `runC` wasn't closed during graceful shutdown, so the vmbackupmanager process couldn't stop gracefully. It could only be killed within the configured timeout.

This commit properly closes `runC` by `scheduler` when stopping vmbackupmanager in order to avoid shutdown delay.
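
The pattern described above, where the sending goroutine closes the channel so the receiver can finish, can be sketched as follows. The function names mirror the goroutine names from the message, but the bodies and signatures are assumptions, not the vmbackupmanager code.

```go
package main

import (
	"fmt"
	"sync"
)

// scheduler sends run requests until all jobs are sent or stop is closed.
// It closes runC on exit; closing on the sender side lets the consumer's
// range loop terminate instead of blocking forever.
func scheduler(runC chan<- int, stop <-chan struct{}, jobs int) {
	defer close(runC)
	for i := 0; i < jobs; i++ {
		select {
		case runC <- i:
		case <-stop:
			return
		}
	}
}

// execute drains runC and returns once the channel is closed.
func execute(runC <-chan int) int {
	n := 0
	for range runC {
		n++
	}
	return n
}

// runAll wires both goroutines together and reports how many runs were executed.
func runAll(jobs int) int {
	runC := make(chan int)
	stop := make(chan struct{})
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		scheduler(runC, stop, jobs)
	}()
	n := execute(runC)
	wg.Wait()
	return n
}

func main() {
	fmt.Println(runAll(3))
}
```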

Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8554
2025-03-20 12:39:49 +01:00
f41gh7
c00e596ba9
app/vmbackupmanager: prevent backups being scheduled one after another
Previously, a backup was first scheduled at 00:00:00 and `getSleepDuration` was immediately executed to get the sleep duration for the next backup. Since it returned `1 * time.Second`, the next backup was attempted and failed to be scheduled.

Update logic to wait for full backup interval in this case so that there will be no attempt to schedule an unneeded backup.

It also adds the following changes:
* fix error log entry reference to type of policy
* add a message about retention completion similar to existing message for backups to make it more consistent

Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8499

This is a follow-up for fb6d2e92e3a1cf412d1f7dee64a4852941a8aa1b
2025-03-20 12:39:49 +01:00
Andrii Chubatiuk
ba8708af34
lib/streamaggr: fix threshold update, when deduplication and windows are enabled ()
### Describe Your Changes

During the initial flush with deduplication and windows enabled, the lower
timestamp threshold was set to the upper bound of the next deduplication
interval, which led to ignoring all samples on subsequent intervals.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 511517f491)
2025-03-20 09:56:12 +01:00
Yury Molodov
a99f3b6996
vmui: show hidden common labels in group name ()
### Describe Your Changes

Changes:
1. When `hide common labels` is enabled, they will now be displayed in
the group name.
2. Legend settings toggles have been moved below the graph for better
accessibility.

![image](https://github.com/user-attachments/assets/fc8c7f4c-c155-4056-8862-301ad375d7ae)

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
(cherry picked from commit db66ab1852)
2025-03-20 09:56:11 +01:00
Yury Molodov
4673104aac
vmui/logs: implement nanosecond log sorting ()
### Describe Your Changes

This PR adds nanosecond precision to log sorting, ensuring accurate
ordering of entries with sub-millisecond differences.

Related issue:  

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
(cherry picked from commit 31f662a0f7)
2025-03-20 09:56:11 +01:00
hagen1778
4985d8b58f
docs: fix typos in release versions
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 7f69553230)
2025-03-20 09:56:11 +01:00
hagen1778
8c6631b18a
docs: mark LTS releases explicitly in changelog
The mark is important for readers to understand the type of release.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 90d547dec0)
2025-03-20 09:13:09 +01:00
Hui Wang
fc107d0a4a
vmgateway: fix vmgateway_ratelimit_refresh_duration_seconds
* vmgateway: fix `vmgateway_ratelimit_refresh_duration_seconds`

* revert reload ep change

---------

Co-authored-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-03-20 09:42:35 +04:00
Hui Wang
dc28491771
app/vmalert: properly register group and rules metrics
Commit 9ca74d1fff introduced an issue with metrics registration. Because the metrics.Summary type was always registered in the global state of the metrics package, vmalert had increased memory and CPU usage after multiple configuration reloads.

This commit addresses this issue and properly registers the metrics.Summary metric. Now metrics for groups and rules must be explicitly registered with the group.Init method before group.Start. This simplifies metrics usage and ensures that all the needed metrics are registered and the group is ready to start.

Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8532
2025-03-19 14:04:49 +01:00
Aliaksandr Valialkin
1aa0f9a28a
app/vlselect: move the code responsible for limiting the number of concurrently executed requests, into separate functions
This improves code readability a bit.
2025-03-19 14:04:49 +01:00
Aliaksandr Valialkin
1f17c7f397
lib/chunkedbuffer: add Buffer.Len() method, which returns the byte length of the data stored in the buffer
2025-03-19 14:04:48 +01:00
Aliaksandr Valialkin
04b23fba33
lib/logstorage: typo fix in the comment to Storage.GetStreamFieldValues() function
2025-03-19 14:04:48 +01:00
Hui Wang
bcf02fb5f8
app/vmalert: fix possible data race on group checksum
1. fix possible data race on group checksum when reload is called
concurrently. Before, it didn't affect much but might update the group
one more time.
2. remove the unnecessary g.mu.RLock() and compute group.id at newGroup creation. Changes to group.ID()
indicate that type and interval have changed, and the group is new.

Related PR:
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8540
2025-03-19 14:04:48 +01:00
Aliaksandr Valialkin
661302325e
docs/VictoriaLogs/querying/README.md: fix the docs for /select/logsql/stream_field_values when the limit arg is set
It returns up to `limit` values for the given log stream field with the biggest number of hits.
2025-03-19 14:04:48 +01:00
Aliaksandr Valialkin
a93bb3c22d
lib/logstorage: support for {field in (*)} and {field not_in (*)} syntax in LogsQL
This is needed for https://github.com/VictoriaMetrics/victorialogs-datasource/issues/238
to be consistent with `in(*)` feature, which has been added in the commit 84d5771b41
2025-03-19 14:04:48 +01:00
Hui Wang
4f3a6b85b9
app/vmalert: fix memory leak with -notifier.blackhole
The previous commit 9ca74d1fff added a regression for the notifier metrics exposed by vmalert. vmalert returned new notifier instances for the blackhole notifier type and registered new metrics each time the function for getting notifiers was called. This registered duplicate metrics and led to an OOM crash.

This commit properly initializes blackhole notifier instances and adds metrics for them only once, during application start.
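
One common way to guarantee one-time initialization in Go is `sync.Once`. The sketch below assumes that approach for illustration; the commit message does not say which mechanism vmalert uses, and `notifier`, `getNotifiers` and the `registrations` counter are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// notifier is a hypothetical stand-in for a notifier instance.
type notifier struct {
	name string
}

var (
	blackholeOnce sync.Once
	blackhole     []*notifier
	registrations int // counts how many times metrics would have been registered
)

// getNotifiers returns the same blackhole notifier on every call.
// The instance and its metrics are created only on the first call.
func getNotifiers() []*notifier {
	blackholeOnce.Do(func() {
		registrations++
		blackhole = []*notifier{{name: "blackhole"}}
	})
	return blackhole
}

func main() {
	for i := 0; i < 3; i++ {
		getNotifiers()
	}
	fmt.Println(registrations)
}
```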

 Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8532
2025-03-19 14:04:47 +01:00
Nikolay
16972a078f
lib/promscrape: properly send staleness markers
Previously, vmagent could incorrectly store a partial scrape response
in case of a scraping error. This could happen if the `sw.ReadData` call fetched
part of a chunked response and stored it in the buffer, and a context deadline
exceeded error happened later.
As a result, at the next scrape iteration this partial response could
be forwarded to the `sw.sendStaleSeries(lastScrape...)` function call
and lead to a `Prometheus line` parsing error.

This commit properly sets the response body to the empty value in case of
a scraping error. This prevents storing a partial scrape response body,
so partial staleness markers are no longer sent to the remote storage.
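
The idea of dropping a partial body on error can be sketched as below. `scrapeState` and `finishScrape` are hypothetical names used only for this illustration; the real promscrape code is structured differently.

```go
package main

import (
	"errors"
	"fmt"
)

// errDeadline imitates a scrape that was interrupted by a timeout.
var errDeadline = errors.New("context deadline exceeded")

// scrapeState keeps the body of the previous scrape, which is later used
// for generating staleness markers. The type is hypothetical.
type scrapeState struct {
	lastBody []byte
}

// finishScrape stores body only when the scrape succeeded.
// On error the partial body is discarded, so the next iteration never sees it.
func (s *scrapeState) finishScrape(body []byte, err error) {
	if err != nil {
		body = body[:0]
	}
	s.lastBody = append(s.lastBody[:0], body...)
}

func main() {
	var s scrapeState
	s.finishScrape([]byte("up 1\n"), nil)
	fmt.Printf("%q\n", s.lastBody)
	s.finishScrape([]byte("up"), errDeadline)
	fmt.Printf("%q\n", s.lastBody)
}
```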

Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8528
2025-03-19 14:04:47 +01:00
Aliaksandr Valialkin
c0e9b15606
lib/protoparser: rename lib/protoparser/common to lib/protoparser/protoparserutil
This improves readability of the code, which uses this package.
2025-03-18 16:40:06 +01:00
Aliaksandr Valialkin
847b554a52
app/vlinsert: do not start background flusher in the LogMessageProcessor used for synchronous processing of a single data block
This should reduce CPU usage a bit for data ingestion protocols,
which process a single message per every request without streaming.
2025-03-18 11:18:01 +01:00
Aliaksandr Valialkin
5cec930842
lib/protoparser/common: limit the maximum memory, which could be occupied by snappy-compressed message at ReadUncompressedData
2025-03-18 11:18:00 +01:00
Roman Khavronenko
2f30213352
dashboards: add Memory allocations rate to ResourceUsage tab ()
This panel should have helped us to identify
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8501 during
release checks. While we already track the GC pressure, its value is
relative, and the change wasn't noticeable for the workloads that we
observed.

The absolute values of allocations rate could have helped to see the
anomaly.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 3511e2e6af)
2025-03-17 16:44:16 +01:00
Alexander Frolov
51e293d351
lib/promrelabel: comment typo ()
### Describe Your Changes

`prasedRelabelConfig` -> `parsedRelabelConfig`

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

(cherry picked from commit 127d4f37b8)
2025-03-17 16:44:16 +01:00
Yury Molodov
921c6ed582
vmui/logs: fix endless group expansion loop bug ()
### Describe Your Changes

Fix endless group expansion loop.
Related issue: 

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

(cherry picked from commit 44a54e4590)
2025-03-17 16:44:16 +01:00
hagen1778
4476588419
docs: add update note for renamed metric vm_mmapped_files
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit d4560ee015)
2025-03-17 16:38:15 +01:00
Guillem Jover
1d8b7faf71
spelling and grammar fixes via codespell ()
### Describe Your Changes

Fix many spelling errors and some grammar, including misspellings in
filenames.

The change also fixes a typo in metric `vm_mmaped_files` to `vm_mmapped_files`.
While this is a breaking change, this metric isn't used in alerts or dashboards.
So it seems to have low impact on users.

The change also deprecates `cspell` as it is much heavier and less usable.
---------

Co-authored-by: Andrii Chubatiuk <achubatiuk@victoriametrics.com>
Co-authored-by: Andrii Chubatiuk <andrew.chubatiuk@gmail.com>

(cherry picked from commit 76d205feae)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-03-17 16:38:11 +01:00
Jose Gómez-Sellés
9016e8a8d5
docs/cloud: add FAQ for VM Cloud ()
### Describe Your Changes

This PR adds the dedicated FAQ for VictoriaMetrics Cloud

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

(cherry picked from commit d852e5e0b4)
2025-03-17 16:35:47 +01:00
Aliaksandr Valialkin
d7918d4caa
lib/logstorage: switch the type of LogRows.streamTagCanonicals from [][]byte to []string
This reduces the size of LogRows.streamTagCanonicals by 1/3 because of the eliminated `cap` field
in the slice header (reflect.SliceHeader) compared to the string header (reflect.StringHeader).
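
The size difference can be checked directly with `unsafe.Sizeof`. This small program is an illustration added for clarity, not part of the commit; `headerSizes` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"unsafe"
)

// headerSizes returns the in-memory size of a []byte header and a string header.
// A slice header holds a data pointer, a length and a capacity (three words),
// while a string header holds only a data pointer and a length (two words).
func headerSizes() (sliceSize, stringSize uintptr) {
	var b []byte
	var s string
	return unsafe.Sizeof(b), unsafe.Sizeof(s)
}

func main() {
	sliceSize, stringSize := headerSizes()
	fmt.Println(sliceSize, stringSize) // 24 and 16 on 64-bit platforms
}
```

The saving applies to the headers stored in the outer slice; the bytes they point to are unaffected.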
2025-03-17 15:04:27 +01:00
Aliaksandr Valialkin
0217198d5c
lib/prompb: use clear() function instead of loops for clearing WriteRequest fields inside WriteRequest.Reset
This makes the code shorter without losing clarity.
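
For reference, the built-in `clear()` (available since Go 1.21) zeroes all the elements of a slice in place. The sketch below uses a simplified, hypothetical `writeRequest` type rather than the real `WriteRequest`.

```go
package main

import "fmt"

// writeRequest is a simplified, hypothetical stand-in for a request type
// whose fields are reused between calls.
type writeRequest struct {
	samples []float64
}

// reset zeroes the stored elements and truncates the slice while keeping
// its capacity, so the backing array can be reused.
func (wr *writeRequest) reset() {
	clear(wr.samples) // replaces: for i := range wr.samples { wr.samples[i] = 0 }
	wr.samples = wr.samples[:0]
}

func main() {
	wr := &writeRequest{samples: []float64{1, 2, 3}}
	wr.reset()
	fmt.Println(len(wr.samples), cap(wr.samples), wr.samples[:3])
}
```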
2025-03-17 14:32:02 +01:00
Fred Navruzov
f967825fff
docs/vmanomaly: update to patch release v1.20.1 ()
### Describe Your Changes

Doc updates to a patch release v1.20.1, fixing a bug in
`PeriodicScheduler` that may affect some of the customers' deployments

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
2025-03-17 14:32:02 +01:00
Aliaksandr Valialkin
64ac868ba2
deployment: update VictoriaLogs Docker image tag from v1.16.0-victorialogs to v1.17.0-victorialogs
See https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.17.0-victorialogs
2025-03-16 01:39:22 +01:00
Aliaksandr Valialkin
548f621be2
docs/VictoriaLogs/CHANGELOG.md: cut v1.17.0-victorialogs
2025-03-16 01:16:06 +01:00
Aliaksandr Valialkin
d0cbf0ab9c
app/vlinsert/opentelemetry: follow-up for a884949aba
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8502
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8511
2025-03-16 01:09:38 +01:00
Devops
4fd2cb9102
fix:Fixed an issue where and were incorrectly displayed ()
### Describe Your Changes

Fixed an issue where and were incorrectly displayed when sent from
OpenTelemetry Collector to Victoria Logs

Fixes 
2025-03-16 01:09:38 +01:00
Aliaksandr Valialkin
dc21cd2784
docs/VictoriaLogs/querying/README.md: mention that /select/logsql/query endpoint may return an arbitrary number of logs matching the given query filter, and this is OK
This is needed for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8507
and https://github.com/VictoriaMetrics/victorialogs-datasource/issues/261
2025-03-16 00:05:05 +01:00
Aliaksandr Valialkin
f9effee6d7
app/vlinsert: send 204 No Content response code at /insert/loki/api/v1/push endpoint
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8505
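
In Go's `net/http`, such a response is sent with `http.StatusNoContent`. The handler below is a minimal sketch under that assumption; `pushHandler` and `statusFor` are hypothetical and do not reproduce the vlinsert code.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// pushHandler imitates a push endpoint that accepts data and replies
// with an empty 204 No Content response.
func pushHandler(w http.ResponseWriter, r *http.Request) {
	// The request body would be parsed and stored here.
	w.WriteHeader(http.StatusNoContent)
}

// statusFor runs the handler against an in-memory request
// and returns the response status code.
func statusFor(path string) int {
	req := httptest.NewRequest(http.MethodPost, path, nil)
	rec := httptest.NewRecorder()
	pushHandler(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("/insert/loki/api/v1/push"))
}
```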
2025-03-15 23:35:52 +01:00
Aliaksandr Valialkin
6f9d70ae89
lib/{mergeset,storage,logstorage}: use chunked buffer instead of bytesutil.ByteBuffer as a storage for in-memory parts
This commit adds lib/chunkedbuffer.Buffer - an in-memory chunked buffer
optimized for random access via MustReadAt() function.
It is better than bytesutil.ByteBuffer for storing large volumes of data,
since it stores the data in chunks of a fixed size (4KiB at the moment)
instead of using a contiguous memory region. This has the following benefits over bytesutil.ByteBuffer:

- reduced memory fragmentation
- reduced memory re-allocations when new data is written to the buffer
- reduced memory usage, since the allocated chunks can be re-used
  by other Buffer instances after Buffer.Reset() call

Performance tests show up to 2x memory reduction for VictoriaLogs
when ingesting logs with a big number of fields (aka wide events) at high speed.
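
The chunking idea can be sketched as follows. This is a simplified illustration, not the real lib/chunkedbuffer.Buffer: the `chunkedBuffer` type and its methods are assumptions, and chunk re-use after a reset is left out for brevity.

```go
package main

import "fmt"

const chunkSize = 4 * 1024 // 4KiB, matching the chunk size mentioned above

// chunkedBuffer stores data in fixed-size chunks instead of one contiguous region.
// Every chunk except the last one holds exactly chunkSize bytes.
type chunkedBuffer struct {
	chunks [][]byte
}

// Write appends p, allocating a new chunk whenever the last one is full.
// Existing chunks are never re-allocated or copied.
func (cb *chunkedBuffer) Write(p []byte) {
	for len(p) > 0 {
		last := len(cb.chunks) - 1
		if last < 0 || len(cb.chunks[last]) == chunkSize {
			cb.chunks = append(cb.chunks, make([]byte, 0, chunkSize))
			last++
		}
		n := chunkSize - len(cb.chunks[last])
		if n > len(p) {
			n = len(p)
		}
		cb.chunks[last] = append(cb.chunks[last], p[:n]...)
		p = p[n:]
	}
}

// Len returns the number of bytes stored in the buffer.
func (cb *chunkedBuffer) Len() int {
	if len(cb.chunks) == 0 {
		return 0
	}
	return (len(cb.chunks)-1)*chunkSize + len(cb.chunks[len(cb.chunks)-1])
}

// ReadAt fills p with the bytes starting at offset off.
// It panics if the requested range is outside the stored data.
func (cb *chunkedBuffer) ReadAt(p []byte, off int) {
	if off < 0 || off+len(p) > cb.Len() {
		panic("read out of range")
	}
	for len(p) > 0 {
		chunk := cb.chunks[off/chunkSize]
		n := copy(p, chunk[off%chunkSize:])
		p = p[n:]
		off += n
	}
}

func main() {
	cb := &chunkedBuffer{}
	cb.Write(make([]byte, 10000))
	fmt.Println(cb.Len(), len(cb.chunks))
}
```

Random access stays cheap because the chunk index and the offset inside the chunk follow from a division and a remainder.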
2025-03-15 21:20:04 +01:00
Aliaksandr Valialkin
9ef0d7002e
lib/logstorage: pre-allocate buffers for fields and rows inside block.appendRowsTo()
This reduces the number of memory re-allocations inside the loop, which copies the rows.
2025-03-15 21:20:03 +01:00
Aliaksandr Valialkin
22eec97422
lib/logstorage: pre-allocate buffers for fields and rows inside rows.appendRows()
This should reduce the number of memory re-allocations inside the loop, which copies the rows.
2025-03-15 21:20:03 +01:00
Aliaksandr Valialkin
0019621d38
lib/logstorage: pre-allocate the buffer needed for marshaling a block of strings inside marshalStringsBlock
This reduces the number of memory re-allocations when appending the strings to the buffer in the loop.
2025-03-15 21:20:02 +01:00
Aliaksandr Valialkin
2f3e55f41f
lib/logstorage: optimize copying dict values inside valuesDict.copyFrom a bit
Pre-allocate the needed slice of strings and then assign items to it by index
instead of appending them. This reduces the number of memory allocations
and improves performance a bit.
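
The difference between appending and assigning by index can be sketched as below. `copyValues` is a hypothetical helper, and cloning each string with `strings.Clone` is an assumption about what the copy needs to do.

```go
package main

import (
	"fmt"
	"strings"
)

// copyValues clones src into a slice that is allocated once with the final length
// and then filled by index, instead of growing the destination via append.
func copyValues(src []string) []string {
	dst := make([]string, len(src)) // single allocation for the slice itself
	for i, v := range src {
		dst[i] = strings.Clone(v) // detach the copy from the source memory
	}
	return dst
}

func main() {
	fmt.Println(copyValues([]string{"GET", "POST", "PUT"}))
}
```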
2025-03-15 21:20:02 +01:00
Aliaksandr Valialkin
b0ac8c1f35
lib/logstorage: intern column names instead of cloning them during data ingestion
This reduces the number of memory allocations when ingesting logs with a big number of fields (aka wide events).
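
String interning keeps a single shared copy of every distinct value, which pays off when the same column names repeat across many rows. The sketch below is a minimal illustration; the `interner` type is hypothetical and much simpler than a production helper, which would also need to bound the map size.

```go
package main

import (
	"fmt"
	"sync"
)

// interner returns one shared string for every distinct value it has seen.
type interner struct {
	mu sync.Mutex
	m  map[string]string
}

// intern returns the shared copy of b, allocating a new string
// only the first time a given value is seen.
func (in *interner) intern(b []byte) string {
	in.mu.Lock()
	defer in.mu.Unlock()
	if s, ok := in.m[string(b)]; ok {
		return s
	}
	if in.m == nil {
		in.m = make(map[string]string)
	}
	s := string(b)
	in.m[s] = s
	return s
}

func main() {
	in := &interner{}
	a := in.intern([]byte("kubernetes.pod_name"))
	b := in.intern([]byte("kubernetes.pod_name"))
	fmt.Println(a == b, len(in.m))
}
```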
2025-03-15 21:20:01 +01:00