Commit graph

8074 commits

Author SHA1 Message Date
Aliaksandr Valialkin
e7dfcdfff6
lib/storage: consistently use atomic.* type for refCount and mustDrop fields in indexDB, table and partition structs
See ea9e2b19a5
2024-02-24 00:26:26 +02:00
Aliaksandr Valialkin
e2b0cc873b
lib/storage: convert dedupsDuringMerge from uint64 to atomic.Uint64
This should simplify code maintenance by gradually converting to atomic.* types instead of calling atomic.* functions
on int and bool types.

See ea9e2b19a5
2024-02-24 00:25:44 +02:00
Aliaksandr Valialkin
1eb3346ecc
lib/{storage,mergeset}: properly fix 'unaligned 64-bit atomic operation' panic on 32-bit architectures
The issue has been introduced in bace9a2501
The improper fix was in the d4c0615dcd ,
since it fixed the issue just by an accident, because Go comiler aligned the rawRowsShards field
by 4-byte boundary inside partition struct.

The proper fix is to use atomic.Int64 field - this guarantees that the access to this field
won't result in unaligned 64-bit atomic operation. See https://github.com/golang/go/issues/50860
and https://github.com/golang/go/issues/19057
2024-02-24 00:25:08 +02:00
Aliaksandr Valialkin
dc5b1e4dc1
lib/httpserver: return back the default value for -http.connTimeout to 2 minutes
It has been appeared that there are VictoriaMetrics users, who rely on the fact that
VictoriaMetrics components were closing incoming connections to -httpListenAddr every 2 minutes
by default. So let's return back this value by default in order to fix the breaking change
made at d8c1db7953 .

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1304#issuecomment-1961891450 .
2024-02-24 00:20:11 +02:00
Aliaksandr Valialkin
2f0b2b942b
docs/Troubleshooting.md: add Too much disk space used chapter` 2024-02-24 00:19:51 +02:00
hagen1778
0ac2df6e66
docs: mention missing change 521f9ffb430edf6aea7720a966b5b079cf15e34e
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-02-24 00:19:13 +02:00
hagen1778
ab4fae9dc2
lib/storage: cleanup after d4c0615dcd
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit c8d1d2ab72)
2024-02-23 18:55:40 +01:00
Dmytro Kozlov
eb22083924
lib/storage: fix aligning (#5860)
(cherry picked from commit d4c0615dcd)
2024-02-23 18:55:39 +01:00
Roman Khavronenko
ce4fcd07ce
deployment: add topology visualizations (#5858)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 840ab60111)
2024-02-23 18:55:39 +01:00
Aliaksandr Valialkin
fe545dfc85
app/vmstorage: remove unused import after 2a5c6e1cd5 2024-02-23 04:53:42 +02:00
Aliaksandr Valialkin
2a5c6e1cd5
app/vmstorage: deprecate -snapshotCreateTimeout command-line flag
Creating snapshot shouldn't time out under normal conditions.
The timeout was related to the bug, which has been fixed in 6460475e3b .

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3551
2024-02-23 04:51:57 +02:00
Aliaksandr Valialkin
42437e05c7
lib/storage: do not drop (date, metricID) entries for the date older than 2 days if samples are ingested at this date
Previously the (date, metricID) entries for dates older than the last 2 days were removed.
This could lead to slow check for the (date, metricID) entry in the indexdb during ingesting historical data (aka backfilling).

The issue has been introduced in 431aa16c8d
2024-02-23 04:06:54 +02:00
Aliaksandr Valialkin
83217b7473
app/vmselect: add -search.maxLabelsAPIDuration and -search.maxLabelsAPISeries options for fine-tuning CPU and RAM usage for /api/v1/series , /api/v1/labels and /api/v1/label/.../values
This commit returns back limits for these endpoints, which have been removed at 5d66ee88bd ,
since it has been appeared that missing limits result in high CPU usage, while the introduced concurrency limiter
results in failed lightweight requests to these endpoints because of timeout when heavyweight requests are executed.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5055
2024-02-23 02:56:58 +02:00
Aliaksandr Valialkin
0e08dee8a1
app/{vmselect,vlselect}/vmui: run make vmui-update vmui-logs-update after recent changes to app/vmui 2024-02-23 01:40:57 +02:00
Yury Molodov
140aaafad0
vmui: add a time picker to the "Logs Explorer" page (#5808)
* vmui: add a time picker to the "Logs Explorer" page #5673

* Update app/vmui/packages/vmui/src/pages/ExploreLogs/hooks/useFetchLogs.ts

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-02-23 01:38:54 +02:00
Yury Molodov
b2b9f6e900
vmui: fix display Popper.tsx (#5842)
* vmui: fix display Popper.tsx

* vmui/docs: fix display Popper.tsx

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-02-23 01:33:12 +02:00
Aliaksandr Valialkin
50ea081d75
docs/Single-server-VictoriaMetrics.md: sync with docs/README.md after 79b57f625c 2024-02-23 01:28:42 +02:00
Aliaksandr Valialkin
fbae1533fc
docs/CHANGELOG.md: document d68bb658ce
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5833
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5834
2024-02-23 01:26:02 +02:00
Aliaksandr Valialkin
21170e558c
lib/promutils: hide the math.Round() logic inside ParseTimeMsec() function
This should prevent from bugs similar to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5801 in the future

This is a follow-up for ce3ec3ff2e
2024-02-23 01:21:42 +02:00
Aliaksandr Valialkin
dfcbcf4368
lib/mergeset: run go fmt after bace9a2501 2024-02-23 01:21:31 +02:00
Aliaksandr Valialkin
19032f9913
lib/{mergeset,storage}: convert bufferred items to searchable parts more optimally
Do not convert shard items to part when a shard becomes full. Instead, collect multiple
full shards and then convert them to a searchable part at once. This reduces
the number of searchable parts, which, in turn, should increase query performance,
since queries need to scan smaller number of parts.
2024-02-23 01:21:03 +02:00
hagen1778
225fd781a5
docs: re-classify change of default http timeout to bugfix
The change introduced in d8c1db7953 (diff-2bfab3db5cc1baf4c6d3ff6b19901926e3bdf4411ec685dac973e5fcff1c723b)
was backported to v1.97.2. Therefore, it is a `bugfix` and should be explicitly
mentioned in the changelog of v1.97.2

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-02-23 01:17:40 +02:00
Nikolay
22762d7a69
app/vmselect: change export/csv timestamp format for rfc3339 to respect milliseconds (#5853)
* app/vmselect: adds milliseconds to the csv export response for rfc3339
* milliseconds is a standard prescion for VictoriaMetrics query request responses
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5837

* app/victoria-metrics: adds tests for csv export/import
follow-up after 3541a8d0cf96dd4f8563624c4aab6816615d0756

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2024-02-23 01:16:08 +02:00
Aliaksandr Valialkin
08c5250a7b
lib/storage: handle common case when the number of rows passed to flushRowsToInmemoryParts() doesnt exceed maxRawRowsPerShard 2024-02-23 01:12:18 +02:00
Aliaksandr Valialkin
8669584e9f
lib/{storage,mergeset}: convert beffered items into searchable in-memory parts exactly once per the given flush interval
Previously the interval between item addition and its conversion to searchable in-memory part
could vary significantly because of too coarse per-second precision. Switch from fasttime.UnixTimestamp()
to time.Now().UnixMilli() for millisecond precision. It is OK to use time.Now() for tracking
the time when buffered items must be converted to searchable in-memory parts, since time.Now()
calls aren't located in hot paths.

Increase the flush interval for converting buffered samples to searchable in-memory parts
from one second to two seconds. This should reduce the number of blocks, which are needed
to be processed during high-frequency alerting queries. This, in turn, should reduce CPU usage.

While at it, hardcode the maximum size of rawRows shard to 8Mb, since this size gives the optimal
data ingestion pefromance according to load tests. This reduces memory usage and CPU usage on systems
with big amounts of RAM under high data ingestion rate.
2024-02-23 01:11:57 +02:00
Aliaksandr Valialkin
5f1fa8e7f7
lib/storage: avoid superflouos copy of block header data 2024-02-23 01:11:31 +02:00
Fred Navruzov
c9d8627676
- v1.11 doc updates (#5852)
- fix dead links
2024-02-23 01:11:05 +02:00
Dan Dascalescu
6b5e8e7089
docs: CSV RFC3339 format uses server timezone (#5839) 2024-02-23 01:07:55 +02:00
Aliaksandr Valialkin
a982ab6bfb
app/vmstorage: expose vm_snapshots metric, which shows the current number of snapshots
While at it, refresh docs about snapshots - https://docs.victoriametrics.com/#how-to-work-with-snapshots
2024-02-23 01:07:04 +02:00
Aliaksandr Valialkin
3f9022bc08
lib/storage: do not pool rawRowsBlock when flushing rawRows to in-memory blocks
The pooled rawRowsBlock objects occupies big amounts of memory between flushes,
and the flushes are relatively rare. So it is better to don't use the pool
and to allocate rawRow blocks on demand. This should reduce the average
memory usage between flushes.
2024-02-23 01:06:28 +02:00
Aliaksandr Valialkin
bf07e2ac87
lib/storage: do not keep rawRows buffer across flush() calls
The buffer can be quite big under high ingestion rate (e.g. more than 100MB).
This leads to increased memory usage between buffer flushes.
So it is better to re-create the buffer on every flush in order to reduce memory usage
between buffer flushes.
2024-02-23 01:06:09 +02:00
Aliaksandr Valialkin
843f3ec94e
docs/MetricsQL.md: improve text formatting for better readability 2024-02-23 01:05:49 +02:00
Aliaksandr Valialkin
477fdc21aa
app/vmselect/promql: add count_values_over_time() MetricsQL function
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5847
2024-02-23 01:05:31 +02:00
Aliaksandr Valialkin
65fb54ab8f
app/vmselect/promql: move needSilenceIntervalForRollupFunc from eval.go to rollup.go
This should improve maintainability of the code related to rollup functions,
since it is located in rollup.go

While at it, properly return empty results from holt_winters(), rate_over_sum(),
sum2_over_time(), geomean_over_time() and distinct_over_time() when there are no real samples
on the selected lookbehind window. Previously the previous sample value was mistakenly
returned from these functions.
2024-02-23 01:05:11 +02:00
Alexander Marshalov
8322425364
[lib/httputils] fixed floating-point error when parsing time in RFC3339 format (#5814)
* [lib/promutils, lib/httputils] fixed floating-point error when parsing time in RFC3339 format (#5801)

* fixed tests

* fixed test

* Revert "fixed test"

This reverts commit 8a29764806.

* Revert "fixed tests"

This reverts commit 9ce13d1042.

* Revert "[lib/promutils, lib/httputils] fixed floating-point error when parsing time in RFC3339 format (#5801)"

This reverts commit a7a04bd4

* [lib/httputils] fixed floating-point error when parsing time in RFC3339 format (#5801)

---------

Co-authored-by: Nikolay <nik@victoriametrics.com>
2024-02-23 00:58:26 +02:00
Artem Navoiev
b46014c8ab
docs: change header from h1 to h2 1.97.2. The markdown requires the proper structure in hierachy of title so h1 can not be a child of h1,h2... only be a separate item, in our structure title is the parent h1
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-02-23 00:42:15 +02:00
Github Actions
e2bbab072a
Automatic update operator docs from VictoriaMetrics/operator@d88157b (#5845) 2024-02-23 00:41:10 +02:00
Github Actions
f08f33cd5a
Automatic update operator docs from VictoriaMetrics/operator@c393852 (#5844) 2024-02-23 00:40:38 +02:00
Github Actions
786679135b
Automatic update operator docs from VictoriaMetrics/operator@4791fd1 (#5843) 2024-02-23 00:40:04 +02:00
hagen1778
10fbda60c8
deployment/docker: add comments to components in docker-compose manifests
This should help readers to understand interconnectivity between components.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-02-23 00:39:05 +02:00
Aliaksandr Valialkin
af366a1f97
README.md: sync with docs/Cluster-VictoriaMetrics.md after e0569a355b 2024-02-23 00:22:54 +02:00
Anton L
8b7ff0f66e
#5833 Fix Deadlock when using shardByURL of VMAgent (#5834) 2024-02-22 11:54:53 +02:00
Dan Dascalescu
0c7eda7c88
app/vmselect: simplify wording for too many samples error (#5827)
(cherry picked from commit 17cf031fa1)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-02-20 16:29:11 +01:00
Roman Khavronenko
2e172b9361
vmctl : Provide TLS config options for Open TSDB datasource #5797 (#5832)
Originally implemented here https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5797

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: khushijain21 <khushij393@gmail.com>
(cherry picked from commit bb1279bfc4)
2024-02-20 16:27:52 +01:00
Daria Karavaieva
550b589790
Vmanomaly Quickstart Fix absolute links (#5831)
* links fix

* typo fix

(cherry picked from commit 4034d081f4)
2024-02-20 16:27:52 +01:00
Daria Karavaieva
a68518c0e0
Vmanomaly QuickStart (#5800)
* first edit

* typo 1

* typo 2

* fixes 3

* fixes 4

* fixes 5

* fixes, cross links

* v1.10 config

* models why, self-monitoring fix

* config next steps

* fixes

* minor fix

(cherry picked from commit b2baf7d472)
2024-02-20 16:27:52 +01:00
hagen1778
c065287757
docs: move recent changes to Tip
These changes were mistakenly put to existing release

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit dc25c30fdc)
2024-02-20 16:27:51 +01:00
igorbernstein
b485e40823
deployment/docker: clean up loading of victoriametrics-datasource (#5793)
Currently the docker-compose examples for loading `victoriametrics-datasource` uses 2 environment variables:
-  `GF_ALLOW_LOADING_UNSIGNED_PLUGINS`
- `GF_DEFAULT_APP_MODE`

I believe both of the env vars are trying to achieve the same thing. `GF_DEFAULT_APP_MODE` disables code signing for all plugins and `GF_ALLOW_LOADING_UNSIGNED_PLUGINS` intends to disable code signing for just `victoriametrics-datasource`.
Keeping the scope narrowed to just `victoriametrics-datasource` would be preferable in this case.

Unfortunately `GF_ALLOW_LOADING_UNSIGNED_PLUGINS` is misspelled. According to [grafana docs](https://grafana.com/docs/grafana/latest/setup-grafana/configure-grafana/#override-configuration-with-environment-variables), the format is supposed to be `GF_<SectionName>_<KeyName>`. In other words the current env var is missing the section name.

This PR proposes to:
1. fix the typo
2. remove the global disablement of code signing

Alternatively, if you prefer to keep codesigning disabled globally, please remove `GF_ALLOW_LOADING_UNSIGNED_PLUGINS` env var as it confuses things

(cherry picked from commit cc5a274e4d)
2024-02-20 16:27:51 +01:00
hagen1778
4474c23aed
app/vmalert: consistently sort groups by name and filename on /groups page
This should prevent non-deterministic sorting for groups with identical names.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit e2dad3a2ac)
2024-02-20 13:51:31 +01:00
hagen1778
6c63fd831d
app/vmalert: follow-up after b60dcbe11f
* support case-insensitive search
* reflect search condition in URL, so link can be sharable
* support filtering on /alerts page
* fix collapseAll/expandAll logic to respect only shown entries
* add changelog

b60dcbe11f
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 11b03d9fc8)
2024-02-20 13:35:02 +01:00