Commit graph

7723 commits

Author SHA1 Message Date
Aliaksandr Valialkin
9d886a2eb0
lib/storage: follow-up for 4b8088e377
- Clarify the bugfix description at docs/CHANGELOG.md
- Simplify the code by accessing prefetchedMetricIDs struct under the lock
  instead of using lockless access to immutable struct.
  This shouldn't worsen code scalability too much on busy systems with many CPU cores,
  since the code executed under the lock is quite small and fast.
  This allows removing cloning of prefetchedMetricIDs struct every time
  new metric names are pre-fetched. This should reduce load on Go GC,
  since the cloning of uin64set.Set struct allocates many new objects.
2024-01-16 15:29:57 +02:00
Aliaksandr Valialkin
b49b8fed3c
app/vmselect/promql: simplify the code after 388d020b7c
Add a test, which verifies the correct sorting of float64 slices with NaNs.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5506
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5509
2024-01-16 15:06:33 +02:00
Hui Wang
3ac44baebe
exit vmagent if there is config syntax error in scrape_config_files when -promscrape.config.strictParse=true (#5560) 2024-01-16 17:30:02 +08:00
hagen1778
d0e4190969
deployment/alerts: add job label to DiskRunsOutOfSpace alerting rule
So it is easier to understand to which installation the triggered instance belongs.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-16 09:49:39 +01:00
Aliaksandr Valialkin
388d020b7c
app/vmselect/promql: follow-up for ce4f26db02
- Document the bugfix at docs/CHANGELOG.md
- Filter out NaN values before sorting as suggested at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5509#discussion_r1447369218
- Revert unrelated changes in lib/filestream and lib/fs
- Use simpler test at app/vmselect/promql/exec_test.go

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5509
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5506
2024-01-16 02:55:08 +02:00
Zongyang
ce4f26db02
FIX bottomk doesn't return any data when there are no time range overlap between timeseries (#5509)
* FIX sort order in bottomk

* Add lessWithNaNsReversed for bottomk

* Add ut for TopK

* Move lt from loop

* FIX lint

* FIX lint

* FIX lint

* Mod log format

---------

Co-authored-by: xiaozongyang <xiaozngyang@kanyun.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-01-16 02:19:36 +02:00
Aliaksandr Valialkin
190a6565ae
app/vmselect/promql: consistently sort results of a or b query
Previously the order of results returned from `a or b` query could change with each request
because the sorting for such query has been disabled in order to satisfy
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4763 .

This commit executes `a or b` query as `sortByMetricName(a) or sortByMetricName(b)`.
This makes the order of returned time series consistent across requests,
while maintaining the requirement from https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4763 ,
e.g. `b` results are consistently put after `a` results.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5393
2024-01-16 01:30:10 +02:00
Aliaksandr Valialkin
7fc2bd0412
app/vmstorage: expose proper types for storage metrics when -metrics.exposeMetadata command-line flag is set
This is a follow-up for 326a77c697
2024-01-16 00:20:37 +02:00
Aliaksandr Valialkin
bfa73ebdf3
lib/prompbmarshal: move WriteRequest proto definition to the correct place 2024-01-16 00:20:37 +02:00
Artem Navoiev
51cdf3676b
docs: vmanomaly fix image size
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-15 23:10:52 +01:00
Artem Navoiev
74219a1727
vmanomly: guide add diagramm
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-15 23:07:49 +01:00
Artem Navoiev
041a1966c5
docs: vmanomaly fix formatting and remove unnecessary elements
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-15 22:45:02 +01:00
Artem Navoiev
191e322879
vmanomaly - models a bit pretify docs (#5618)
* vmanomaly - models a bit pretify docs

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

* typi

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

* fix formatting

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

---------

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-15 13:26:38 -08:00
Artem Navoiev
544da241e8
vmanomaly guides - fix formating, add missing piece, clarify statement, use common languagefor VM ecosystem (#5620)
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-15 13:26:26 -08:00
Artem Navoiev
0c78b891b0
clarify cluster multitenacy in vmanomaly docs (#5619)
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-15 12:36:28 -08:00
Artem Navoiev
f51b7fda8e
docs: fix markdown in how-to-monitor-k8s
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-15 18:16:26 +01:00
Aliaksandr Valialkin
4b42c8abbb
lib/promscrape/discovery/hetzner: fix golangci-lint warnings after 03a97dc678
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5550
2024-01-15 17:12:40 +02:00
Roman Khavronenko
e14e3d9c8c
docs: mention requirements for latest backups (#5614)
Add docs to explain that `latest` backup folder can be accessed
more frequently than others. Hence, it is recommended to keep it
on storages with low access price.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-15 15:30:55 +01:00
Aliaksandr Valialkin
5106045048
app/vmstorage: deregister storage metrics before stopping the storage
This prevents from possible nil pointer dereference issues when the storage metrics
are read after the storage is stopped.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5548
2024-01-15 16:11:45 +02:00
Aliaksandr Valialkin
be509b3995
lib/pushmetrics: wait until the background goroutines, which push metrics, are stopped at pushmetrics.Stop()
Previously the was a race condition when the background goroutine still could try collecting metrics
from already stopped resources after returning from pushmetrics.Stop().
Now the pushmetrics.Stop() waits until the background goroutine is stopped before returning.

This is a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5549
and the commit fe2d9f6646 .

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5548
2024-01-15 13:50:36 +02:00
Aliaksandr Valialkin
9fd20202e1
vendor/github.com/VictoriaMetrics/easyproto: update from v0.1.3 to v0.1.4
This fixes vet error for 32-bit architectures:

https://github.com/VictoriaMetrics/VictoriaMetrics/actions/runs/7521709384/job/20472882877
2024-01-15 11:31:30 +02:00
Aleksandr Stepanov
03a97dc678
vmagent: added hetzner sd config (#5550)
* added hetzner robot and hetzner cloud sd configs

* remove gettoken fun and update docs

* Updated CHANGELOG and vmagent docs

* Updated CHANGELOG and vmagent docs

---------

Co-authored-by: Nikolay <nik@victoriametrics.com>
2024-01-15 10:13:22 +01:00
Roman Khavronenko
4b8088e377
lib/storage: properly check for storage/prefetchedMetricIDs cache expiration deadline (#5607)
Before, this cache was limited only by size.
Cache invalidation by time happens with jitter to prevent thundering herd problem.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-15 10:03:06 +01:00
rbizos
70cd09e736
Handling negative index in Graphite groupByNode/aliasByNode (#5581)
Handeling the error case with -1

Signed-off-by: Raphael Bizos <r.bizos@criteo.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
2024-01-15 09:57:15 +01:00
Aliaksandr Valialkin
d2c94a0663
lib/prompbmarshal: switch to github.com/VictoriaMetrics/easyproto 2024-01-14 23:04:45 +02:00
Aliaksandr Valialkin
a47127c1a6
app/vmalert/remotewrite: properly calculate vmalert_remotewrite_dropped_rows_total
It was calculating the number of dropped time series instead of the number of dropped samples.

While at it, drop vmalert_remotewrite_dropped_bytes_total metric, since it was inconsistently calculated -
at one place it was calculating raw protobuf-encoded sample sizes, while at another place it was calculating
the size of snappy-compressed prompbmarshal.WriteRequest protobuf message.
Additionally, this metric has zero practical sense, so just drop it in order to reduce the level of confusion.
2024-01-14 22:55:11 +02:00
Aliaksandr Valialkin
c005245741
lib/prompb: switch to github.com/VictoriaMetrics/easyproto 2024-01-14 22:46:06 +02:00
Aliaksandr Valialkin
f2229c2e42
lib/prompb: change type of Label.Name and Label.Value from []byte to string
This makes it more consistent with lib/prompbmarshal.Label
2024-01-14 22:33:21 +02:00
Aliaksandr Valialkin
f405384c8c
lib/protoparser/datadogv2: simplify code for parsing protobuf messages after 0597718435
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5094
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4451
2024-01-14 21:43:01 +02:00
Aliaksandr Valialkin
dd25049858
lib/protoparser/opentelemetry: use github.com/VictoriaMetrics/easyproto for protobuf message unmarshaling and marshaling
This reduces VictoriaMetrics binary size by 100KB.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2570
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2424
2024-01-14 21:19:03 +02:00
Aliaksandr Valialkin
0597718435
lib/protoparser/datadogv2: add support for reading protobuf-encoded requests at /api/v2/series endpoint
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4451
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5094
2024-01-14 21:09:05 +02:00
Dmytro Kozlov
828aca82e9
app/vmctl: add insecure skip verify flags for source and destination addresses for native protocol (#5606)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5595
2024-01-11 14:04:32 +01:00
Zakhar Bessarab
e5a767cff8
docs: explicitly mention "delete_series" endpoint accepts any HTTP method (#5605)
See: #5552

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2024-01-11 15:39:11 +04:00
hagen1778
eae585e8de
docs: make docs-sync
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-11 11:54:40 +01:00
Artem Navoiev
d374595e31
docs: mention staleNaN handling during deduplication
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5587

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-11 11:53:58 +01:00
hagen1778
91ccea236f
app/all: follow-up after 84d710beab
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5548
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-09 13:34:54 +01:00
zhdd99
fe2d9f6646
lib/pushmetrics: fix a panic caused by pushing metrics during the graceful shutdown process of vmstorage nodes. (#5549)
Co-authored-by: zhangdongdong <zhangdongdong@kuaishou.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-09 13:24:34 +01:00
Dan Dascalescu
b79d4cc988
docs: fix English in keyConcepts.md, add instant query use case (#5547) 2024-01-09 11:29:21 +01:00
Fred Navruzov
e34f77aed4
docs: vmanomaly chapter hotfixes (#5583)
* update link to book a demo for AD & RCA

* fix invalid refs in components/model

* - fix staging -> prod links
- replace capitalized FAQ headers
- change the section order on main page
- replace :latest tag with current stable
2024-01-09 11:04:24 +01:00
Fred Navruzov
bbea02f82b
- fix 404 in /guides page (#5582)
- change back AD section title
2024-01-08 11:27:16 -08:00
Artem Navoiev
fdefc8a816
docs: properly close diff
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-08 20:20:09 +01:00
Artem Navoiev
095d982976
docs: vmagent specify default value for the compression level and its maximum. The question appers too often in our channel (#5575)
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-08 20:14:23 +01:00
Dmytro Kozlov
105c6b2eb7
app/vmui: fix broken link for the statistic inaccuracy explanation (#5568) 2024-01-08 20:13:45 +01:00
Dan Dascalescu
33df9bee22
Link to start/end timestamp formats in url-examples (#5531) 2024-01-08 18:23:38 +04:00
hagen1778
463455665b
dashboards: update cluster dashboard
* add panels for detailed visualization of traffic usage between vmstorage, vminsert, vmselect
components and their clients. New panels are available in the rows dedicated to specific components.

* update "Slow Queries" panel to show percentage of the slow queries to the total number of read queries
served by vmselect. The percentage value should make it more clear for users whether there is a service degradation.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-08 11:58:31 +01:00
Denys Holius
eb08f5c7e5
docs/anomaly-detection/guides/README.md: fixed markdown link to the vmanomaly guide (#5578) 2024-01-08 02:41:52 -08:00
Artem Navoiev
1aa39efec1
docs: vmanomaly add list of guides to guides page
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-08 10:44:48 +01:00
Artem Navoiev
07e5d6f0fb
docs: change sorting of anomaly deteciton
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-08 10:39:38 +01:00
Fred Navruzov
f75874f5df
docs: vmanomaly part 1 (#5558)
* add `AD` section, fix links, release docs and changelog

* - connect sections, refactor structure

* - resolve suggestions
- add FAQ section
- fix dead links

* - fix incorrect render of tables for Writer
- comment out internal readers/writers
- fix page ordering to some extent

* - link licensing requirements from v1.5.0 to main page

---------

Co-authored-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-08 01:31:36 -08:00
Denys Holius
aecfabe318
CHANGELOG.md: fixed wrong links to vmalert-tool documentation page (#5570) 2024-01-05 07:16:46 -08:00