Commit graph

2456 commits

Author SHA1 Message Date
Aliaksandr Valialkin
52c46f49e1
all: update Go builder from Go1.20.2 to Go1.20.3
See https://github.com/golang/go/issues?q=milestone%3AGo1.20.3+label%3ACherryPickApproved
2023-04-05 13:38:44 -07:00
Artem Navoiev
c5397c0b77
update changelog
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2023-04-05 13:31:48 -07:00
Zakhar Bessarab
54edd6992a
app/vmalert: update Grafana URLs to match latest format (#4061)
See: #4019

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-04-05 13:31:06 -07:00
Artem Navoiev
1b05896ac5
fix closing divs in docs
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2023-04-05 13:30:14 -07:00
Aliaksandr Valialkin
ca4f94c704
docs/Troubleshooting.md: add General troubleshooting checklist
This checklist helps searching for the infromation related to some issue / question
about VictoriaMetrics
2023-04-05 13:29:12 -07:00
Aliaksandr Valialkin
286697b8b6
docs/CHANGELOG.md: document edb45d7fc1
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4013
2023-04-02 21:26:28 -07:00
Aliaksandr Valialkin
02ceebccc0
lib/promscrape: do not re-use previously loaded scrape targets on failed attempt to load updated scrape targets at file_sd_configs
The logic employed for re-using the previously loaded scrape target was broken initially.
The commit cc0427897c tried to fix it, but the new logic
became too complex and fragile. So it is better to just remove this logic,
since the targets from temporarily broken file should be eventually loaded on next
attempts every -promscrape.fileSDCheckInterval

This also allows removing fragile hacks around __vm_filepath label.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3989
2023-04-02 21:11:12 -07:00
Dmytro Kozlov
6f0512a81c
lib/promscrape: fix the problem with scrape work duplicates when file_sd_config can't be read (#4027)
* lib/promscrape: fix the problem with scrape work duplicates when file_sd_config can't be read

* lib/promscrape: clarified comment

* lib/promscrape: made better approach to handle a problem with growing []*ScrapeWork on each error when loading config

* lib/promscrape: added CHANGELOG.md

* Update docs/CHANGELOG.md

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-04-02 21:11:10 -07:00
Yury Molodov
17b4fc9470
vmui: tips for working with the graph and legend (#4045)
* feat: add tips for working with the graph and legend

* feat: add the ability to collapse the legend

* vmui/docs: add the ability to collapse the legend

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-04-01 00:41:01 -07:00
Roman Khavronenko
5f95f9d453
lib/storage: check for free disk space before opening tables (#4035)
* lib/storage: check for free disk space before opening tables

We check for free disk space before call to `openTable`,
so `Storage` can be set to ReadOnly before mergeWorkers start.

Before the change, there was a chance that merges will start
even if Storage has to start in ReadOnly mode because of
`-storage.minFreeDiskSpaceBytes` limit.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4023
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/storage: chore

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Update lib/storage/storage.go

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-03-31 23:50:56 -07:00
Aliaksandr Valialkin
db8fda4ec6
app/vmselect/graphite: open source Graphite Render API 2023-03-31 23:37:40 -07:00
Aliaksandr Valialkin
43b431d322
deployment/docker: update base Docker image from Alpine 3.17.2 to Alpine 3.17.3
This fixes security issues from https://alpinelinux.org/posts/Alpine-3.17.3-released.html

This is a follow-up for 59c350d0d2
2023-03-31 22:54:48 -07:00
Aliaksandr Valialkin
dad13c0a91
lib/streamaggr: follow-up for ff72ca14b9
- Make sure that the last successfully loaded config is used on hot-reload failure
- Properly cleanup resources occupied by already initialized aggregators
  when the current aggregator fails to be initialized
- Expose distinct vmagent_streamaggr_config_reload* metrics per each -remoteWrite.streamAggr.config
  This should simplify monitoring and debugging failed reloads
- Remove race condition at app/vminsert/common.MustStopStreamAggr when calling sa.MustStop() while sa
  could be in use at realoadSaConfig()
- Remove lib/streamaggr.aggregator.hasState global variable, since it may negatively impact scalability
  on system with big number of CPU cores at hasState.Store(true) call inside aggregator.Push().
- Remove fine-grained aggregator reload - reload all the aggregators on config change instead.
  This simplifies the code a bit. The fine-grained aggregator reload may be returned back
  if there will be demand from real users for it.
- Check -relabelConfig and -streamAggr.config files when single-node VictoriaMetrics runs with -dryRun flag
- Return back accidentally removed changelog for v1.87.4 at docs/CHANGELOG.md

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3639
2023-03-31 22:54:10 -07:00
Zakhar Bessarab
743d1e6536
docs/vmctl: add examples of URLs used for migration in different modes (#4042)
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-03-31 22:44:12 -07:00
Roman Khavronenko
0bde9722ed
vmalert: use missingkey=zero for templating (#4040)
Replace empty labels with "" instead of "<no value>"
during templating, as Prometheus does.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4012

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-31 22:43:39 -07:00
Zakhar Bessarab
46c8be4f98
lib/fs: verify response code when reading configuration over HTTP (#4036)
Verifying status code helps to avoid misleading errors caused by attempt to parse unsuccessful response.

Related issue: #4034

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-03-31 22:33:53 -07:00
Daria Karavaieva
83213e6786
Vmanomaly guide index fix (#4029)
* name and scrutture change

* fix indexing

* index fix

* name change

* line separator fix
2023-03-31 22:32:15 -07:00
Alexander Marshalov
8c14d17694
added hot reload support for stream aggregation configs (#3969) (#3970)
added hot reload support for stream aggregation configs (#3969)

Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-03-31 22:31:38 -07:00
Eliran Barnoy
7c70fb0fb9
Fix operator links to include VMPrometheusConverter for added visibility 2023-03-30 12:33:04 -07:00
Aliaksandr Valialkin
85ca077a88
lib/flagutil: ArrayString: support commas inside quoted strings and inside [], {} and () braces
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3915
2023-03-28 21:25:07 -07:00
Aliaksandr Valialkin
957d64e302
docs: mention that VictoriaMetrics rounds time range to UTC days at /api/v1/labels, /api/v1/label/.../values and /api/v1/series handlers
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3107
2023-03-28 17:01:09 -07:00
Aliaksandr Valialkin
6f5bbf096a
app/vmagent/remotewrite: cosmetic updates after f3a51e8b1d
- Compare directory names instead of paths to directory when determining which persistent queues must be deleted
  This is less error-prone solution, since paths to the same directory can differ, which could lead
  to accidental directory removal for the existing -remoteWrite.url

- Log the `removed %d dangling queues` message when at least a single queue has been removed

- Consistently use filepath.Join() for creating paths to persistent queues.
  This is needed for Windows support (see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70 )

- Clarify the description of the change at docs/CHANGELOG.md

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4014
2023-03-27 18:38:53 -07:00
Zakhar Bessarab
6ed6eb0c4c
app/vmagent: add -remoteWrite.removeDanglingQueues flag (#4017)
* app/vmagent: add `-remoteWrite.removeDanglingQueues` flag which allows to automatically remove dangling persistent queue contents

Related issue: #4014

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* app/vmagent: address review feedback

- remove persistent queues files by default
- rename `remoteWrite.removeDanglingQueues` to `remoteWrite.keepDanglingQueues`
- update docs to reflect changed behaviour

Related issue: #4014

* Apply suggestions from code review

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-03-27 18:38:51 -07:00
Aliaksandr Valialkin
54b9537a76
app/vmselect/promql: follow-up for 79e1c6a6fc
- Document the fix at docs/CHANGELOG.md
- Add tests with multiple adjancent zero buckets
- Simplify the fix a bit

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/296
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4021
2023-03-27 18:04:30 -07:00
Yury Molodov
86a98fa131
vmui: heatmap (#3780)
* fix: add stroke and font for all axes

* feat: add util for generate gradient

* feat: add heatmap plugin

* feat: add heatmap legend

* feat: add heatmap graph (#3384)

* vmui: add heatmap graph (#3384)

* feat: add convert Prometheus to VictoriaMetrics histogram

* fix: prevent re-render graph

* feat: reset step for heatmap

* feat: normalize heatmap data

* fix: format heatmap legend

* wip

* app/vmselect/vmui: run `make vmui-update`

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-03-26 00:31:21 -07:00
Aliaksandr Valialkin
229b39ac7d
docs/MetricsQL.md: quote min, max and avg args for rollup_*() functions in order to reduce the level of confusion when users try to pass the second argument to these functions 2023-03-26 00:02:01 -07:00
Aliaksandr Valialkin
84b3b0fac2
docs/CHANGELOG.md: document v1.87.4 LTS release 2023-03-25 22:44:21 -07:00
Aliaksandr Valialkin
3698994953
app/{vmbackup,vmrestore}: publish vmbackup and vmrestore binaries for Windows
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70
2023-03-25 15:09:41 -07:00
Aliaksandr Valialkin
7aff6f872f
app/vmselect/promql: follow-up for 7205c79c5a
- Allocate and initialize seriesByWorkerID slice in a single go instead
  of initializing every item in the list separately.
  This should reduce CPU usage a bit.
- Properly set anti-false sharing padding at timeseriesWithPadding structure
- Document the change at docs/CHANGELOG.md

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966
2023-03-24 23:39:43 -07:00
Zakhar Bessarab
3f38ed3171
app/vmbackup: delete created snapshot in case of error during backup (#4008)
Related issue: #2055

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-03-24 22:09:17 -07:00
Aliaksandr Valialkin
181e877092
docs/CHANGELOG.md: cosmetic fixes: remove trailing whitespace and consistently use -flag instead of --flag 2023-03-24 17:57:23 -07:00
Alexander Marshalov
0301b5018e
allowed using dashes and dots in environment variables names (#4009)
* allowed using dashes and dots in environment variables names for templating config files with envtemplate (#3999)

Signed-off-by: Alexander Marshalov <_@marshalov.org>

* Apply suggestions from code review

---------

Signed-off-by: Alexander Marshalov <_@marshalov.org>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-03-24 17:57:19 -07:00
Aliaksandr Valialkin
c54b8acba2
docs/vmauth.md: follow-up for 36edba9bfb
- Document `-configCheckInterval` command-line flag in `quick start` section
- Clarify the addition of `-configCheckInterval` at docs/CHANGELOG.md

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3990
2023-03-24 17:56:59 -07:00
Aliaksandr Valialkin
9d19c2d89b
docs/vmagent.md: clarify that there is no need to specify multiple -remoteWrite.url options when writing data to a single VictoriaMetrics cluster when data replication is needed
Also add a link to https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format from `getting started` section,
so users could quickly find how to write data to VictoriaMetrics cluster
2023-03-24 17:56:31 -07:00
Roman Khavronenko
ec6a20880c
vmalert: mention VMUI example for alert's source (#4005)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-24 17:55:30 -07:00
Roman Khavronenko
b866f334ed
docs: mention cluster URL for exporting series (#4002)
docs: mention cluster URL for exporting series

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-24 17:54:57 -07:00
Daria Karavaieva
53c3d6137f
Vmanomaly guide (#3834)
Setting up VMAnomaly on NodeExporter metrics with VictoriaMetrics and AlertManager.


* vmanomaly-guide-draft

* aletr graphs and description

* readme vmanomaly tutorial

* Added back fit_every param for performance

* vmanomaly guide fixes

* added spaces div

* spaces + resize image

* alert example grammar

* quotation marks

* docker link

* typo fixed

* more links

* reader section rephrased

* label change

* lower case for grafana service

* lower case for vm service

* yaml markdown

---------

Co-authored-by: Dima Lazerka <dima@victoriametrics.com>
2023-03-24 17:54:10 -07:00
Dmytro Kozlov
fe7acd5e8a
docs: follow up after dc2c712a29 (#4001) 2023-03-24 16:41:42 -07:00
Yury Molodov
5f77efa915
vmui: display errors for each query individually (#3987) (#3994) 2023-03-24 13:26:43 -07:00
Alexander Marshalov
b5027cff9c
added configCheckInterval flag for vmauth (#3990) (#3991)
* added configCheckInterval flag for vmauth (#3990)
Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-03-24 13:25:07 -07:00
Nikolay
9bb83cafa4
lib/netutil: log only parsing errors for proxy-protocol (#3985)
* lib/netutil: log only parsing errors for proxy-protocol

Previosly every error was logged. With configured TCP health checks at load-balancer or kubernetes, vmauth spams a lot of false positive error message into logs

* Update docs/CHANGELOG.md

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

* Update lib/netutil/tcplistener.go

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2023-03-21 10:23:08 -07:00
Aliaksandr Valialkin
8ed9295109
docs/vmagent.md: mention in docs that the target relabel debug page shows target url now 2023-03-20 22:20:13 -07:00
Dmytro Kozlov
85b01c4aa7
lib/promrelabel: make target url from labels on target relabel page (#3882)
* lib/promrelabel: make target url from labels on target relabel page

* wip

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-03-20 22:08:39 -07:00
Aliaksandr Valialkin
1580d8bda7
docs/CHANGELOG.md: cosmetic fixes 2023-03-20 14:33:46 -07:00
Aliaksandr Valialkin
531b35b6c0
docs/Troubleshooting.md: document an additional case, which could result in slow inserts
If `-cacheExpireDuration` is lower than the interval between ingested samples for the same time series,
then vm_slow_row_inserts_total` metric is increased.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3976#issuecomment-1476883183
2023-03-20 14:33:27 -07:00
Dmytro Kozlov
aed59b9029
app/vmctl: automatically check tty (#3938)
app/vmctl: automatically detect if TTY is available
2023-03-20 14:14:43 -07:00
Yury Molodov
95b60c2777
vmui: support for drag'n'drop in the "Trace analyzer" page (#3971)
vmui: add drag-and-drop support for the trace analyzer page
2023-03-20 14:09:45 -07:00
Yury Molodov
b66953d8e1
vmui: improve usability of date/time picker (#3968)
* vmui: allow manually set input date and time
* vmui/docs: improve usability of date/time picker
2023-03-20 13:57:47 -07:00
Aliaksandr Valialkin
fc3d826d7f
all: add Windows build for VictoriaMetrics
This commit changes background merge algorithm, so it becomes compatible with Windows file semantics.

The previous algorithm for background merge:

1. Merge source parts into a destination part inside tmp directory.
2. Create a file in txn directory with instructions on how to atomically
   swap source parts with the destination part.
3. Perform instructions from the file.
4. Delete the file with instructions.

This algorithm guarantees that either source parts or destination part
is visible in the partition after unclean shutdown at any step above,
since the remaining files with instructions is replayed on the next restart,
after that the remaining contents of the tmp directory is deleted.

Unfortunately this algorithm doesn't work under Windows because
it disallows removing and moving files, which are in use.

So the new algorithm for background merge has been implemented:

1. Merge source parts into a destination part inside the partition directory itself.
   E.g. now the partition directory may contain both complete and incomplete parts.
2. Atomically update the parts.json file with the new list of parts after the merge,
   e.g. remove the source parts from the list and add the destination part to the list
   before storing it to parts.json file.
3. Remove the source parts from disk when they are no longer used.

This algorithm guarantees that either source parts or destination part
is visible in the partition after unclean shutdown at any step above,
since incomplete partitions from step 1 or old source parts from step 3 are removed
on the next startup by inspecting parts.json file.

This algorithm should work under Windows, since it doesn't remove or move files in use.
This algorithm has also the following benefits:

- It should work better for NFS.
- It fits object storage semantics.

The new algorithm changes data storage format, so it is impossible to downgrade
to the previous versions of VictoriaMetrics after upgrading to this algorithm.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3236
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3821
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70
2023-03-19 23:28:26 -07:00
Aliaksandr Valialkin
d2f85816ea
lib/{mergeset,storage}: prevent from long wait time when creating a snapshot under high data ingestion rate
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3551
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3873
2023-03-19 00:19:02 -07:00