Commit graph

204 commits

Author SHA1 Message Date
Aliaksandr Valialkin
d739511f5b
lib/storage: replace OpenStorage() with MustOpenStorage()
Callers of OpenStorage() log the returned error and exit.
The error logging and exit can be performed inside MustOpenStorage()
alongside with printing the stack trace for better debuggability.
This simplifies the code at caller side.
2023-04-14 23:04:42 -07:00
Aliaksandr Valialkin
cf53ce83a0
app/vmstorage: deprecate -bigMergeConcurrency command-line flag
Improperly configured -bigMergeConcurrency command-line flag usually leads to uncontrolled
growth of unmerged parts, which, in turn, increases CPU usage and query durations.

So it is better deprecating this flag. In rare cases -smallMergeConcurrency command-line flag
can be used instead for controlling the concurrency of background merges.
2023-04-13 20:42:22 -07:00
Aliaksandr Valialkin
fc3d826d7f
all: add Windows build for VictoriaMetrics
This commit changes background merge algorithm, so it becomes compatible with Windows file semantics.

The previous algorithm for background merge:

1. Merge source parts into a destination part inside tmp directory.
2. Create a file in txn directory with instructions on how to atomically
   swap source parts with the destination part.
3. Perform instructions from the file.
4. Delete the file with instructions.

This algorithm guarantees that either source parts or destination part
is visible in the partition after unclean shutdown at any step above,
since the remaining files with instructions is replayed on the next restart,
after that the remaining contents of the tmp directory is deleted.

Unfortunately this algorithm doesn't work under Windows because
it disallows removing and moving files, which are in use.

So the new algorithm for background merge has been implemented:

1. Merge source parts into a destination part inside the partition directory itself.
   E.g. now the partition directory may contain both complete and incomplete parts.
2. Atomically update the parts.json file with the new list of parts after the merge,
   e.g. remove the source parts from the list and add the destination part to the list
   before storing it to parts.json file.
3. Remove the source parts from disk when they are no longer used.

This algorithm guarantees that either source parts or destination part
is visible in the partition after unclean shutdown at any step above,
since incomplete partitions from step 1 or old source parts from step 3 are removed
on the next startup by inspecting parts.json file.

This algorithm should work under Windows, since it doesn't remove or move files in use.
This algorithm has also the following benefits:

- It should work better for NFS.
- It fits object storage semantics.

The new algorithm changes data storage format, so it is impossible to downgrade
to the previous versions of VictoriaMetrics after upgrading to this algorithm.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3236
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3821
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70
2023-03-19 23:28:26 -07:00
Aliaksandr Valialkin
54fe207cc0
all: follow-up for 7a3e16e774
- Sync the description for -httpListenAddr.useProxyProtocol command-line flag at vmagent and vmauth,
  so it is consistent with the description at vmauth and victoria-metrics
- Add a sample of panic text to docs/CHANGELOG.md, so it could be googled
- Mention the -httpListenAddr.useProxyProtocol command-line flag in the description for the bugfix

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3335
2023-03-08 01:42:58 -08:00
Aliaksandr Valialkin
1ad0d22e80
lib/storage: follow-up for 39cdc546dd
- Use flag.Duration instead of flagutil.Duration for -snapshotCreateTimeout,
  since the flagutil.Duration is intended mostly for big durations, e.g. days, months and years,
  while the -snapshotCreateTimeout is usually smaller than one hour.
- Add links to https://docs.victoriametrics.com/#how-to-work-with-snapshots in docs/CHANGELOG.md,
  so readers could easily find the corresponding docs when reading the changelog.
- Properly remove all the created directories on unsuccessful attempt to create
  snapshot in Storage.CreateSnapshot().

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3551
2023-02-27 13:11:10 -08:00
Zakhar Bessarab
26682e369e
lib/storage: enhancements for snapshots process (#3873)
* lib/{fs,mergeset,storage}: skip `.must-remove.` dirs when creating snapshot (#3858)

* lib/{mergeset,storage}: add timeout configuration for snapshots creation, remove incomplete snapshots from storage

* docs: fix formatting

* app/vmstorage: add metrics to track status of snapshots

* app/vmstorage: use `vm_http_requests_total` metric for snapshot endpoints metrics, rename new flag to make name more clear

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* app/vmstorage: update flag name in docs

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* app/vmstorage: reflect new metrics names change in docs

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-02-27 13:11:06 -08:00
Aliaksandr Valialkin
bbd5914eb1
all: add makefile rules for GOARCH=s390x for all the VictoriaMetrics components
This is a follow-up for 007530f882
2023-02-26 12:38:48 -08:00
Aliaksandr Valialkin
0c60e4a30a
all: consistently use http.Method{Get,Post,Put} across the codebase
This is a follow-up after 9dec3c8f80
2023-02-22 19:01:09 -08:00
Aliaksandr Valialkin
086516a02b
lib/protoparser/clusternative: extract stream parsing code into a separate stream package
This is a follow-up for 057698f7fb
2023-02-13 10:38:02 -08:00
Aliaksandr Valialkin
34379d4cf1
all: run apk update && apk upgrade in base Alpine Docker image in order to get all the recent security fixes 2023-02-09 14:03:02 -08:00
Nikolay
ebebaecd94
lib/netutil: init implimentation of proxy protocol (#3687)
* lib/netutil: init implimentation of proxy protocol
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3335

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-01-26 23:25:22 -08:00
Aliaksandr Valialkin
103dfd0525
lib/{mergeset,storage}: do not slow down concurrently executed queries during assisted merges
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3647
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3641
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
2023-01-16 14:45:40 -08:00
Aliaksandr Valialkin
aa027529eb
lib/httpserver: directly pass flag value to CheckAuthFlag()
There is no sense in passing a pointer to flag value there.

This is a follow-up for 4225a0bd75
2023-01-10 15:59:55 -08:00
Zakhar Bessarab
10f314cdbd
Use httpAuth.* flags as a fallback for endpoints protected by *AuthKey flags (#3582)
* {lib/server, app/}: use `httpAuth.*` flag as fallback for `*AuthKey` if it is not set

* lib/ingestserver/opentsdbhttp: fix opentdb HTTP handler not respecting `httpAuth.*` flags

* Apply suggestions from code review

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-01-10 15:57:55 -08:00
Aliaksandr Valialkin
b275983403
lib/writeconcurrencylimiter: improve the logic behind -maxConcurrentInserts limit
Previously the -maxConcurrentInserts was limiting the number of established client connections,
which write data to VictoriaMetrics. Some of these connections could be idle.
Such connections do not consume big amounts of CPU and RAM, so there is a little sense in limiting
the number of such connections. So now the -maxConcurrentInserts command-line option
limits the number of concurrently executed insert requests, not including idle connections.

It is recommended removing -maxConcurrentInserts command-line option, since the default value
for this option should work good for most cases.
2023-01-06 22:07:16 -08:00
Aliaksandr Valialkin
20e9598254
lib/vmselectapi: limit the number of concurrently executed requests
This should prevent from out of memory errors when big number of vmselect
nodes send many concurrent requests to vmstorage

The limit can be controlled at vmstorage via the following command-line flags:
- search.maxConcurrentRequests
- search.maxQueueDuration

See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#resource-usage-limits
2023-01-06 18:39:46 -08:00
Aliaksandr Valialkin
1a88fe5b1f
lib/flagutil/bytes.go: properly handle values bigger than 2GiB on 32-bit architectures
This fixes handling of values bigger than 2GiB for the following command-line flags:

- -storage.minFreeDiskSpaceBytes
- -remoteWrite.maxDiskUsagePerURL
2022-12-14 19:29:57 -08:00
Aliaksandr Valialkin
2a190f6451
lib/{mergeset,storage}: do not block small merges by pending big merges - assist with small merges instead
Blocked small merges may result into big number of small parts, which, in turn,
may result in increased CPU and memory usage during queries, since queries need to inspect
all the existing small parts.

The issue has been introduced in 8189770c50

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337
2022-12-12 17:01:33 -08:00
Aliaksandr Valialkin
7d5c64eb7a
all: add -inmemoryDataFlushInterval command-line flag for controlling the frequency of saving in-memory data to disk
The main purpose of this command-line flag is to increase the lifetime of low-end flash storage
with the limited number of write operations it can perform. Such flash storage is usually
installed on Raspberry PI or similar appliances.

For example, `-inmemoryDataFlushInterval=1h` reduces the frequency of disk write operations
to up to once per hour if the ingested one-hour worth of data fits the limit for in-memory data.

The in-memory data is searchable in the same way as the data stored on disk.
VictoriaMetrics automatically flushes the in-memory data to disk on graceful shutdown via SIGINT signal.
The in-memory data is lost on unclean shutdown (hardware power loss, OOM crash, SIGKILL).

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337
2022-12-05 15:28:09 -08:00
Zakhar Bessarab
e407e7243a
{app/vmstorage,app/vmselect}: add API to get list of existing tenants (#3348)
* {app/vmstorage,app/vmselect}: add API to get list of existing tenants

* {app/vmstorage,app/vmselect}: add API to get list of existing tenants

* app/vmselect: fix error message

* {app/vmstorage,app/vmselect}: fix error messages

* app/vmselect: change log level for error handling

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-11-25 10:32:45 -08:00
Dmytro Kozlov
6f7956503f
app/vmstorage: fix potential file inclusion via variable (#3339)
* app/vmstorage: fix potential file inclusion via variable

* app/vmstorage: cleanup
2022-11-17 01:31:37 +02:00
Aliaksandr Valialkin
a6d4711ac6
lib/storage: add support for retention filters (aka multiple retentions for distinct sets of time series)
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/143
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/289
2022-10-24 16:41:59 +03:00
Aliaksandr Valialkin
2dd93449d8
lib/storage: subsitute searchTSIDs functions with more lightweight searchMetricIDs function
The searchTSIDs function was searching for metricIDs matching the the given tag filters
and then was locating the corresponding TSID entries for the found metricIDs.

The TSID entries aren't needed when searching for time series names (aka MetricName),
so this commit removes the uneeded TSID search from the implementation of /api/v1/series API.
This improves perfromance of /api/v1/series calls.

This commit also improves performance a bit for /api/v1/query and /api/v1/query_range calls,
since now these calls cache small metricIDs instead of big TSID entries
in the indexdb/tagFilters cache (now this cache is named indexdb/tagFiltersToMetricIDs)
without the need to compress the saved entries in order to save cache space.

This commit also removes concurrency limiter during searching for matching time series,
which was introduced in 8f16388428, since the concurrency
for all the read queries is already limited with -search.maxConcurrentRequests command-line flag.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
2022-10-23 12:43:44 +03:00
Aliaksandr Valialkin
98bd843f3c
app/{vminsert,vmselect,vmstorage}: add package-{vminsert,vmselect,vmstorage}-{goarch} Makefile rules for building arch-specific docker images 2022-10-07 20:17:01 +03:00
Aliaksandr Valialkin
916f1ab86c
app/vmstorage/servers: properly copy MetricName into MetricBlock inside blockIterator.NextBlock
This should fix the issue at cf36bfa189
2022-10-07 02:57:54 +03:00
Aliaksandr Valialkin
7b9ba456ff
app/vmstorage: expose vm_{hourly,daily}_series_limit_{max,current}_series metrics if -storage.max{Hourly,Daily}Series limits are set
These metrics allow alerting when the number of unique series approach the limit.
For example, the following query alerts when the number of series reaches 90% of the configured limit:

    vm_hourly_series_limit_current_series / vm_hourly_series_limit_max_series > 0.9
2022-08-24 13:41:57 +03:00
Aliaksandr Valialkin
c0c9f30870
lib/pushmetrics: properly handle errors when initializing pushmetrics 2022-07-22 13:38:25 +03:00
Aliaksandr Valialkin
fe68bb3ba7
all: follow-up after 46f803fa7a
Add -pushmetrics.* command-line flags to all the VictoriaMetrics apps
2022-07-21 20:18:25 +03:00
Aliaksandr Valialkin
70b9925bf7
app: fix make publish-* after ed93330e66
Add missing `-linux` substring to built binary names for copying into Docker images
2022-07-14 11:01:34 +03:00
Aliaksandr Valialkin
da6c85a2f6
all: follow-up for d99ba3481b 2022-07-13 17:17:08 +03:00
Dmytro Kozlov
4e4def9df8
Rename release packages (#2810)
* makefile: add os to each release file

* makefile: update vmutils arm64

* makefile: update victoria-metrics release process

* makefile: update publish with os

* makefile: update publish with os

* makefile: change tar library

* update release logic

* copy all releases

* sort command by GOOS

* rollback commands

* rollback OSARCH

* fix commands

* cleanup

* fix windows build

* sort build by GOOS, update README.md
2022-07-13 17:11:01 +03:00
guidao
f2d24a660b
add next retention metric (#2863)
Co-authored-by: wangfeng <wangfeng@zhihu.com>
2022-07-13 12:41:22 +03:00
Aliaksandr Valialkin
1ec4dfd678
lib/vmselectapi: pass storage.SearchQuery to API calls instead of []*storage.TagFilters + storage.TimeRange + maxMetrics
This reduces the number of args to vmselectapi calls
2022-07-06 12:46:22 +03:00
Aliaksandr Valialkin
2e721f7d16
lib/vmselectapi: rename Server.MustClose to more clear Server.MustStop 2022-07-06 12:46:22 +03:00
Aliaksandr Valialkin
78eeca6f0d
lib/vmselectapi: rename deleteMetrics to more correct deleteSeries 2022-07-06 12:46:21 +03:00
Aliaksandr Valialkin
5afa54e845
lib/vmselectapi: use string type for tagKey and tagValuePrefix args at TagValueSuffixes()
This improves the API consistency
2022-07-06 12:46:21 +03:00
Aliaksandr Valialkin
7d5d33fd71
lib/storage: return marshaled metric names from SearchMetricNames
Previously SearchMetricNames was returning unmarshaled metric names.
This wasn't great for vmstorage, which should spend additional CPU time
for marshaling the metric names before sending them to vmselect.

While at it, remove possible duplicate metric names, which could occur when
multiple samples for new time series are ingested via concurrent requests.

Also sort the metric names before returning them to the client.
This simplifies debugging of the returned metric names across repeated requests to /api/v1/series
2022-06-28 18:16:32 +03:00
Aliaksandr Valialkin
36edb1912b
app/vmstorage: rename "transport" package to "servers" package for better clarity 2022-06-28 14:04:14 +03:00
Aliaksandr Valialkin
399d4c36ae
app/vmselect: optimize /api/v1/series a bit for time ranges smaller than one day 2022-06-28 12:55:20 +03:00
Aliaksandr Valialkin
64505e924d
app/vmstorage: extract vmselect api server into a separate package - lib/vmselectapi
This opens doors for implementing vmselect api server at vmselect level,
so top-level vmselect could query lower-level vmselect nodes in the same way
as it queries vmstorage nodes.

This will create the ability to create highly available querying architecture
when multiple independent VictoriaMetrics clusters with the same data
are located in distinct availability zones. In this case we can use top-level
vmselect instead of Promxy for simultaneous querying of all the clusters
in all the AZs.
2022-06-27 14:20:41 +03:00
Aliaksandr Valialkin
926fccbb8d
lib/storage: add querytracer to more contexts
querytracer has been added to the following storage.Storage methods:
- RegisterMetricNames
- DeleteMetrics
- SearchTagValueSuffixes
- SearchGraphitePaths
2022-06-27 12:53:49 +03:00
Aliaksandr Valialkin
e0ce6c0ff8
app/vmstorage/transport: refactoring: split Server into VMInsertServer and VMStorageServer
This makes the code more clear
2022-06-23 19:20:09 +03:00
Aliaksandr Valialkin
b2cfb8faf7
app/vmstorage/transport: call vmselectRequestCtx.readSearchQuery() in processVMSelectDeleteMetrics
Previously the processVMSelectDeleteMetrics was calling separate functions from readSearchQuery().
It is better from readability and maintenance PoV to substitute it with readSearchQuery call.
2022-06-20 14:23:17 +03:00
Aliaksandr Valialkin
da1d1e83df
app/{vmselect,vmstorage}: properly pass seriesCountByLabelName and seriesCountByFocusLabelValue entries from vmstorage to vmselect 2022-06-16 10:44:29 +03:00
Aliaksandr Valialkin
45fa9d798d
app/vmselect: accept focusLabel query arg at /api/v1/status/tsdb 2022-06-14 18:39:00 +03:00
Aliaksandr Valialkin
61e03f172b
app/vmselect: optimize /api/v1/labels and /api/v1/label/.../values handlers when match[] query arg is passed to them 2022-06-12 14:06:24 +03:00
Aliaksandr Valialkin
4a94cd81ce
app/vmselect: add optional limit query arg to /api/v1/labels and /api/v1/label_values endpoints
This arg allows limiting the number of sample values returned from these APIs
2022-06-10 10:24:07 +03:00
Aliaksandr Valialkin
cb39eada77
all: improve query tracing coverage for indexdb search
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1403
2022-06-09 20:04:02 +03:00
Aliaksandr Valialkin
a9ea3fee38
lib/querytracer: make it easier to use by passing trace context message to New and NewChild
The context message can be extended by calling Donef.
If there is no need to extend the message, then just call Done.
2022-06-08 21:16:12 +03:00
Aliaksandr Valialkin
2b343d8bd0
app: properly collect and merge /api/v1/status/tsdb info from vmstorage nodes
The collection has been broken in f2754c3e90

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2233
2022-06-08 19:26:09 +03:00