Commit graph

179 commits

Author SHA1 Message Date
Aliaksandr Valialkin
7b9ba456ff
app/vmstorage: expose vm_{hourly,daily}_series_limit_{max,current}_series metrics if -storage.max{Hourly,Daily}Series limits are set
These metrics allow alerting when the number of unique series approach the limit.
For example, the following query alerts when the number of series reaches 90% of the configured limit:

    vm_hourly_series_limit_current_series / vm_hourly_series_limit_max_series > 0.9
2022-08-24 13:41:57 +03:00
Aliaksandr Valialkin
c0c9f30870
lib/pushmetrics: properly handle errors when initializing pushmetrics 2022-07-22 13:38:25 +03:00
Aliaksandr Valialkin
fe68bb3ba7
all: follow-up after 46f803fa7a
Add -pushmetrics.* command-line flags to all the VictoriaMetrics apps
2022-07-21 20:18:25 +03:00
Aliaksandr Valialkin
70b9925bf7
app: fix make publish-* after ed93330e66
Add missing `-linux` substring to built binary names for copying into Docker images
2022-07-14 11:01:34 +03:00
Aliaksandr Valialkin
da6c85a2f6
all: follow-up for d99ba3481b 2022-07-13 17:17:08 +03:00
Dmytro Kozlov
4e4def9df8
Rename release packages (#2810)
* makefile: add os to each release file

* makefile: update vmutils arm64

* makefile: update victoria-metrics release process

* makefile: update publish with os

* makefile: update publish with os

* makefile: change tar library

* update release logic

* copy all releases

* sort command by GOOS

* rollback commands

* rollback OSARCH

* fix commands

* cleanup

* fix windows build

* sort build by GOOS, update README.md
2022-07-13 17:11:01 +03:00
guidao
f2d24a660b
add next retention metric (#2863)
Co-authored-by: wangfeng <wangfeng@zhihu.com>
2022-07-13 12:41:22 +03:00
Aliaksandr Valialkin
1ec4dfd678
lib/vmselectapi: pass storage.SearchQuery to API calls instead of []*storage.TagFilters + storage.TimeRange + maxMetrics
This reduces the number of args to vmselectapi calls
2022-07-06 12:46:22 +03:00
Aliaksandr Valialkin
2e721f7d16
lib/vmselectapi: rename Server.MustClose to more clear Server.MustStop 2022-07-06 12:46:22 +03:00
Aliaksandr Valialkin
78eeca6f0d
lib/vmselectapi: rename deleteMetrics to more correct deleteSeries 2022-07-06 12:46:21 +03:00
Aliaksandr Valialkin
5afa54e845
lib/vmselectapi: use string type for tagKey and tagValuePrefix args at TagValueSuffixes()
This improves the API consistency
2022-07-06 12:46:21 +03:00
Aliaksandr Valialkin
7d5d33fd71
lib/storage: return marshaled metric names from SearchMetricNames
Previously SearchMetricNames was returning unmarshaled metric names.
This wasn't great for vmstorage, which should spend additional CPU time
for marshaling the metric names before sending them to vmselect.

While at it, remove possible duplicate metric names, which could occur when
multiple samples for new time series are ingested via concurrent requests.

Also sort the metric names before returning them to the client.
This simplifies debugging of the returned metric names across repeated requests to /api/v1/series
2022-06-28 18:16:32 +03:00
Aliaksandr Valialkin
36edb1912b
app/vmstorage: rename "transport" package to "servers" package for better clarity 2022-06-28 14:04:14 +03:00
Aliaksandr Valialkin
399d4c36ae
app/vmselect: optimize /api/v1/series a bit for time ranges smaller than one day 2022-06-28 12:55:20 +03:00
Aliaksandr Valialkin
64505e924d
app/vmstorage: extract vmselect api server into a separate package - lib/vmselectapi
This opens doors for implementing vmselect api server at vmselect level,
so top-level vmselect could query lower-level vmselect nodes in the same way
as it queries vmstorage nodes.

This will create the ability to create highly available querying architecture
when multiple independent VictoriaMetrics clusters with the same data
are located in distinct availability zones. In this case we can use top-level
vmselect instead of Promxy for simultaneous querying of all the clusters
in all the AZs.
2022-06-27 14:20:41 +03:00
Aliaksandr Valialkin
926fccbb8d
lib/storage: add querytracer to more contexts
querytracer has been added to the following storage.Storage methods:
- RegisterMetricNames
- DeleteMetrics
- SearchTagValueSuffixes
- SearchGraphitePaths
2022-06-27 12:53:49 +03:00
Aliaksandr Valialkin
e0ce6c0ff8
app/vmstorage/transport: refactoring: split Server into VMInsertServer and VMStorageServer
This makes the code more clear
2022-06-23 19:20:09 +03:00
Aliaksandr Valialkin
b2cfb8faf7
app/vmstorage/transport: call vmselectRequestCtx.readSearchQuery() in processVMSelectDeleteMetrics
Previously the processVMSelectDeleteMetrics was calling separate functions from readSearchQuery().
It is better from readability and maintenance PoV to substitute it with readSearchQuery call.
2022-06-20 14:23:17 +03:00
Aliaksandr Valialkin
da1d1e83df
app/{vmselect,vmstorage}: properly pass seriesCountByLabelName and seriesCountByFocusLabelValue entries from vmstorage to vmselect 2022-06-16 10:44:29 +03:00
Aliaksandr Valialkin
45fa9d798d
app/vmselect: accept focusLabel query arg at /api/v1/status/tsdb 2022-06-14 18:39:00 +03:00
Aliaksandr Valialkin
61e03f172b
app/vmselect: optimize /api/v1/labels and /api/v1/label/.../values handlers when match[] query arg is passed to them 2022-06-12 14:06:24 +03:00
Aliaksandr Valialkin
4a94cd81ce
app/vmselect: add optional limit query arg to /api/v1/labels and /api/v1/label_values endpoints
This arg allows limiting the number of sample values returned from these APIs
2022-06-10 10:24:07 +03:00
Aliaksandr Valialkin
cb39eada77
all: improve query tracing coverage for indexdb search
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1403
2022-06-09 20:04:02 +03:00
Aliaksandr Valialkin
a9ea3fee38
lib/querytracer: make it easier to use by passing trace context message to New and NewChild
The context message can be extended by calling Donef.
If there is no need to extend the message, then just call Done.
2022-06-08 21:16:12 +03:00
Aliaksandr Valialkin
2b343d8bd0
app: properly collect and merge /api/v1/status/tsdb info from vmstorage nodes
The collection has been broken in f2754c3e90

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2233
2022-06-08 19:26:09 +03:00
Dmytro Kozlov
f2754c3e90
Cardinality explorer (#2625)
* Cardinality explorer

* vmui, vmselect: updated field name, added description to spinner

* make vmui-update

* updated const name, make vmui-update

* lib/storage: changes calculation for totalSeries values

* added static files

* wip

* wip

* wip

* wip

* docs/CHANGELOG.md: document cardinality explorer feature

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2233

Co-authored-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-06-08 18:54:27 +03:00
Roman Khavronenko
e9ee043879
lib/storage: make indexdb/tagFilters cache size configurable (#2667)
The default size of `indexdb/tagFilters` now can be overridden via
`storage.cacheSizeIndexDBTagFilters` flag.
Please, be careful with changing default size since it may
lead to inefficient work of the vmstorage or OOM exceptions.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2663
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Co-authored-by: Nikolay <nik@victoriametrics.com>
2022-06-01 14:57:39 +03:00
Aliaksandr Valialkin
afced37c0b
all: add initial support for query tracing
See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#query-tracing

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1403
2022-06-01 02:31:44 +03:00
Aliaksandr Valialkin
38beb9fe04
lib/storage: add ability to change the indexdb rotation time offset with -retentionTimezoneOffset command-line flag
This is a follow-up for 0fbf59199a

See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2574
2022-05-25 16:07:14 +03:00
Aliaksandr Valialkin
e961aec551
app/vmstorage: do not allow to set -retentionPeriod smaller than one day
VictoriaMetrics doesn't support retention periods smaller than one day,
so do not allow to set it to small values.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2496
2022-05-07 00:54:42 +03:00
Aliaksandr Valialkin
925fa9a7de
docs/Cluster-VictoriaMetrics.md: make the description for -rpc.disableCompression command-line flag more clear 2022-05-06 16:24:56 +03:00
Roman Khavronenko
c41ae2db2c
vmstorage: switch to rich duration parser for flag snapshotsMaxAge (#2542)
The switch suppose to allow setting `d`, `w`, `y` duration units.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-05 21:13:55 +03:00
Aliaksandr Valialkin
361b08c30e
lib/storage: leave the last sample per each discrete interval during the deduplicaton
This aligns better with staleness logic in Prometheus - https://prometheus.io/docs/prometheus/latest/querying/basics/#staleness
2022-05-02 21:59:31 +03:00
Artem Navoiev
11db05a4ff
lib/{storage,flagutil} - Add option for snapshot autoremoval (#2487)
* lib/{storage,flagutil} - Add option for snapshot autoremoval

- add prometheus-like duration as command flag
- add option to delete stale snapshots
- update duration.go flag to re-use own code

* wip

* lib/flagutil: re-use Duration.Set() call in NewDuration

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-05-02 11:24:12 +03:00
Aliaksandr Valialkin
a436836402
lib/flagutil: re-use Duration.Set() call in NewDuration 2022-05-02 10:58:08 +03:00
Aliaksandr Valialkin
ed1b394a1a
app/vmstorage: expose vm_indexdb_items_added_total and vm_indexdb_items_added_size_bytes_total counters at /metrics page
These counters can be used for monitoring the rate of addition of new entries in indexdb (aka inverted index).

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2471
2022-04-21 13:19:42 +03:00
Aliaksandr Valialkin
81b7a31cb1
app/vmstorage: properly handle maxSeries limit passed from vmselect to vmstorage 2022-04-12 11:19:07 +03:00
Aliaksandr Valialkin
a4a15a462b
app/vmselect/netstorage: bump RPC API versions for vmselect->vmstorage communications
This is a follow-up after b843f0e229
2022-04-08 12:36:04 +03:00
Nikolay
4cf6219e07
lib/{storage,regexpcache}: replaces regexpCacheMap with LRU cache (#2293)
* lib/{storage,regexpcache}: replaces regexpCacheMap with LRU cache

It should decrease memory usage for regexp caching
with storing cacheEntry by pointer - golang map should be able to effectivly shrink it's size
original issue with this case - unexpected map grows and storage OOM

Apply suggestions from code review

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

Adds missing metrics for regexp cache and regexpPrefixes cache

* wip

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-03-26 12:57:27 +02:00
Aliaksandr Valialkin
b843f0e229
app/vmselect: add fine-grained limits for the number of returned/scanned time series for various APIs 2022-03-26 11:28:14 +02:00
Aliaksandr Valialkin
698458b742
lib/httpserver: extract the code responsible for initializing server-side TLS config into netutil.GetServerTLSConfig 2022-03-17 19:46:20 +02:00
Roman Khavronenko
bd7837d524
lib: allow to configure cache size by type (#2206)
* lib: allow to configure cache size by type

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1940
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Apply suggestions from code review

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-21 13:55:51 +02:00
Roman Khavronenko
d107f86fbc
lib/index: reduce read/write load after indexDB rotation (#2177)
* lib/index: reduce read/write load after indexDB rotation

IndexDB in VM is responsible for storing TSID - ID's used for identifying
time series. The index is stored on disk and used by both ingestion and read path.

IndexDB is stored separately to data parts and is global for all stored data.
It can't be deleted partially as VM deletes data parts. Instead, indexDB is
rotated once in `retention` interval.

The rotation procedure means that `current` indexDB becomes `previous`,
and new freshly created indexDB struct becomes `current`. So in any time,
VM holds indexDB for current and previous retention periods.
When time series is ingested or queried, VM checks if its TSID is present
in `current` indexDB. If it is missing, it checks the `previous` indexDB.
If TSID was found, it gets copied to the `current` indexDB. In this way
`current` indexDB stores only series which were active during the retention
period.

To improve indexDB lookups, VM uses a cache layer called `tsidCache`. Both
write and read path consult `tsidCache` and on miss the relad lookup happens.

When rotation happens, VM resets the `tsidCache`. This is needed for ingestion
path to trigger `current` indexDB re-population. Since index re-population
requires additional resources, every index rotation event may cause some extra
load on CPU and disk. While it may be unnoticeable for most of the cases,
for systems with very high number of unique series each rotation may lead
to performance degradation for some period of time.

This PR makes an attempt to smooth out resource usage after the rotation.
The changes are following:
1. `tsidCache` is no longer reset after the rotation;
2. Instead, each entry in `tsidCache` gains a notion of indexDB to which
they belong;
3. On ingestion path after the rotation we check if requested TSID was
found in `tsidCache`. Then we have 3 branches:
3.1 Fast path. It was found, and belongs to the `current` indexDB. Return TSID.
3.2 Slow path. It wasn't found, so we generate it from scratch,
add to `current` indexDB, add it to `tsidCache`.
3.3 Smooth path. It was found but does not belong to the `current` indexDB.
In this case, we add it to the `current` indexDB with some probability.
The probability is based on time passed since the last rotation with some threshold.
The more time has passed since rotation the higher is chance to re-populate `current` indexDB.
The default re-population interval in this PR is set to `1h`, during which entries from
`previous` index supposed to slowly re-populate `current` index.

The new metric `vm_timeseries_repopulated_total` was added to identify how many TSIDs
were moved from `previous` indexDB to the `current` indexDB. This metric supposed to
grow only during the first `1h` after the last rotation.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* wip

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-12 00:34:44 +02:00
Aliaksandr Valialkin
34d14c4940
all: substitute zeroTime with time.Time{}, since this generates more optimal binary code 2022-02-07 14:36:41 +02:00
Aliaksandr Valialkin
5f266370c5
all: follow-up after 4bdd10ab90
Properly use new bytesutil.Resize* functions
2022-02-01 17:49:28 +02:00
Aliaksandr Valialkin
6232eaa938
lib/bytesutil: split Resize() into ResizeNoCopy() and ResizeWithCopy() functions
Previously bytesutil.Resize() was copying the original byte slice contents to a newly allocated slice.
This wasted CPU cycles and memory bandwidth in some places, where the original slice contents wasn't needed
after slize resizing. Switch such places to bytesutil.ResizeNoCopy().

Rename the original bytesutil.Resize() function to bytesutil.ResizeWithCopy() for the sake of improved readability.

Additionally, allocate new slice with `make()` instead of `append()`. This guarantees that the capacity of the allocated slice
exactly matches the requested size. The `append()` could return a slice with bigger capacity as an optimization for further `append()` calls.
This could result in excess memory usage when the returned byte slice was cached (for instance, in lib/blockcache).

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-01-25 15:28:42 +02:00
Aliaksandr Valialkin
6ae584b9b3
lib/{mergeset,storage}: properly limit cache sizes for indexdb
Previously these caches could exceed limits set via `-memory.allowedPercent` and/or `-memory.allowedBytes`,
since limits were set independently per each data part. If the number of data parts was big, then limits could be exceeded,
which could result to out of memory errors.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-01-20 18:45:03 +02:00
Aliaksandr Valialkin
cdfe854c9b
lib/storage: explicitly pass dedupInterval to DeduplicateSamples() and deduplicateSamplesDuringMerge()
This improves the code readability and debuggability, since the output of these functions
stops depending on global state.
2021-12-14 20:52:29 +02:00
Aliaksandr Valialkin
ab4be24397
app/vmstorage: export vm_cache_size_max_bytes metrics for determining capacity of various caches
The vm_cache_size_max_bytes metric can be used for determining caches which reach their capacity via the following query:

   vm_cache_size_bytes / vm_cache_size_max_bytes > 0.9
2021-12-02 10:30:01 +02:00
Aliaksandr Valialkin
4fb19fe34b
all: consistently return application/json content-type without charset=utf-8
The `application/json` content-type has utf-8 encoding by default.
See https://stackoverflow.com/questions/9254891/what-does-content-type-application-json-charset-utf-8-really-mean

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/897
2021-11-09 18:07:22 +02:00