This commit introduces common readers for multiple compression encoding algorithms.
Currently, supported encodings are:
* zstd
* gzip
* deflat
* snappy
It adds new common reader to the all VictoriaLogs ingestion protocols.
And updates opentelemetry metrics parsing for VictoriaMetrics components.
Also, it ports zstd stream parses from cluster branch.
Related issues:
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8380
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8300
---------
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
Co-authored-by: f41gh7 <nik@victoriametrics.com>
Previously, if the command-line flag value `-remoteWrite.showURL` changed, vmagent dropped content of persistent queues. It's not expected behavior and may lead to data-loss at queue.
Further more if command-line flag value `-remoteWrite.showURL` is set to `true`, any changes to url query arguments will lead to persistent queue drop. The most common uses is kafka and gcp pub-sub integration. It uses url query arguments for client configuration.
Also, it complicates copy content of persistent queue between vmagents. Since it requires to properly change name inside metainfo.json.
This commit removes persistent queue name equality check from `lib/persistentqueue`. This check was added as an additional protection from on-disk data corruption.
It's safe to skip this check for vmagent, because vmagent encodes remoteWrite.url as part of path to the queue. It guarantees that there will be no collision.
related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8477.
### Checklist
The following checks are **mandatory**:
- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
---------
Signed-off-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: f41gh7 <nik@victoriametrics.com>
Previously, TLS config was only created for URLs with `https` scheme.
This could lead to unexpected errors when original URL was redirecting
to `https` one as TLS config is not applied.
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8494
Add an integration test to confirm that deduplication works for the
current month. See #6965.
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Almost all storage API operations, both ingestion and retrieval, involve
writing and/or reading the indexdb. However, during these operations,
the indexdb refcount is not incremented. This may lead to panics if
indexdb is rotated more than once during these operations.
This commit increments the refcount before using indexdb and decrements it
after use.
Note that rotating indexdb more than once during some operation is an
impossible case under normal circumstances as the min retention period
is 1 day (i.e. the indexdb will be rotated once per day). However, we
want the storage to behave correctly in all cases.
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Fix metric that shows number of active time series when per-day index is disabled. Previously, once per-day index was disabled, the active time series metric would stop being populated and the `Active time series` chart would show 0.
See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8411.
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
### Fix scrapePool name
If in the scrape file, I do some magic and manipulate the job name then
Prometheus will show scrapePool as the original job name in the targets
API, but vmagent will set it to the final value which is wrong.
example
```
job: consul-targets
...
- source_labels: [ __meta_consul_service ]
regex: (\w+)[_-]exporter
target_label: job
replacement: $1
```
curl to prom API will show
`"scrapePool": "consul-targets",`
vmagent:
`""scrapePool": "node",`
before changes:
```
curl -s 'http://localhost:8429/api/v1/targets' | jq -r '.data.activeTargets[].scrapePool'| sort|uniq
blackbox
pgbackrest
postgres
```
after changes
```
curl -s 'http://localhost:8429/api/v1/targets' | jq -r '.data.activeTargets[].scrapePool'| sort|uniq
blackbox
consul-targets
```
### Checklist
The following checks are **mandatory**:
- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
---------
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 486b9e1c64)
### Describe Your Changes
fixes#8469
### Checklist
The following checks are **mandatory**:
- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
(cherry picked from commit c174a046e2)
This feature allows to track query requests by metric names. Tracker
state is stored in-memory, capped by 1/100 of allocated memory to the
storage. If cap exceeds, tracker rejects any new items add and instead
registers query requests for already observed metric names.
This feature is disable by default and new flag:
`-storage.trackMetricNamesStats` enables it.
New API added to the select component:
* /api/v1/status/metric_names_stats - which returns a JSON
object
with usage statistics.
* /admin/api/v1/status/metric_names_stats/reset - which resets internal
state of the tracker and reset tsid/cache.
New metrics were added for this feature:
* vm_cache_size_bytes{type="storage/metricNamesUsageTracker"}
* vm_cache_size{type="storage/metricNamesUsageTracker"}
* vm_cache_size_max_bytes{type="storage/metricNamesUsageTracker"}
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4458
---------
Signed-off-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Previously, opentelemetry attribute parsed added extra field names according to
golang JSON parser spec for structs:
```
struct AnyValue{
StringValue string
}
```
Was serialized into:
```
{"StringValue": "some-string"}
```
While opentelemetry-collector serializes it as
```
"some-string"
```
This commit changes this behaviour it makes parses compatible with opentelemetry-collector format. See test cases for examples.
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8384
Since funcs `ParseDuration` and `ParseTimeMsec` are used in vlogs,
vmalert, victoriametrics and other components, importing promutils only
for this reason makes them to export irrelevant
`vm_rows_invalid_total{type="prometheus"}` metric.
This change removes `vm_rows_invalid_total{type="prometheus"}` metric
from /metrics page for these components.
### Describe Your Changes
Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.
### Checklist
The following checks are **mandatory**:
- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 63f6ac3ff8)
Fix parsing of IPv6 addresses after discovery. Previously, it could lead
to target being discovered and discarded afterwards.
See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8374
---------
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
(cherry picked from commit 99de272b72)
`MustParsePromMetrics` imports `lib/protoparser/prometheus`, and this
package exposes the following metrics:
```
vm_protoparser_rows_read_total{type="promscrape"}
vm_rows_invalid_total{type="prometheus"}
```
It means every package that uses `lib/prompbmarshal` will start exposing
these metrics. For example, vlogs imports `lib/protoparser/common` which
uses `lib/prompbmarshal.Label`. And only because of this vlogs starts
exposing unrelated prometheus metrics on /metrics page.
Moving `MustParsePromMetrics` to `lib/protoparser/prometheus` seems like
the leas intrusive change.
-----------
Depends on another change
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8403
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Previously, if indexDB search failed for some reason during search at previous indexDB (aka extDB), VictoriaMetrics stored empty search result at cache. It could cause incorrect search results at subsequent requests.
This commit checks search error and stores request results only on success.
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8345
For example, `field:~".+"`, `field:~".*"` or `field:""`
Replace such filters to faster ones. For example, `field:~".*"` is replaced with `*`,
while `field:~".+"` is replaced with `field:*`.
These filters can be used for selecting logs where one field value is less than another field value.
These filter complement `<=` and `<` filters for constant literals.
(cherry picked from commit 30974e7f3f)
This allows using different intervals for flushing in-memory data among different mergeset.Table instances.
The initial user of this feature is lib/logstorage.Storage, which explicitly passes Storage.flushInterval
to every created mereset.Table instance. Previously mergeset.Table instances were using 5 seconds
flush interval, which didn't depend on the Storage.flushInterval.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4775
Make sure that the maximum log field name, which can be generated by JSONParser.ParseLogMessage,
doesn't exceed the hardcoded limit maxFieldNameSize. Stop flattening of nested JSON objects
when the resulting field name becomes longer than maxFieldNameSize, and return the nested JSON object
as a string instead.
This should prevent from parse errors when ingesting deeply nested JSON logs with long field names.
- `contains_any` selects logs with fields containing at least one word/phrase from the provided list.
The provided list can be generated by a subquery.
- `contains_all` selects logs with fields containing all the words and phrases from the provided list.
The provided list can be generated by a subquery.
Examples:
- `top 5 x, y` is equivalent to `top 5 by (x, y)`
- `uniq foo, bar` is equivalent to `uniq by (foo, bar)`
- `unroll foo, bar` is equivalent to `unroll (foo, bar)`
_time:<=max_time filter must include logs with timestamps matching max_time.
For example, _time:<=2025-02-24Z must include logs with timestamps until the end of February 24, 2025.
Examples:
_time:>=2025-02-24Z selects logs with timestamps bigger or equal to 2025-02-24 UTC
_time:>1d selects logs with timestamps older than one day comparing to the current time
This simplifies writing queries with _time filters.
See https://docs.victoriametrics.com/victorialogs/logsql/#time-filter
Inside PARAM-VALUE, the characters '"' (ABNF %d34), '\' (ABNF %d92),
and ']' (ABNF %d93) MUST be escaped. This is necessary to avoid
parsing errors. Escaping ']' would not strictly be necessary but is
REQUIRED by this specification to avoid syslog application
implementation errors. Each of these three characters MUST be
escaped as '\"', '\\', and '\]' respectively. The backslash is used
for control character escaping for consistency with its use for
escaping in other parts of the syslog message as well as in traditional syslog.
Related RFC:
https://datatracker.ietf.org/doc/html/rfc5424#section-6.3.3
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8282
### Describe Your Changes
By default, stream aggregation and deduplication stores a single state
per each aggregation output result.
The data for each aggregator is flushed independently once per
aggregation interval. But there's no guarantee that
incoming samples with timestamps close to the aggregation interval's end
will get into it. For example, when aggregating
with `interval: 1m` a data sample with timestamp 1739473078 (18:57:59)
can fall into aggregation round `18:58:00` or `18:59:00`.
It depends on network lag, load, clock synchronization, etc. In most
scenarios it doesn't impact aggregation or
deduplication results, which are consistent within margin of error. But
for metrics represented as a collection of series,
like
[histograms](https://docs.victoriametrics.com/keyconcepts/#histogram),
such inaccuracy leads to invalid aggregation results.
For this case, streaming aggregation and deduplication support mode with
aggregation windows for current and previous state. With this mode,
flush doesn't happen immediately but is shifted by a calculated samples
lag that improves correctness for delayed data.
Enabling of this mode has increased resource usage: memory usage is
expected to double as aggregation will store two states
instead of one. However, this significantly improves accuracy of
calculations. Aggregation windows can be enabled via
the following settings:
- `-streamAggr.enableWindows` at [single-node
VictoriaMetrics](https://docs.victoriametrics.com/single-server-victoriametrics/)
and [vmagent](https://docs.victoriametrics.com/vmagent/). At
[vmagent](https://docs.victoriametrics.com/vmagent/)
`-remoteWrite.streamAggr.enableWindows` flag can be specified
individually per each `-remoteWrite.url`.
If one of these flags is set, then all aggregators will be using fixed
windows. In conjunction with `-remoteWrite.streamAggr.dedupInterval` or
`-streamAggr.dedupInterval` fixed aggregation windows are enabled on
deduplicator as well.
- `enable_windows` option in [aggregation
config](https://docs.victoriametrics.com/stream-aggregation/#stream-aggregation-config).
It allows enabling aggregation windows for a specific aggregator.
### Checklist
The following checks are **mandatory**:
- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit c8fc903669)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
A follow-up after 9bb5ba5d2f
that impacted compression ratio for data compressed with native GO zstd lib (`make test-pure`).
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 38bded4e58)
It has been appeared these optimizatios do not give measurable performance improvements,
while they complicate the code too much and may result in slowdown when the ingested logs have
different sets of fields.
This is a follow-up for 630601488e
(cherry picked from commit dce5eb88d3)
It has been appeared that 256 files increase RAM usage too much comparing to 128 files
when ingesting logs with hundreds of fields (aka wide events). So let's return back 128 files
limit for now.
This is a follow-up for 9bb5ba5d2f
(cherry picked from commit a50ab10998)
There is little sense in keeping too big buffers - they just waste RAM and do not reduce
the load on GC too much. So it is better dropping such buffers at Reset instead of keeping them around.
(cherry picked from commit b58e2ab214)
The number of filestream readers is proportional to the number of parts to be merged,
while the number of filestream writers is proportional to the number of concurrent merges.
Usually around 4-16 parts are merged at once, so the number of active filestream readers is ~8x
bigger than the number of active filestream writers.
So it is a good idea to use smaller size of read buffers comparing to the size of write buffers.
Limit read buffer size by 64Kb, while write buffer size is limited by 128Kb.
This should reduce the overall memory usage when merging parts with big number of files.
This is the case for VictoriaLogs, which works with logs containing hundreds of fields (aka wide events).
(cherry picked from commit 659251beaa)
This should improve query performance for logs with hundreds of fields (aka wide events).
Previously there was a high chance that the data for multiple log fields is stored in the same file.
This could result in query performance slowdown and/or increased disk read IO,
since the operating system could read unnecessary data for the fields, which aren't used in the query.
Now log fields are guaranteed to be stored in separate files until the number of fields exceeds 256.
After that multiple log fields start sharing files.
(cherry picked from commit 9bb5ba5d2f)
This reduces memory usage when many filestreams are processed simultaneously.
This is the case for VictoriaLogs when it processes logs with hundreds of fields.
(cherry picked from commit 2a681f2e8d)
- Re-use column names and values from the previously added rows if possible.
This increases locality of reference for field names and values, while improving
access speed for the field names and values.
- Postpone sorting fields in the added rows until creating inmemory part from them.
This allows optimizing the sorting for log fields with the same set of fields.
This is usually the case for logs, which belong to the same logs stream.
(cherry picked from commit 630601488e)
Commit 68791f9ccc introduced regression.
It performed basicAuth check before built-in routes. It made impossible
to bypass basic authorization with `authKey` param.
This commit fixeds that issue and removes unneeded check. It also adds
integration tests for this case.
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7345
---------
Signed-off-by: f41gh7 <nik@victoriametrics.com>
This eliminates a class of potential bugs with incorrect stats calculations when an additional filter
is applied to the blockResult before passing it to the stats function, and this filter removes
all the rows from blockResult.
Allow disabling the per-day index using the `-disablePerDayIndex` flag.
This should significantly improve the ingestion rate and decrease the
disk space usage for the use cases that assume small or no churn rate.
See the docs added to `docs/README.md` for details.
Both improvements are due to no data written to the per-day index.
Benchmark results:
```shell
rm -Rf ./lib/storage/Benchmark*; go test ./lib/storage -run=NONE -bench=BenchmarkStorageInsertWithAndWithoutPerDayIndex --loggerLevel=ERROR
goos: linux
goarch: amd64
pkg: github.com/VictoriaMetrics/VictoriaMetrics/lib/storage
cpu: 13th Gen Intel(R) Core(TM) i7-1355U
BenchmarkStorageInsertWithAndWithoutPerDayIndex/HighChurnRate/perDayIndexes-12 1 3850268120 ns/op 39.56 data-MiB 28.20 indexdb-MiB 259722 rows/s
BenchmarkStorageInsertWithAndWithoutPerDayIndex/HighChurnRate/noPerDayIndexes-12 1 2916865725 ns/op 39.57 data-MiB 25.73 indexdb-MiB 342834 rows/s
BenchmarkStorageInsertWithAndWithoutPerDayIndex/NoChurnRate/perDayIndexes-12 1 2218073474 ns/op 9.772 data-MiB 13.73 indexdb-MiB 450842 rows/s
BenchmarkStorageInsertWithAndWithoutPerDayIndex/NoChurnRate/noPerDayIndexes-12 1 1295140898 ns/op 9.771 data-MiB 0.3566 indexdb-MiB 772119 rows/s
PASS
ok github.com/VictoriaMetrics/VictoriaMetrics/lib/storage 11.421s
```
Signed-off-by: Artem Fetishev <wwctrsrx@gmail.com>
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
MustOpenStorage function may accept variable number of optional
arguments. This commit combines optional arguments into dedicated OpenOptions
struct. It reduces complexity of adding new optional arguments.
Related PR:
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8118
1. Use distinct code paths for blockResult.getValues() and blockResult.getValuesBucketed().
This should simplify debugging and maintenance of the resulting code.
2. Do not load column values if all the values in the block fit the same bucket.
Use blockResultColumn.minValue and blockResultColumn.maxValue for determining whether
column values must be loaded via blockResultColumn.getValuesEncoded().
This signiciantly improves performance for big buckets, which cover all the column
values in a block.
3. Properly calculate buckets for negative values.
4. Properly adjust weekly buckets by Monday.
### Describe Your Changes
fix typo for description of flag -pprofAuthKey
### Checklist
The following checks are **mandatory**:
- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
(cherry picked from commit 5fad3c8492)
Previously the `callbackErr` is silently ignored in clusternative parser, which is used at vminsert for parsing clusterNative requests and at vmstorage for parsing vminsert requests.
This commit fixes that by properly return callbackError after reading all block metrics. This aligns
with other parsers in `lib/protoparser`.
This improves performance for queries, which use `sort by (...) limit N` without mentioning _time field.
For example, the following query must work faster now
_time:1d | rm _time | sort by (request_duration desc) limit 10
(cherry picked from commit 422caf6bd7)
- Add a fast path for timestamps ending with 'Z'
- Use strings.LastIndexAny instead of strings.IndexAny for searching
for timezone offset at the end of the string. This works faster
for timestamps with sub-second precision.
(cherry picked from commit 335071cf3d)
Update the pipe state only once per each series of repeated strings, uint8 values and tuples.
This improves performance a bit for the following `top` pipes:
- top (string_field)
- top (uint8_field)
- top (field1, ..., fieldN)
Do not apply the optimization for uint16, uint32, uint64 and int64 fields, since they
usually contain big number of unique values, which do not repeat most of the time.
Previously RFC3339 timestamps with sub-second precision could be incorrectly compared by lessString().
For example, 2025-01-20T10:20:30.1Z was incorrectly treated as smaller than 2025-01-20T10:20:30.09Z,
because the first timestamp has smaller decimal number after the last dot than the second timestamp.
(cherry picked from commit 81d359507d)
Split unique values (groups) into shards according to the configured concurrency
during processing of the matching rows if the number of unique values exceeds the hardcoded threshold.
Previously this splitting was performed unconditionally at the merge stage when merging independently
calculated per-CPU states into a single state. It is faster to perform the split during rows processing
if the number of unique values is big.
This gives up to 30% perfromance improvements when these pipes are applied to big number of unique values (groups).
(cherry picked from commit 48602a1ae8)
The number of worker shards per each pipe processor is created during query initialization.
This number equals to the `options(concurrency=N)` if this option is set or to the number of available CPU cores.
This means that all the pipes must adhere the given concurrency when passing data blocks
to the next pipe.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8201
The bug has been introduced in 0214aa328e
vmauth uses 'lib/httpserver' for serving HTTP requests. This server
unconditionally defines built-in routes (such as '/metrics',
'/health', etc). It makes impossible to proxy `HTTP` requests to backends with the same routes.
Since vmauth's httpserver matches built-in route and return local
response.
This commit adds new flag `httpInternalListenAddr` with
default empty value. Which removes internal API routes from public
router and exposes it at separate http server.
For example given configuration disables private routes at `0.0.0.0:8427` address and serves it at `0.0.0.0:8426`:
`./bin/vmauth --auth.config=config.yaml --httpListenAddr=:8427 --httpInternalListenAddr=127.0.0.1:8426`
Related issues:
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6468
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7345
### Describe Your Changes
The cardinality limiter in this case does not receive the actual
metricID but some other value found in r.TSID.MetricID and is not
initialized. Depending on the system and/or go runtime implementation,
this value can be 0 or some garbage value (which shouldn't have too wide
a range). Thus, there basically no limit for inserted metricIDs.
### Checklist
The following checks are **mandatory**:
- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
---------
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 631b736bc2)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
At this time `bufferedwriter` [silently ignores connection close
errors](78eaa056c0/lib/bufferedwriter/bufferedwriter.go (L67)).
It may be very convenient in some situations (to not log such
unimportant errors), but it's too implicit and unsafe for the others.
For example, if you close [export
API](https://docs.victoriametrics.com/#how-to-export-time-series) client
connection in the middle of communication, VictoriaMetrics won't notice
it and will start to hog CPU by exporting all the data into nowhere
until it process all of them. If you'll make a few retries, it will be
effectively a DoS on the server.
This commit replaces this implicit error suppressing with explicit error
handling which fixes the issue with export API.
Issue was introduced at e78f3ac8ac
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7988
### Describe Your Changes
Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.
### Checklist
The following checks are **mandatory**:
- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 13c4324bb5)
It is better to use the AppendFieldsToJSON function directly
instead of hiding it under RowsFormatter abstraction.
(cherry picked from commit 95f182053b)
This should reduce the time needed for opening the storage with retentions exceeding a few months.
While at at, limit the concurrency of opening partitions in parallel to the number of available CPU cores,
since higher concurrency may increase RAM usage and CPU usage without performance improvements
if opening a single partition is CPU-bound task.
This is a follow-up for 17988942ab
Use the testing.Testing() function in order to determine whether the code runs in test.
This allows running tests and fast speed without the need to specify DISABLE_FSYNC_FOR_TESTING
environment variable.
This is a follow-up for the commit 334cd92a6c
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6871
The purpose of extra filters ( https://docs.victoriametrics.com/victorialogs/querying/#extra-filters )
is to limit the subset of logs, which can be queried. For example, it is expected that all the queries
with `extra_filters={tenant=123}` can access only logs, which contain `123` value for the `tenant` field.
Previously this wasn't the case, since the provided extra filters weren't applied to subqueries.
For example, the following query could be used to select all the logs outside `tenant=123`, for any `extra_filters` arg:
* | union({tenant!=123})
This commit fixes this by propagating extra filters to all the subqueries.
While at it, this commit also properly propagates [start, end] time range filter from HTTP querying APIs
into all the subqueries, since this is what most users expect. This behaviour can be overriden on per-subquery
basis with the `options(ignore_global_time_filter=true)` option - see https://docs.victoriametrics.com/victorialogs/logsql/#query-options
Also properly apply apply optimizations across all the subqueries. Previously the optimizations at Query.optimize()
function were applied only to the top-level query.
logger.Fatalf("BUG: ...") complicates investigating the bug, since it doesn't show the call stack,
which led to the bug. So it is better to consistently use logger.Panicf("BUG: ...") for logging programming bugs.
Previously, NewChild elements of querytracer could be referenced by concurrent
storageNode goroutines. After earlier return ( if search.skipSlowReplicas is set), it is
possible, that tracer objects could be still in-use by concurrent workers.
It may cause panics and data races. Most probable case is when parent tracer is finished, but children
still could write data to itself via Donef() method. It triggers read-write data race at trace
formatting.
This commit adds a new methods to the querytracer package, that allows to
create children not referenced by parent and add it to the parent later.
Orphaned child must be registered at the parent, when goroutine returns. It's done synchronously by the single caller via finishQueryTracer call.
If child didn't finished work and reference for it is used by concurrent goroutine, new child must be created instead with
context message.
It prevents panics and possible data races.
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8114
---------
Signed-off-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
This is done via 'options(concurrency=N)' prefix for the query.
For example, the following query is executed on at most 4 CPU cores:
options(concurrency=4) _time:1d | count_uniq(user_id)
This allows reducing RAM and CPU usage at the cost of longer query execution times,
since by default every query is executed in parallel on all the available CPU cores.
See https://docs.victoriametrics.com/victorialogs/logsql/#query-options
Also always initialize Query.timestamp with the timestamp from the lexer.
This should avoid potential problems with relative timestamps inside inner queries.
For example, the `_time:1h` filter in the following query is correctly executed
relative to the current timestamp:
foo:in(_time:1h | keep foo)
Despite requirement in OpenTelemetry spec that histograms should contain
sum, [OpenTelemetry collector promremotewrite
translator](37c8044abf/pkg/translator/prometheusremotewrite/helper.go (L222))
and [Prometheus OpenTelemetry
parsing](d52e689a20/storage/remote/otlptranslator/prometheusremotewrite/helper.go (L264))
skip only sum if it's absent. Our current implementation drops buckets
if sum is absent, which causes issues for users, that are expecting a
similar to Prometheus behaviour
### Describe Your Changes
Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.
### Checklist
The following checks are **mandatory**:
- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
---------
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
(cherry picked from commit 2adb5fe014)
### Describe Your Changes
optimize for
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6226
for user who set `*AuthKey` flag, they will receive new response in
body:
```go
// query arg not set
The provided authKey '' doesn't match -search.resetCacheAuthKey
// incorrect query arg
The provided authKey '5dxd71hsz==' doesn't match -search.resetCacheAuthKey
```
previously, they receive:
```
The provided authKey doesn't match -search.resetCacheAuthKey
```
### Checklist
The following checks are **mandatory**:
- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
(cherry picked from commit 1f0b03aebe)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Commit eef6943084 added new test
functions. Which checks various cases for metricName registration at
data ingestion.
Initial dataset size had 4 batches with 100 rows each. It works fine at
machines with 5GB+ memory.
But i386 architecture supports only 4GB of memory per process.
Due to given limitations, batch size should be reduced to 3 batches and
30 rows. It keeps the same
test funtionality, but reduces overall memory usage to ~3GB.
Signed-off-by: f41gh7 <nik@victoriametrics.com>
(cherry picked from commit 277fdd1070)
Samples parsing is a hot path. Bad client could easily overwhelm
receiver with bad or unsupported data. So it is better to throttle such
messages.
Follow-up after
b26a68641c
### Describe Your Changes
Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.
### Checklist
The following checks are **mandatory**:
- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit d290efb849)
This reduces memory usage by up to 2x for the map used for tracking hits.
This also reduces CPU usage for tracking integer fields.
(cherry picked from commit cc29692e27)
Process values in batches instead of passing every value in the callback.
This improves performance of reading the encoded values from storage by up to 50%.
(cherry picked from commit f018aa33cb)
New version has additional checks and reduced resource consumption, so
it doesn't timeout for our internal repos.
To make linter happy, I addressed "redefinition of the built-in
function" lint error.
----
Signed-off-by: hagen1778 <roman@victoriametrics.com>
- url encoding / decoding with <urlencode:field> and <urldecode:field>
- base64 encoding / decoding with <base64encode:field> and <base64decode:field>
- hex encoding / decoding with <hexencode:field> and <hexdecode:field>
- hex encoding for integers with <hexnumencode:field> and <hexnumdecode:field>
This allows reducing the state of every statsProcessor by removing pointer to the corresponding statsFunc.
For example, this reduces statsCountProcessor size by 2x.
Previoysly finalizeStats() for some functions such as count_uniq() could run for long periods
of time after the query is canceled, since stopCh wan't propagated to finalizeStats().
Prevsiously integer values were converted to strings before being passed to `updateState()` function at `count_uniq`
and `count_uniq_hash`. Later such values are converted back to integers in order to track them via integer map of unique values.
This commit avoids the int -> string -> int conversion. Instead, it passes integers directly to the integer map of unique values.
This improves performance of `count_uniq` and `count_uniq_hash` functions even further.
This filter can be used when debugging and exploring logs in order to understand better
which value types are used for storing the particular log fields.
The `value_type` filter complements `block_stats` pipe.
- Pass the calculated results to the next pipe in float64 columns.
Previously the results were converted to string columns. This could slow down further calculations.
- Use custom optimized logic for processing numeric columns, which are passed to math pipe.
Previously all the input columns were converted to string and then converted to float64
before math pipe calculations.
- Initialize the newly added columns at blockResult as soon as they are added.
This improves performance when big number of columns are calculated by math pipe.
Previously integer values were tracked in string maps. Now every input value is parsed as integer.
On success the parsed integer is tracked via specialized maps, which hold only integers.
This reduces CPU usage and memory usage in general case.
Use the column name attached to the corresponding part. The lifetime of this column name exceed the blockSearch lifetime,
so it is safe using it here.
This is a follow-up for 8d968acd0a
Previously columns with negative int64 values were stored either as float64 or string
depending on whether the negative int64 values are bigger or smaller than -2^53.
If the integer values are smaller than -2^53, then they are stored as string, since float64 cannot
hold such values without precision loss. Now such values are stored as int64.
This should improve compression ratio and query performance over columns with negative int64 values.
Previously field values could be automatically converted to float64 with precision loss.
This could lead to unexpected results when querying such field values.
For example, "10007199254740992" was incorrectly represented as 10007199254740993.
This commit prevents from such lossy conversions when storing field values.
While at it, prevent from int64 overflow at tryParseBytes and tryParseDuration functions,
which are used for parsing constants in queries for byte sizes and durations.
Now these functions return 1<<63-1 (the maximum int64 value) for constants exceeding
this value. Previously they could return arbitrary garbage for such constants.
Hint allows to choose type of cache to be used for index search:
- in-memory parts are storing recently ingested samples and should use
main cache. This improves ingestion speed and cache hit ration for
queries accessing recently ingested samples.
- merges of file parts is performed in background, using a separate
cache allows avoiding pollution of the main cache with irrelevant
entries.
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7182
---------
Signed-off-by: f41gh7 <nik@victoriametrics.com>
(cherry picked from commit e9f86af7f5)
This commit makes configurable interval for checking if final dedup
process for the historical data should be started. It allows to spread
resource utilisation for multiple vmstorage/vmsingle instances in time.
Since final dedup may add additional preasure on disk, backup systems
and make cluster less stable. Storage unconditionally adds 25% jitter to
the provided value, it should simplify configuration management at
Kubernetes ecosystem. Because Kubernetes application pods must have the
same configuration.
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7880
---------
Signed-off-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
(cherry picked from commit 9ada784983)
### Describe Your Changes
fix function name in comment
### Checklist
The following checks are **mandatory**:
- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
Signed-off-by: cuiweiyuan <cuiweiyuan@aliyun.com>
(cherry picked from commit d064e14933)
previously vmstorage ignored limit values from vmselect component.
This behavior is prohibited starting from v1.105.0, with
85f60237e2.
This breaks the original intent of the -search.maxUniqueTimeseries command-line flag, which has been added at vmselect nodes in the commit b843f0e : to be able to override the default limit at vmstorage on the number of unique time series, at different subsets of vmselect nodes.
The behavior should be the following:
* If -search.maxUniqueTimeseries command-line flag isn't set at both vmselect and vmstorage nodes, then the limit on the number of unique time series must be automatically detected at vmstorage nodes according to
* vmstorage: automatically adjust -search.maxUniqueTimeseries max value . This simplifies configuration of VictoriaMetrics cluster for the typical case.
* If -search.maxUniqueTimeseries command-line flag is explicitly set at vmstorage node, then it must be used as the limit on the number of unique time series, without automatic detection of the limit. Explicitly set limit at vmstorage node cannot be exceeded by the limit from vmselect nodes.
* If the -search.maxUniqueTimeseries command-line flag is explicitly set at vmselect node, then it must override the automatically detected limit at vmstorage node. For example, if vmselect node provides the limit, which exceeds the automatically detected limit at vmstorage node, then the limit from the vmselect node must be applied during query execution at vmstorage node. This will allow properly executing queries from the subset of vmselect nodes for reporting queries described above.
related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7852
### Describe Your Changes
Currently if multiple msgFields are present in a log row it's not
obvious which field is selected as a _msg field. With this PR and order
of msgfield values defined either via headers or query arg params
defines a priority of these values
### Checklist
The following checks are **mandatory**:
- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7761
### Describe Your Changes
- datadog /api/v2/logs api supports message field in json format, which
is not documented and is used by serverless extension. This PR allows
message field to be both string and object type. Also added support of
not documented timestamp field
- added `-datadog.streamFields` and `-datadog.ignoreFields` flags to
configure default stream fields for datadog logs, where there's no
alternative option to pass extra headers and query args
- added ingest `max` and `min` values of data, which are ingested using
`datadogsketches` API, which is also actively used by serverless
extensions
- use default `.` separator instead of `_` for sketches metric names
until metrics are not sanitized
This should prevent from excess usage of CPU, RAM and other resources when too many logs
are passed to 'stream_context' pipe.
It is expected that 'stream_context' pipe results are investigated by humans, who cannot inspect
surrounding logs for millions of initial logs. That's why it is OK to limit the number of logs
and/or log streams, which can be passed to 'stream_context' pipe.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7766
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7903
While at it, reduce memory allocations at Storage.getFieldValuesNoHits and make it more scalable on multi-CPU systems.
This improves performance of in(<query>) filter when the <query> returns big number of values.
Use chunked allocator in order to reduce memory allocations. It allocates objects from slices of up to 64Kb size.
This improves performance for `stats` and `top` pipes by up to 2x when they are applied to big number of `by (...)` groups.
Also parallelize execution of `count_uniq`, `count_uniq_hash` and `uniq_values` stats functions,
so they are executed faster on hosts with many CPU cores when applied to fields with big number
of unique values.
The commit 4599429f51 improperly set br.cs to nil,
while it should set br.bs to nil instead. This resulted in excess memory allocations
at br.csInit() and br.csInitFast().
Historically some of VictoriaMetrics components were optimized for the low rate of memory allocations.
These are: vmagent, single-node VictoriaMetrics and vmstorage. These components benefit from the low
GOGC value, since this allow reducing their memory usage in steady state on typical workloads.
Other VictoriaMetrics components aren't optimized for the reduced rate of memory allocations.
This results in the increased CPU usage spent on garbage collection (GC) in these components,
since it must be triggered at higher rate. See https://tip.golang.org/doc/gc-guide#GOGC for details.
These components do not use too much memory, so it is OK increasing the GOGC for these components
from 30 to 100 - this won't affect the most users.
Keep GOGC to 30 only for vmagent, single-node VictoriaMetrics and vmstorage components.
See 077193d87c and 54b9e1d3cb .
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7902
Numeric fields can be stored as const values in the block of logs. In this case the `sort` pipe
was incorrectly comparing such values as strings instead of numbers. This results in incorrect
sort results. For example, 123 was smaller than 2. Fix this by removing the incorrect case
for comparing const fields.
While at it, replace lessString() with strings.LessNatural() in the sortBlockLess.
This improves sorting performance a bit, since the sortBlockLess function already tried
comparing numeric values, and it doesn't need to spend CPU time on such a comparison again inside lessString() call.
The commit 42c9183281 wasn't correct by replacing strings.LessNatural() with lessString()
inside the sortBlockLess() function.
While at at, allow passing an array of string values per each JSON entry at extra_filters and extra_stream_filters.
For example, `extra_filters={"foo":["bar","baz"]}` is converted into `foo:in("bar", "baz")` extra filter,
while `extra_stream_fitlers={"foo":["bar","baz"]}` is converted into `{foo=~"bar|baz"}` extra filter.
This should simplify creating faceted search when multiple values per a single log field must be selected.
This is needed for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7365#issuecomment-2447964259
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5542
Such log fields do not give any useful information during logs' exploration.
They just clutter the output of the `facets` pipe. So it is better to drop such fields by default.
If these fields are needed, then `keep_const_fields` option can be added to `facets` pipe.
Previously service labels won't be attached when `role: tasks` is set.
Because the `addServicesLabels` function is shared by `role: tasks` and
`role: services`, and it will return nothing when `vip.Addr` is invalid
or empty.
In Prometheus, even if `vip.Addr` is empty, it attach common service
labels with [a standalone
function](f10c3454e9/discovery/moby/services.go (L129)),
which offers:
- `__meta_dockerswarm_service_id`: the id of the service.
- `__meta_dockerswarm_service_name`: the name of the service.
- `__meta_dockerswarm_service_mode`: the mode of the service.
- `__meta_dockerswarm_service_label_<labelname>`: each label of the
service, with any unsupported characters converted to an underscore.
This PR add a `addServicesLabelsForTask`, to replace the usage of
`addServicesLabels` when `role: tasks` is set. This function offers
common service labels listed above.
related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7800
Previously cluster with the following vmselect configuration:
./bin/vmselect
-storageNode=gr1/:8211,gr1/:8212
-storageNode=gr2/:8213,gr2/:8214
-search.skipSlowReplicas=true
-globalReplicationFactor=2
Here we have two vmstorage groups and -globalReplicationFactor=2, which effectively means that "every ingested sample is replicated across multiple vmstorage groups". Hence, gr1 and gr2 contain identical data set. And when we set -search.skipSlowReplicas=true it is expected vmselect should return result as soon as at least one storage group returned the full result.
In current state, -search.skipSlowReplicas is ignored on the storage group level. It is only respected within the group (with -replicationFactor flag).
This commit fixes global replication for skipSlowReplicas.
To ensure that the fix works and does not break
anything replication tests have been added. For checking the fix for
skipping slow replicas see `testGroupSkipSlowReplicas()`.
To emulate storage groups, the integration test creates a cluster with
multilevel vminsert. The L1 inserts are group-level inserts, each writes
to its own group of vmstorages. The L2 vminsert is a global vminsert
that writes replicated to the L1 vminserts.
To enable multilevel inserts changes in apptest framework and
`lib/ingestserver/clusternative/server.go` were necessary.
related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6924
---------
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Previously, during de-duplication staleness markers could be removed due to incorrect logic at
values equality check.
During the evaluation of read query vmselect deduplicates samples using dedupInterval option. It picks the highest value across all points with the same timestamp next to the border of dedupInterval. The issue is any comparison with NaN via <, > returns false. This means that the position of NaN in srcValues could affect the result.
This commit changes this logic with additional step, that explicitly checks for staleness marker for the following cases:
1. Deduplication on vmselect
2. Deduplication in vmstorage during merges
3. Deduplication in stream aggregation
check performed only for stale markers, because other NaNs are rejected on ingestion
by vmstorage or by stream aggregation.
Checking for stale markers in general slows down dedup speed by 3%:
```
benchstat old.txt new.txt
goos: darwin
goarch: arm64
pkg: github.com/VictoriaMetrics/VictoriaMetrics/lib/storage
cpu: Apple M4 Pro
│ old.txt │ new.txt │
│ sec/op │ sec/op vs base │
DeduplicateSamples/minScrapeInterval=1s-14 462.8n ± ∞ ¹ 425.2n ± ∞ ¹ ~ (p=1.000 n=1) ²
DeduplicateSamples/minScrapeInterval=2s-14 905.6n ± ∞ ¹ 903.3n ± ∞ ¹ ~ (p=1.000 n=1) ²
DeduplicateSamples/minScrapeInterval=5s-14 710.0n ± ∞ ¹ 698.9n ± ∞ ¹ ~ (p=1.000 n=1) ²
DeduplicateSamples/minScrapeInterval=10s-14 632.7n ± ∞ ¹ 638.5n ± ∞ ¹ ~ (p=1.000 n=1) ²
DeduplicateSamplesDuringMerge/minScrapeInterval=1s-14 439.7n ± ∞ ¹ 409.9n ± ∞ ¹ ~ (p=1.000 n=1) ²
DeduplicateSamplesDuringMerge/minScrapeInterval=2s-14 908.9n ± ∞ ¹ 882.2n ± ∞ ¹ ~ (p=1.000 n=1) ²
DeduplicateSamplesDuringMerge/minScrapeInterval=5s-14 721.2n ± ∞ ¹ 684.7n ± ∞ ¹ ~ (p=1.000 n=1) ²
DeduplicateSamplesDuringMerge/minScrapeInterval=10s-14 659.1n ± ∞ ¹ 630.6n ± ∞ ¹ ~ (p=1.000 n=1) ²
geomean 659.5n 636.0n -3.56%
```
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7674
---------
Co-authored-by: hagen1778 <roman@victoriametrics.com>
### Describe Your Changes
rfc5424 doesn't allow structured data to be started from whitespace, but
it can be present in the end of this section
related issue
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7776
### Checklist
The following checks are **mandatory**:
- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
---------
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
(cherry picked from commit e0ab3fccaf)
Changed enabled limit condition to `or` instead of `and`. Since labels must checked if at least one of the limits is defined.
Signed-off-by: f41gh7 <nik@victoriametrics.com>
Previously, time series with labels exceeding the configured limits were truncated and written to storage, potentially causing data inconsistency. This could lead to collisions between time series and make it difficult to identify the source due to truncated labels.
This commit changes the behavior:
* Such time series are now rejected outright.
* Rejected time series are logged to stdout, and corresponding counters are incremented.
* removes `vm_too_long_label_values_total`, `vm_too_long_label_names_total`, `vm_metrics_with_dropped_labels_total` metrics.
* adds new values `[too_many_labels,too_long_label_name,too_long_label_value]` to `reason` label of the `vm_rows_ignored_total` metric name
related issues:
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6928
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7661
This commit aligns behaviour of docker service discovery with Prometheus implementation.
It adds the following changes:
* introduce new config param `match_first_network` with default value of `true`. It uses the first network if the container has multiple networks
defined. It should help to avoid collecting duplicate targets error with multi network setups.
* add `networks` for the containers with linked network to the other containers with `network_mode: container:id` setting. It resolve an issue with attached containers aka `pods` in Kubernetes.
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7398
This function calculates the number of unique value hashes. This number is a good approximation
for the number of unique values. The `count_uniq_hash` function uses less memory and works faster
than `count_uniq` when applied to fields with big number of unique values.
The panic may occur when the surrounding logs for some original log entry are empty.
This is possible when these logs were included into surrounding logs for the previous original log entry.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7762
(cherry picked from commit 843fae3419)
The following patterns are detected:
- `<N>-<N>-<N>-<N>-<N>` is replaced with `<UUID>`.
- `<N>.<N>.<N>.<N>` is replaced with `<IP4>`.
- `<N>:<N>:<N>` is replaced with `<TIME>`. Optional fractional seconds after the time are treated as a part of `<TIME>`.
- `<N>-<N>-<N>` and `<N>/<N>/<N>` is replaced with `<DATE>`.
- `<N>-<N>-<N>T<N>:<N>:<N>` and `<N>-<N>-<N> <N>:<N>:<N>` is replaced with `<DATETIME>`. Optional timezone after the datetime is treated as a part of `<DATETIME>`.
(cherry picked from commit db961f8609)
This is useful for detecting patterns across log messages, which differ by various numeric fields,
with the following query:
_time:1h | collapse_nums | top 10 by (_msg)
(cherry picked from commit 65d831a0ee)
The max_value_len query arg allows controlling the maximum length of values
per every log field. If the length is exceeded, then the log field is dropped
from the results, since it contains incomplete (misleading) set of most frequently seen field values.
(cherry picked from commit 48540ac409)
It is impossible to count all the empty value per every seen field,
since they aren't counted for data blocks, which do not contain the given field.
So it is better ignoring empty values in order to reduce the level of confusion
when users see incorrect hits for empty per-field values.
(cherry picked from commit 3cef820cba)
It is very confusing to see incomplete set of values for fields, which contain a subset of short values,
while the rest of values are too long. It is better to ignore all the values in such fields.
It is also very confusing if the list of most frequently values has no an empty value.
So it is better counting hits for an empty value.
(cherry picked from commit b4f3861690)
`stream_context` is implemented in the way, which needs scanning all the logs for the selected log streams.
The scan performance is usually fast, since the majority of blocks are skipped, since they do not contain
rows with the needed timestamps. But there was a pathological case with `stream_context before N`:
VictoriaLogs usually scans blocks in chronological order. That means that the `before` context logs are constantly
updated with the new logs. This requires reading the actual data for the requested log fields from disk.
The workaround is to split the process of obtaining stream context logs into two phases:
1. Select only timestamps for the stream context logs, whithout selecting other log fields.
This operation is usually much faster than reading the requested log fields.
2. Select stream context logs for the selected timestamps. This operation is usually fast,
since the requested number of context logs is usually not so big.
Performance testing for the new algorithm shows up to 30x speed improvement for `stream_context before N`
and up to 5x speed improvement for `stream_context after N` when applied to log stream with 50M logs.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7637
(cherry picked from commit bddb0e369f)
This endpoint returns the most frequent values per each field seen in the selected logs.
This endpoint is going to be used by VictoriaLogs web UI for faceted search.
(cherry picked from commit 740548ccfc)
Previously such expressions were improperly formatted, which could result
in incorrect calculations at vlogscli.
For example, 'x / (y / z)' was formatted as 'x / y / z',
while 'x - (y + z)' was formatted as 'x - y + z'.
(cherry picked from commit 80c5066ef3)
`offset` and `limit` pipes cannot be applied individually per every step on the [start ... end] time range,
so they must be disallowed at /select/logsql/stats_query_range.
This is a follow-up for 534371031e
The `first N by (field)` pipe is a shorthand to `sort by (field) limit N`,
while the `last N by (field)` pipe is a shorthand to `sort by (field) desc limit N`.
While at it, add support for partitioning sort results by log groups and applying
individual limit per each group.
For example, the following query returns up to 3 logs per each host with the biggest value
for the `request_duration` field:
_time:5m | last 3 by (request_duration) partition by (host)
This query is equivalent to the following one:
_time:5m | sort by (request_duration) desc limit 3 partition by (host)
Automatically add the 'partition by (_time)` into `sort`, `first` and `last` pipes
used in the query to `/select/logsql/stats_query_range` API.
This is needed for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7699
Previously streamFields were unconditionally added to log stream fields, even if they were listed in the ignoreFields.
Also do not add extraStreamFields to log stream fields if streamFields is non-nil, since this may confuse users.
This is a follow-up for 17b813ba28
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7554
The storage isn't designed to work efficiently with logs containing too many log fields.
It is better to emit a warning to the user and ignore such logs instead of trying to store them.
This will allow fixing the issue by the user ASAP, and won't lead to excess resource usage
at VictoriaLogs side, such as RAM, CPU, disk IO and disk space.
While at it, ignore too long logs with the size exceeding the maximum block size during data ingestion.
This should prevent from possible issues when dealing with such long logs if they were stored in the storage.
Emit a warning in this case, so the user could identify and fix the issue ASAP.
This is a follow-up for 22e6385f56
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7568
Previously Block columns wasn't properly limited by maxColumnsPerBlock.
And it was possible a case, when more columns per block added than
expected.
For example, if ingested log stream has many unuqie fields
and it's sum exceed maxColumnsPerBlock.
We only enforce fieldsPerBlock limit during row parsing, which limits
isn't enough to mitigate this issue. Also it
would be very expensive to apply maxColumnsPerBlock limit during
ingestion, since it requires to track all possible field tags
combinations.
This commit adds check for maxColumnsPerBlock limit during
MustInitFromRows function call. And it returns offset of the rows and
timestamps added to the block.
Function caller must create another block and ingest remaining rows
into it.
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7568
### Describe Your Changes
Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.
### Checklist
The following checks are **mandatory**:
- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
---------
Signed-off-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>