Commit 68791f9ccc introduced a regression.
It performed the basicAuth check before the built-in routes, which made it
impossible to bypass basic authorization with the `authKey` param.
This commit fixes that issue and removes the unneeded check. It also adds
integration tests for this case.
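A minimal sketch of the fixed ordering, with illustrative handler and key names (the real logic lives in lib/httpserver and is more involved):

```go
package main

import "net/http"

// checkAuth sketches the fixed order: a request that carries a valid
// route-specific authKey must be served even when basicAuth credentials
// are configured, so authKey is checked before falling back to basicAuth.
func checkAuth(w http.ResponseWriter, r *http.Request) bool {
	if authKeyMatches(r) {
		return true // e.g. a valid -reloadAuthKey bypasses basicAuth
	}
	return checkBasicAuth(w, r)
}

func authKeyMatches(r *http.Request) bool {
	return r.FormValue("authKey") == "secret" // placeholder comparison
}

func checkBasicAuth(w http.ResponseWriter, r *http.Request) bool {
	if _, _, ok := r.BasicAuth(); ok {
		return true // placeholder: real code compares against -httpAuth.*
	}
	w.Header().Set("WWW-Authenticate", `Basic realm="restricted"`)
	http.Error(w, "unauthorized", http.StatusUnauthorized)
	return false
}

func main() {}
```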
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7345
---------
Signed-off-by: f41gh7 <nik@victoriametrics.com>
This eliminates a class of potential bugs with incorrect stats calculations when an additional filter
is applied to the blockResult before passing it to the stats function, and this filter removes
all the rows from blockResult.
### Describe Your Changes
Allow disabling the per-day index using the `-disablePerDayIndex` flag.
This should significantly improve the ingestion rate and decrease the
disk space usage for the use cases that assume small or no churn rate.
See the docs added to `docs/README.md` for details.
Both improvements are due to no data being written to the per-day index.
Benchmark results:
```shell
rm -Rf ./lib/storage/Benchmark*; go test ./lib/storage -run=NONE -bench=BenchmarkStorageInsertWithAndWithoutPerDayIndex --loggerLevel=ERROR
goos: linux
goarch: amd64
pkg: github.com/VictoriaMetrics/VictoriaMetrics/lib/storage
cpu: 13th Gen Intel(R) Core(TM) i7-1355U
BenchmarkStorageInsertWithAndWithoutPerDayIndex/HighChurnRate/perDayIndexes-12 1 3850268120 ns/op 39.56 data-MiB 28.20 indexdb-MiB 259722 rows/s
BenchmarkStorageInsertWithAndWithoutPerDayIndex/HighChurnRate/noPerDayIndexes-12 1 2916865725 ns/op 39.57 data-MiB 25.73 indexdb-MiB 342834 rows/s
BenchmarkStorageInsertWithAndWithoutPerDayIndex/NoChurnRate/perDayIndexes-12 1 2218073474 ns/op 9.772 data-MiB 13.73 indexdb-MiB 450842 rows/s
BenchmarkStorageInsertWithAndWithoutPerDayIndex/NoChurnRate/noPerDayIndexes-12 1 1295140898 ns/op 9.771 data-MiB 0.3566 indexdb-MiB 772119 rows/s
PASS
ok github.com/VictoriaMetrics/VictoriaMetrics/lib/storage 11.421s
```
Signed-off-by: Artem Fetishev <wwctrsrx@gmail.com>
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
The MustOpenStorage function may accept a variable number of optional
arguments. This commit combines the optional arguments into a dedicated OpenOptions
struct, which reduces the complexity of adding new optional arguments.
Related PR:
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8118
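A minimal sketch of the options-struct pattern (the field names are illustrative; the real set lives in lib/storage):

```go
package main

import (
	"fmt"
	"time"
)

// OpenOptions groups the formerly variadic optional arguments of
// MustOpenStorage. Adding a new optional argument now means adding a
// struct field, without touching every call site.
type OpenOptions struct {
	Retention       time.Duration
	MaxHourlySeries int
	MaxDailySeries  int
}

type Storage struct{ path string }

func MustOpenStorage(path string, opts OpenOptions) *Storage {
	fmt.Printf("opening %s, retention=%s\n", path, opts.Retention)
	return &Storage{path: path}
}

func main() {
	_ = MustOpenStorage("/storage", OpenOptions{Retention: 31 * 24 * time.Hour})
}
```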
1. Use distinct code paths for blockResult.getValues() and blockResult.getValuesBucketed().
This should simplify debugging and maintenance of the resulting code.
2. Do not load column values if all the values in the block fit the same bucket.
Use blockResultColumn.minValue and blockResultColumn.maxValue for determining whether
column values must be loaded via blockResultColumn.getValuesEncoded().
This significantly improves performance for big buckets, which cover all the column
values in a block.
3. Properly calculate buckets for negative values.
4. Properly adjust weekly buckets by Monday.
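A minimal Go sketch of points 3 and 4 above (illustrative, not the actual blockResult code):

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// bucketStart returns the start of the bucket containing v. Using
// math.Floor instead of integer truncation keeps negative values in the
// correct bucket: with bucketSize=10, v=-3 belongs to [-10, 0), not [0, 10).
func bucketStart(v, bucketSize float64) float64 {
	return math.Floor(v/bucketSize) * bucketSize
}

// weekStart truncates t to the beginning of its week with Monday as the
// first day (time.Weekday numbers Sunday as 0).
func weekStart(t time.Time) time.Time {
	offsetDays := (int(t.Weekday()) + 6) % 7
	day := t.Truncate(24 * time.Hour)
	return day.AddDate(0, 0, -offsetDays)
}

func main() {
	fmt.Println(bucketStart(-3, 10))                                      // -10, not 0
	fmt.Println(weekStart(time.Date(2025, 1, 22, 15, 0, 0, 0, time.UTC))) // 2025-01-20, a Monday
}
```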
### Describe Your Changes
Fix a typo in the description of the `-pprofAuthKey` flag.
This improves performance for queries that use `sort by (...) limit N` without mentioning the _time field.
For example, the following query should work faster now:
_time:1d | rm _time | sort by (request_duration desc) limit 10
- Add a fast path for timestamps ending with 'Z'
- Use strings.LastIndexAny instead of strings.IndexAny for searching
for timezone offset at the end of the string. This works faster
for timestamps with sub-second precision.
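A minimal sketch of both optimizations (not the actual parser, which must additionally verify that a found '-' is a timezone offset and not a date separator):

```go
package main

import (
	"fmt"
	"strings"
)

// tzStart locates the timezone suffix of an RFC3339 timestamp.
// UTC timestamps end with 'Z' and take the fast path. Otherwise the
// offset ('+hh:mm' or '-hh:mm') sits near the end of the string, so
// scanning backwards with LastIndexAny is faster than IndexAny for
// timestamps with long sub-second fractions.
func tzStart(s string) int {
	if strings.HasSuffix(s, "Z") {
		return len(s) - 1
	}
	return strings.LastIndexAny(s, "+-")
}

func main() {
	fmt.Println(tzStart("2025-01-20T10:20:30.123456789Z"))      // fast path
	fmt.Println(tzStart("2025-01-20T10:20:30.123456789+02:00")) // backward scan
}
```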
Update the pipe state only once per each series of repeated strings, uint8 values and tuples.
This improves performance a bit for the following `top` pipes:
- top (string_field)
- top (uint8_field)
- top (field1, ..., fieldN)
Do not apply the optimization to uint16, uint32, uint64 and int64 fields, since they
usually contain a big number of unique values, which do not repeat most of the time.
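A minimal Go sketch of the run-based update (illustrative, not the actual pipe state code):

```go
package main

import "fmt"

// updateByRuns updates the state once per run of equal adjacent values
// instead of once per row. This pays off for string and uint8 columns,
// which often contain long runs of repeated values.
func updateByRuns(values []string, state map[string]uint64) {
	i := 0
	for i < len(values) {
		j := i + 1
		for j < len(values) && values[j] == values[i] {
			j++
		}
		state[values[i]] += uint64(j - i) // one update covers j-i rows
		i = j
	}
}

func main() {
	state := make(map[string]uint64)
	updateByRuns([]string{"a", "a", "a", "b", "b", "a"}, state)
	fmt.Println(state) // map[a:4 b:2]
}
```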
Previously RFC3339 timestamps with sub-second precision could be compared incorrectly by lessString().
For example, 2025-01-20T10:20:30.1Z was incorrectly treated as smaller than 2025-01-20T10:20:30.09Z,
because the digits after the last dot were compared as plain numbers (1 < 9), even though .1 denotes a bigger fraction than .09.
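A minimal sketch of the correct comparison for fractional parts sharing the same second-level prefix (the real fix lives in lessString()):

```go
package main

import "fmt"

// lessFractional compares fractional-second digit strings left-aligned:
// ".1" is 100ms while ".09" is 90ms. Padding the shorter fraction with
// trailing zeros makes a plain string comparison correct.
func lessFractional(a, b string) bool {
	for len(a) < len(b) {
		a += "0"
	}
	for len(b) < len(a) {
		b += "0"
	}
	return a < b
}

func main() {
	fmt.Println(lessFractional("1", "09")) // false: .1 (100ms) > .09 (90ms)
}
```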
Split unique values (groups) into shards according to the configured concurrency
during processing of the matching rows if the number of unique values exceeds the hardcoded threshold.
Previously this splitting was performed unconditionally at the merge stage when merging independently
calculated per-CPU states into a single state. It is faster to perform the split during rows processing
if the number of unique values is big.
This gives up to 30% performance improvement when these pipes are applied to a big number of unique values (groups).
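A minimal sketch of the hash-based sharding described above (the threshold logic and shard types in the actual pipes differ):

```go
package main

import (
	"fmt"

	"github.com/cespare/xxhash/v2"
)

// shardForKey routes a unique value to one of `concurrency` shards by
// hash, so big group sets can be split while rows are still being
// processed instead of at the merge stage.
func shardForKey(key string, concurrency int) int {
	return int(xxhash.Sum64String(key) % uint64(concurrency))
}

func main() {
	shards := make([]map[string]struct{}, 4)
	for i := range shards {
		shards[i] = make(map[string]struct{})
	}
	for _, key := range []string{"user-1", "user-2", "user-3"} {
		shards[shardForKey(key, len(shards))][key] = struct{}{}
	}
	fmt.Println(shards)
}
```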
The number of worker shards per pipe processor is set during query initialization.
It equals `options(concurrency=N)` if this option is set, or the number of available CPU cores otherwise.
This means that all the pipes must adhere to the given concurrency when passing data blocks
to the next pipe.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8201
The bug has been introduced in 0214aa328e
vmauth uses 'lib/httpserver' for serving HTTP requests. This server
unconditionally defines built-in routes (such as '/metrics',
'/health', etc), which makes it impossible to proxy HTTP requests to backends
that expose the same routes: vmauth's httpserver matches the built-in route
and returns a local response.
This commit adds a new flag, `httpInternalListenAddr`, with an empty
default value. When set, it removes the internal API routes from the public
router and exposes them on a separate HTTP server.
For example, the following configuration disables the internal routes at `0.0.0.0:8427` and serves them at `127.0.0.1:8426`:
`./bin/vmauth --auth.config=config.yaml --httpListenAddr=:8427 --httpInternalListenAddr=127.0.0.1:8426`
Related issues:
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6468
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7345
### Describe Your Changes
The cardinality limiter in this case does not receive the actual
metricID but some other value found in r.TSID.MetricID, which is not
initialized. Depending on the system and/or Go runtime implementation,
this value can be 0 or some garbage value (which shouldn't have too wide
a range). Thus, there is basically no limit on inserted metricIDs.
---------
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
At this time `bufferedwriter` [silently ignores connection close
errors](78eaa056c0/lib/bufferedwriter/bufferedwriter.go (L67)).
It may be very convenient in some situations (to avoid logging such
unimportant errors), but it's too implicit and unsafe for the others.
For example, if you close an [export
API](https://docs.victoriametrics.com/#how-to-export-time-series) client
connection in the middle of communication, VictoriaMetrics won't notice
it and will keep hogging CPU by exporting all the data into nowhere
until it processes all of them. A few retries make this effectively
a DoS on the server.
This commit replaces this implicit error suppressing with explicit error
handling which fixes the issue with export API.
Issue was introduced at e78f3ac8ac
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7988
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
This should reduce the time needed for opening the storage with retentions exceeding a few months.
While at it, limit the concurrency of opening partitions in parallel to the number of available CPU cores,
since higher concurrency may increase RAM usage and CPU usage without performance improvements
if opening a single partition is a CPU-bound task.
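A minimal sketch of the bounded-concurrency approach (function and partition names are illustrative):

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// openPartitionsInParallel opens all partitions concurrently, but caps
// the concurrency at the number of CPU cores: extra goroutines only add
// RAM and scheduling overhead when opening a partition is CPU-bound.
func openPartitionsInParallel(partitions []string) {
	sem := make(chan struct{}, runtime.GOMAXPROCS(0))
	var wg sync.WaitGroup
	for _, pt := range partitions {
		wg.Add(1)
		go func(pt string) {
			defer wg.Done()
			sem <- struct{}{}
			defer func() { <-sem }()
			fmt.Println("opening partition", pt) // placeholder for the real open
		}(pt)
	}
	wg.Wait()
}

func main() {
	openPartitionsInParallel([]string{"2024_11", "2024_12", "2025_01"})
}
```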
This is a follow-up for 17988942ab
Use the testing.Testing() function in order to determine whether the code runs in test.
This allows running tests at fast speed without the need to specify the DISABLE_FSYNC_FOR_TESTING
environment variable.
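A minimal sketch of the approach, assuming an illustrative mustSyncPath helper (testing.Testing() is available since Go 1.21 and reports whether the binary was built by `go test`):

```go
package main

import (
	"fmt"
	"testing"
)

// mustSyncPath skips fsync when running under `go test`, without any
// environment variable.
func mustSyncPath(path string) {
	if testing.Testing() {
		return // tests run fast without fsync
	}
	fmt.Println("fsync", path) // placeholder for the real fsync call
}

func main() {
	mustSyncPath("/storage/data")
}
```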
This is a follow-up for the commit 334cd92a6c
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6871
The purpose of extra filters ( https://docs.victoriametrics.com/victorialogs/querying/#extra-filters )
is to limit the subset of logs, which can be queried. For example, it is expected that all the queries
with `extra_filters={tenant=123}` can access only logs, which contain `123` value for the `tenant` field.
Previously this wasn't the case, since the provided extra filters weren't applied to subqueries.
For example, the following query could be used to select all the logs outside `tenant=123`, for any `extra_filters` arg:
* | union({tenant!=123})
This commit fixes this by propagating extra filters to all the subqueries.
While at it, this commit also properly propagates [start, end] time range filter from HTTP querying APIs
into all the subqueries, since this is what most users expect. This behaviour can be overridden on a per-subquery
basis with the `options(ignore_global_time_filter=true)` option - see https://docs.victoriametrics.com/victorialogs/logsql/#query-options
Also properly apply optimizations across all the subqueries. Previously the optimizations at the Query.optimize()
function were applied only to the top-level query.
logger.Fatalf("BUG: ...") complicates investigating the bug, since it doesn't show the call stack,
which led to the bug. So it is better to consistently use logger.Panicf("BUG: ...") for logging programming bugs.
Previously, NewChild elements of querytracer could be referenced by concurrent
storageNode goroutines. After an early return (if search.skipSlowReplicas is set),
the tracer objects could still be in use by concurrent workers.
This may cause panics and data races. The most probable case is when the parent tracer is finished, but children
still write data to it via the Donef() method, triggering a read-write data race at trace
formatting.
This commit adds new methods to the querytracer package that allow creating
children not referenced by the parent and adding them to the parent later.
An orphaned child must be registered at the parent when the goroutine returns. This is done synchronously by the single caller via the finishQueryTracer call.
If a child hasn't finished its work and a reference to it is used by a concurrent goroutine, a new child must be created instead, with
a context message.
This prevents panics and possible data races.
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8114
---------
Signed-off-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
This is done via 'options(concurrency=N)' prefix for the query.
For example, the following query is executed on at most 4 CPU cores:
options(concurrency=4) _time:1d | count_uniq(user_id)
This allows reducing RAM and CPU usage at the cost of longer query execution times,
since by default every query is executed in parallel on all the available CPU cores.
See https://docs.victoriametrics.com/victorialogs/logsql/#query-options
Also always initialize Query.timestamp with the timestamp from the lexer.
This should avoid potential problems with relative timestamps inside inner queries.
For example, the `_time:1h` filter in the following query is correctly executed
relative to the current timestamp:
foo:in(_time:1h | keep foo)
Despite the requirement in the OpenTelemetry spec that histograms should contain
a sum, the [OpenTelemetry collector promremotewrite
translator](37c8044abf/pkg/translator/prometheusremotewrite/helper.go (L222))
and [Prometheus OpenTelemetry
parsing](d52e689a20/storage/remote/otlptranslator/prometheusremotewrite/helper.go (L264))
skip only the sum if it's absent. Our current implementation drops the buckets
if the sum is absent, which causes issues for users who expect behaviour
similar to Prometheus.
---------
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
### Describe Your Changes
Optimize for
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6226
Users who set a `*AuthKey` flag will now receive a new response in the
body:
```go
// query arg not set
The provided authKey '' doesn't match -search.resetCacheAuthKey
// incorrect query arg
The provided authKey '5dxd71hsz==' doesn't match -search.resetCacheAuthKey
```
Previously, they received:
```
The provided authKey doesn't match -search.resetCacheAuthKey
```
Commit eef6943084 added new test
functions that check various cases for metricName registration at
data ingestion.
The initial dataset size was 4 batches with 100 rows each. It works fine on
machines with 5GB+ memory,
but the i386 architecture supports only 4GB of memory per process.
Due to this limitation, the dataset is reduced to 3 batches with
30 rows each. This keeps the same
test functionality, but reduces overall memory usage to ~3GB.
Signed-off-by: f41gh7 <nik@victoriametrics.com>
Samples parsing is a hot path. A bad client could easily overwhelm the
receiver with bad or unsupported data, so it is better to throttle such
messages.
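A minimal sketch of message throttling (the actual code uses the project's logger helpers; the interval and message are illustrative):

```go
package main

import (
	"log"
	"sync/atomic"
	"time"
)

var lastLogNano atomic.Int64

// warnThrottled emits at most one message per interval, so a client
// sending a stream of bad samples can't flood the logs.
func warnThrottled(interval time.Duration, msg string) {
	now := time.Now().UnixNano()
	prev := lastLogNano.Load()
	if now-prev >= int64(interval) && lastLogNano.CompareAndSwap(prev, now) {
		log.Printf("WARN: %s", msg)
	}
}

func main() {
	for i := 0; i < 1000; i++ {
		warnThrottled(5*time.Second, "cannot parse sample")
	}
	// Only the first message within each 5s window is logged.
}
```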
Follow-up after
b26a68641c
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Process values in batches instead of passing every value in the callback.
This improves performance of reading the encoded values from storage by up to 50%.
The new version has additional checks and reduced resource consumption, so
it doesn't time out for our internal repos.
To make linter happy, I addressed "redefinition of the built-in
function" lint error.
----
Signed-off-by: hagen1778 <roman@victoriametrics.com>
- url encoding / decoding with <urlencode:field> and <urldecode:field>
- base64 encoding / decoding with <base64encode:field> and <base64decode:field>
- hex encoding / decoding with <hexencode:field> and <hexdecode:field>
- hex encoding for integers with <hexnumencode:field> and <hexnumdecode:field>
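For example, the following query (a usage sketch based on the `format` pipe syntax; the `user_agent` and `ua` field names are illustrative) stores the URL-encoded value of the `user_agent` field in the `ua` field:
_time:5m | format "<urlencode:user_agent>" as ua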
This allows reducing the state of every statsProcessor by removing the pointer to the corresponding statsFunc.
For example, this reduces statsCountProcessor size by 2x.
Previously finalizeStats() for some functions such as count_uniq() could run for long periods
of time after the query was canceled, since stopCh wasn't propagated to finalizeStats().
Previously integer values were converted to strings before being passed to the `updateState()` function at `count_uniq`
and `count_uniq_hash`. Later such values were converted back to integers in order to track them via the integer map of unique values.
This commit avoids the int -> string -> int conversion. Instead, it passes integers directly to the integer map of unique values.
This improves performance of `count_uniq` and `count_uniq_hash` functions even further.
This filter can be used when debugging and exploring logs in order to understand better
which value types are used for storing the particular log fields.
The `value_type` filter complements the `block_stats` pipe.
- Pass the calculated results to the next pipe in float64 columns.
Previously the results were converted to string columns. This could slow down further calculations.
- Use custom optimized logic for processing numeric columns, which are passed to math pipe.
Previously all the input columns were converted to string and then converted to float64
before math pipe calculations.
- Initialize the newly added columns at blockResult as soon as they are added.
This improves performance when a big number of columns is calculated by the math pipe.
Previously integer values were tracked in string maps. Now every input value is parsed as an integer.
On success, the parsed integer is tracked via specialized maps, which hold only integers.
This reduces CPU usage and memory usage in the general case.
Use the column name attached to the corresponding part. The lifetime of this column name exceeds the blockSearch lifetime,
so it is safe to use it here.
This is a follow-up for 8d968acd0a
Previously columns with negative int64 values were stored either as float64 or string
depending on whether the negative int64 values are bigger or smaller than -2^53.
If the integer values are smaller than -2^53, then they are stored as string, since float64 cannot
hold such values without precision loss. Now such values are stored as int64.
This should improve compression ratio and query performance over columns with negative int64 values.
Previously field values could be automatically converted to float64 with precision loss.
This could lead to unexpected results when querying such field values.
For example, "10007199254740992" was incorrectly represented as 10007199254740993.
This commit prevents from such lossy conversions when storing field values.
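A quick Go demonstration of the lossy conversion described above (float64 has a 53-bit mantissa, so odd integers above 2^53 round to a neighboring even value):

```go
package main

import "fmt"

func main() {
	v := int64(10007199254740993)
	fmt.Println(int64(float64(v))) // 10007199254740992
}
```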
While at it, prevent int64 overflow at the tryParseBytes and tryParseDuration functions,
which are used for parsing constants in queries for byte sizes and durations.
Now these functions return 1<<63-1 (the maximum int64 value) for constants exceeding
this value. Previously they could return arbitrary garbage for such constants.
The hint allows choosing the type of cache to be used for index search:
- in-memory parts store recently ingested samples and should use the
main cache. This improves ingestion speed and the cache hit ratio for
queries accessing recently ingested samples.
- merges of file parts are performed in the background; using a separate
cache allows avoiding pollution of the main cache with irrelevant
entries.
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7182
---------
Signed-off-by: f41gh7 <nik@victoriametrics.com>
This commit makes the interval for checking whether the final dedup
process for the historical data should be started configurable. It allows spreading
resource utilisation of multiple vmstorage/vmsingle instances in time,
since final dedup may add additional pressure on disks and backup systems
and make the cluster less stable. Storage unconditionally adds 25% jitter to
the provided value; this should simplify configuration management in the
Kubernetes ecosystem, because Kubernetes application pods must have the
same configuration.
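A minimal sketch of the jitter idea (a random extra delay of up to 25%, so pods sharing identical configuration still start final dedup at different times):

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// withJitter spreads dedup runs across replicas that share the same
// configured interval.
func withJitter(interval time.Duration) time.Duration {
	return interval + time.Duration(rand.Int63n(int64(interval/4)))
}

func main() {
	fmt.Println(withJitter(time.Hour)) // somewhere in [1h, 1h15m)
}
```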
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7880
---------
Signed-off-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
### Describe Your Changes
fix function name in comment
Signed-off-by: cuiweiyuan <cuiweiyuan@aliyun.com>
### Describe Your Changes
Currently, if multiple msgFields are present in a log row, it's not
obvious which field is selected as the _msg field. With this PR, the order
of msgField values, defined either via headers or query arg params,
defines the priority of these values.
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7761
### Describe Your Changes
- the datadog /api/v2/logs API supports a message field in JSON format, which
is not documented and is used by the serverless extension. This PR allows the
message field to be both string and object type. Also added support for the
undocumented timestamp field
- added `-datadog.streamFields` and `-datadog.ignoreFields` flags to
configure default stream fields for datadog logs, where there's no
alternative option to pass extra headers and query args
- added ingestion of `max` and `min` values of data ingested via the
`datadogsketches` API, which is also actively used by serverless
extensions
- use the default `.` separator instead of `_` for sketches metric names
unless metric names are sanitized
This should prevent excess usage of CPU, RAM and other resources when too many logs
are passed to 'stream_context' pipe.
It is expected that 'stream_context' pipe results are investigated by humans, who cannot inspect
surrounding logs for millions of initial logs. That's why it is OK to limit the number of logs
and/or log streams, which can be passed to 'stream_context' pipe.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7766
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7903
While at it, reduce memory allocations at Storage.getFieldValuesNoHits and make it more scalable on multi-CPU systems.
This improves performance of in(<query>) filter when the <query> returns big number of values.
Use chunked allocator in order to reduce memory allocations. It allocates objects from slices of up to 64Kb size.
This improves performance for `stats` and `top` pipes by up to 2x when they are applied to big number of `by (...)` groups.
Also parallelize execution of `count_uniq`, `count_uniq_hash` and `uniq_values` stats functions,
so they are executed faster on hosts with many CPU cores when applied to fields with big number
of unique values.
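A minimal sketch of the chunked allocator described above (illustrative, not the actual implementation):

```go
package main

import "fmt"

// chunkedAllocator carves small objects out of 64KiB slabs so the
// garbage collector tracks one allocation per slab instead of one per
// object.
type chunkedAllocator struct {
	buf []byte
}

func (a *chunkedAllocator) alloc(n int) []byte {
	const chunkSize = 64 * 1024
	if n > chunkSize {
		return make([]byte, n) // big objects bypass the slab
	}
	if cap(a.buf)-len(a.buf) < n {
		a.buf = make([]byte, 0, chunkSize) // start a fresh slab
	}
	start := len(a.buf)
	a.buf = a.buf[:start+n]
	return a.buf[start : start+n : start+n] // full slice expr prevents overlap on append
}

func main() {
	var a chunkedAllocator
	b := a.alloc(16)
	fmt.Println(len(b), cap(b)) // 16 16
}
```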
The commit 4599429f51 improperly set br.cs to nil,
while it should set br.bs to nil instead. This resulted in excess memory allocations
at br.csInit() and br.csInitFast().
Historically some of the VictoriaMetrics components were optimized for a low rate of memory allocations.
These are: vmagent, single-node VictoriaMetrics and vmstorage. These components benefit from a low
GOGC value, since this allows reducing their memory usage in steady state on typical workloads.
Other VictoriaMetrics components aren't optimized for a reduced rate of memory allocations.
This results in increased CPU usage spent on garbage collection (GC) in these components,
since it must be triggered at a higher rate. See https://tip.golang.org/doc/gc-guide#GOGC for details.
These components do not use much memory, so it is OK to increase GOGC for them
from 30 to 100 - this won't affect most users.
Keep GOGC at 30 only for the vmagent, single-node VictoriaMetrics and vmstorage components.
See 077193d87c and 54b9e1d3cb .
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7902
Numeric fields can be stored as const values in a block of logs. In this case the `sort` pipe
incorrectly compared such values as strings instead of numbers, which resulted in incorrect
sort results: for example, 123 was treated as smaller than 2. Fix this by removing the incorrect case
for comparing const fields.
While at it, replace lessString() with strings.LessNatural() in sortBlockLess().
This improves sorting performance a bit, since sortBlockLess() already tried
comparing numeric values, so it doesn't need to spend CPU time on such a comparison again inside the lessString() call.
The commit 42c9183281 incorrectly replaced strings.LessNatural() with lessString()
inside the sortBlockLess() function.
While at it, allow passing an array of string values per each JSON entry at extra_filters and extra_stream_filters.
For example, `extra_filters={"foo":["bar","baz"]}` is converted into the `foo:in("bar", "baz")` extra filter,
while `extra_stream_filters={"foo":["bar","baz"]}` is converted into the `{foo=~"bar|baz"}` extra filter.
This should simplify creating faceted search when multiple values per a single log field must be selected.
This is needed for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7365#issuecomment-2447964259
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5542