The new metric gets increased each time `-search.logQueryMemoryUsage` memory limit
is exceeded by a query. This metric should help to identify expensive and heavy queries
without inspecting the logs.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* app/vmselect: limit the number of parallel workers by 32
The change should improve performance and memory usage during query processing
on machines with big number of CPU cores. The number of parallel workers for
query processing is controlled via `-search.maxWorkersPerQuery` command-line flag.
By default, the number of workers is limited by the number of available CPU cores,
but not more than 32. The limit can be increased via `-search.maxWorkersPerQuery`.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* wip
- The `-search.maxWorkersPerQuery` command-line flag doesn't limit resource usage,
so move it from the `resource usage limits` to `troubleshooting` chapter at docs/Single-server-VictoriaMetrics.md
- Make more clear the description for the `-search.maxWorkersPerQuery` command-line flag
- Add the description of `-search.maxWorkersPerQuery` to docs/Cluster-VictoriaMetrics.md
- Limit the maximum value, which can be passed to `-search.maxWorkersPerQuery`, to GOMAXPROCS,
because bigger values may worsen query performance and increase CPU usage
- Improve the the description of the change at docs/CHANGELOG.md. Mark it as FEATURE instead of BUGFIX,
since it is closer to a feature than to a bugfix.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5087
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
This can be useful in the following queries:
drop_empty_series(temperature <= 30) default 40
This query drops temperature series with all the values bigger than 30 on the selected time range,
while replacing gaps in the remaining series with 40.
The query without drop_empty_series:
(temperature <= 30) default 40
would leave all the temperature series with all the values bigger than 30 on the selected time range,
and replace all their values with 40. This is not what could be epxected in some cases
like here - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5071
- Reduce vertical space usage, so more information is available on the screen without the need to scroll.
- Show information for lines with higher values at the top of the legend under the graph.
This should simplify graph analysis when it contains many lines.
reduce lock contention for heavy aggregation requests
previously lock contetion may happen on machine with big number of CPU due to enabled string interning. sync.Map was a choke point for all aggregation requests.
Now instead of interning, new string is created. It may increase CPU and memory usage for some cases.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5087
`median_over_time` is handled by predefined WITH template in MetricsQL library which translates it to `quantile_over_time(0.5)`
This makes it impossble to use `median_over_time` as a usual rollup function for `aggr_over_time`.
See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5034
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
* Add button to prettify query
Just capitalizes query text for now
* Add /prettify-query API handler
* Replace UI pretiffier using prettifier API
* Add showing server errors
Had to pass setQueryErrors from useFetchQuery.ts
* Use serverUrl from global AppState
* Change icon to AutoAwsome icon + added style change color when button is active
* Add sync/await to prettifyQuery function
* Doc public function for lint
* Minor async fix
* Removed extra blank lines
* Extract usePrettifyQuery hook
* Made more generic style for :active button
* Refactor usePrettifyQuery
However, prettify errors don't clean up query errors, but should
* Add prettyQuery functionality to CHANGELOG.md
* Reuse queryErrors
* Unhide errors on start
---------
Co-authored-by: Tamara <toma.vashchuk@gmail.com>
This reverts commit 252643d100.
Reason for revert: the commit incorrectly fixes the the issue.
The `remoteAddr` must be properly quoted inside lib/httpserver.GetQuotedRemoteAddr().
It isn't quoted properly if the request contains X-Forwarded-For header.
The proper fix will be included in the follow-up commit.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4676
This eliminates the need in .(*T) casting for results obtained from Load()
Leave atomic.Value for map, since atomic.Pointer[map[...]...] makes double pointer to map,
because map is already a pointer type.
- Add `Active queries` chapter to VMUI docs
- Set `Content-Type: json` header inside promql.WriteActiveQueries() handler,
in order to be consistent with other request handlers called at app/vmselect/main.go
- Pass the request to promql.WriteActiveQueries() handler, so it can change its output
depending on the provided request params. This also improves consistency of
promql.WriteActiveQueries() args with other request hanlers at app/vmselect/main.go
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4653
* feat: add page to display a list of active queries (#4598)
* app/vmagent: code formatting
* fix: remove console
---------
Co-authored-by: dmitryk-dk <kozlovdmitriyy@gmail.com>
The `a op b keep_metric_names` is ambigouos to `a op (b keep_metric_names)` when `b` is a transform or rollup function.
For example, `a + rate(b) keep_metric_names`. So it is better to use more clear syntax: `(a op b) keep_metric_names`
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3710
* metricsql: add support of using keep_metric_names for binary operations
This should help to avoid confusion with queries like one in the issue #3710.
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
* wip
---------
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
Previously all the newly ingested time series were registered in global `MetricName -> TSID` index.
This index was used during data ingestion for locating the TSID (internal series id)
for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names).
The `MetricName -> TSID` index is stored on disk in order to make sure that the data
isn't lost on VictoriaMetrics restart or unclean shutdown.
The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding
data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache,
and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics
uses in-memory cache for speeding up the lookup for active time series.
This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested
active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk.
VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases:
- If `storage/tsid` cache capacity isn't enough for active time series.
Then just increase available memory for VictoriaMetrics or reduce the number of active time series
ingested into VictoriaMetrics.
- If new time series is ingested into VictoriaMetrics. In this case it cannot find
the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index,
since it doesn't know that the index has no the corresponding entry too.
This is a typical event under high churn rate, when old time series are constantly substituted
with new time series.
Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index,
are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics.
Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName`
for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod.
This index can become very large under high churn rate and long retention. VictoriaMetrics
caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups.
The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing
recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series.
This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly
reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics
consults only the index for the current day when new time series is ingested into it.
The downside of this change is increased indexdb size on disk for workloads without high churn rate,
e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store
identical `MetricName -> TSID` entries for static time series for every day.
This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation,
since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 .
At the same time the change fixes the issue, which could result in lost access to time series,
which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698
The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed
in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685
This is a follow-up for 1f28b46ae9
This simplifies routing at auth proxies such as vmauth to vlselect component,
which serves VMUI - just route all the requests, which start with /select/, to vlselect.