From b673fe28e996de6d44e573e4dcf15bca82b4687d Mon Sep 17 00:00:00 2001
From: Fred Navruzov
Date: Sat, 10 Aug 2024 15:54:27 +0200
Subject: [PATCH] docs: vmanomaly - release v1.15.1 (#6782)

### Describe Your Changes

vmanomaly - release v1.15.1 updates to docs:
- changelog page
- reader page (new arguments docs)
- typos & fixes

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).

(cherry picked from commit 985e4f0b99bbacf36ad7175024db8afd2d9362f3)
---
 docs/anomaly-detection/CHANGELOG.md | 14 +++++
 docs/anomaly-detection/components/models.md | 4 +-
 docs/anomaly-detection/components/reader.md | 70 +++++++++------------
 3 files changed, 46 insertions(+), 42 deletions(-)

diff --git a/docs/anomaly-detection/CHANGELOG.md b/docs/anomaly-detection/CHANGELOG.md
index 726c4ff79..1c0b96c17 100644
--- a/docs/anomaly-detection/CHANGELOG.md
+++ b/docs/anomaly-detection/CHANGELOG.md
@@ -13,6 +13,20 @@ Please find the changelog for VictoriaMetrics Anomaly Detection below.
 > **Important note: Users are strongly encouraged to upgrade to `vmanomaly` [v1.9.2](https://hub.docker.com/repository/docker/victoriametrics/vmanomaly/tags?page=1&ordering=name) or newer for optimal performance and accuracy.

This recommendation is crucial for configurations with a low `infer_every` parameter [in your scheduler](./components/scheduler.md#parameters-1), and in scenarios where data exhibits significant high-order seasonality patterns (such as hourly or daily cycles). Previous versions from v1.5.1 to v1.8.0 were identified to contain a critical issue impacting model training, where models were inadvertently trained on limited data subsets, leading to suboptimal fits, affecting the accuracy of anomaly detection.

 Upgrading to v1.9.2 addresses this issue, ensuring proper model training and enhanced reliability. For users utilizing Helm charts, it is recommended to upgrade to version [1.0.0](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-anomaly/CHANGELOG.md#100) or newer.**
+## v1.15.1
+Released: 2024-08-08
+- FEATURE: Introduced backward-compatible `data_range` [query-specific parameter](https://docs.victoriametrics.com/anomaly-detection/components/reader/#per-query-parameters) to the [VmReader](https://docs.victoriametrics.com/anomaly-detection/components/reader/#vm-reader). It enables the definition of **valid** data ranges for input per individual query in `queries`, resulting in:
+  - **High anomaly scores** (>1) when the *data falls outside the expected range*, indicating a data constraint violation.
+  - **Lowest anomaly scores** (=0) when the *model's predictions (`yhat`) fall outside the expected range*, signaling uncertain predictions.
+  - For more details, please refer to the [documentation](https://docs.victoriametrics.com/anomaly-detection/components/reader/?highlight=data_range#per-query-parameters).
+
+- IMPROVEMENT: Added `latency_offset` argument to the [VmReader](https://docs.victoriametrics.com/anomaly-detection/components/reader/#vm-reader) to override the default `-search.latencyOffset` [flag of VictoriaMetrics](https://docs.victoriametrics.com/?highlight=search.latencyOffset#list-of-command-line-flags) (30s). The default value is set to 1ms, which should help in cases where `sampling_period` is low (10-60s) and `sampling_period` equals `infer_every` in the [PeriodicScheduler](https://docs.victoriametrics.com/anomaly-detection/components/scheduler/?highlight=infer_every#periodic-scheduler). This prevents users from receiving `service - WARNING - [Scheduler [scheduler_alias]] No data available for inference.` warnings in logs and allows for consecutive `infer` calls without gaps. To restore the backward-compatible behavior, set it equal to your `-search.latencyOffset` value in the [VmReader](https://docs.victoriametrics.com/anomaly-detection/components/reader/#vm-reader) config section.
+
+- FIX: Ensure the `use_transform` argument of the [`OnlineQuantileModel`](https://docs.victoriametrics.com/anomaly-detection/components/models/#online-seasonal-quantile) functions as intended.
+- FIX: Add a docstring for `query_from_last_seen_timestamp` arg of [VmReader](https://docs.victoriametrics.com/anomaly-detection/components/reader/#vm-reader).
+
+
 ## v1.15.0
 Released: 2024-08-06
 - FEATURE: Introduced models that support [online learning](https://en.wikipedia.org/wiki/Online_machine_learning) for stream-like input. These models significantly reduce the amount of data required for the initial fit stage. For example, they enable reducing `fit_window` from **weeks to hours** and increasing `fit_every` from **hours to weeks** in the [PeriodicScheduler](https://docs.victoriametrics.com/anomaly-detection/components/scheduler/#periodic-scheduler), significantly reducing the **peak amount** of data queried from VictoriaMetrics during `fit` stages. The next models were added:
diff --git a/docs/anomaly-detection/components/models.md b/docs/anomaly-detection/components/models.md
index 1c25872de..9bb53a03e 100644
--- a/docs/anomaly-detection/components/models.md
+++ b/docs/anomaly-detection/components/models.md
@@ -633,7 +633,7 @@ It uses the `quantiles` triplet to calculate `yhat_lower`, `yhat`, and `yhat_upp
 *Parameters specific for vmanomaly*:

-* `class` (string) - model class name `"model.online.OnlineSeasonalQuantile"` (or `quantile_online` starting from [v1.13.0](../CHANGELOG.md#1130) with class alias support)
+* `class` (string) - model class name `"model.online.OnlineQuantileModel"` (or `quantile_online` starting from [v1.13.0](../CHANGELOG.md#1130) with class alias support)
 * `quantiles` (list[float], optional) - The quantiles to estimate. `yhat_lower`, `yhat`, `yhat_upper` follow the quantile order. By default (0.01, 0.5, 0.99).
 * `seasonal_interval` (string, optional) - the interval for the seasonal adjustment. If not set, the model will be equivalent to a simple online quantile model. By default not set.
 * `min_subseason` (str, optional) - the minimum interval to estimate quantiles for. By default not set. Note that the seasonal interval should be a multiple of the minimum interval, i.e. if seasonal_interval='2h', then min_subseason='15m' is valid, but '37m' is not.
@@ -651,7 +651,7 @@ Suppose we have a data with strong intraday (hourly) and intraweek (daily) seaso
 ```yaml
 models:
   your_desired_alias_for_a_model:
-    class: "quantile_online" # or 'model.online.OnlineSeasonalQuantile'
+    class: "quantile_online" # or 'model.online.OnlineQuantileModel'
     quantiles: [0.025, 0.5, 0.975] # lowered to exclude anomalous edges, can be compensated by `scale` param > 1
     seasonal_interval: '7d' # longest seasonality (week, day) = week, starting from `season_starts_from`
     min_subseason: '1h' # smallest seasonality (week, day, hour) = hour, will have its own quantile estimates
diff --git a/docs/anomaly-detection/components/reader.md b/docs/anomaly-detection/components/reader.md
index 883d08778..1943d7427 100644
--- a/docs/anomaly-detection/components/reader.md
+++ b/docs/anomaly-detection/components/reader.md
@@ -62,6 +62,11 @@ Starting from [v1.13.0](/anomaly-detection/changelog#v1130) there is change of [
 > **Note**: having **different** individual `step` args for queries (i.e. `30s` for `q1` and `2m` for `q2`) is not yet supported for [multivariate model](/anomaly-detection/components/models/index.html#multivariate-models) if you want to run it on several queries simultaneously (i.e. setting [`queries`](/anomaly-detection/components/models/#queries) arg of a model to [`q1`, `q2`]).
+- `data_range` (list[float | string]): Introduced in [v1.15.1](https://docs.victoriametrics.com/anomaly-detection/changelog/#v1151), it allows defining **valid** data ranges for input per individual query in `queries`, resulting in:
+  - **High anomaly scores** (>1) when the *data falls outside the expected range*, indicating a data constraint violation.
+  - **Lowest anomaly scores** (=0) when the *model's predictions (`yhat`) fall outside the expected range*, meaning uncertain predictions.
+
+
 ### Per-query config example
 ```yaml
 reader:
@@ -72,6 +77,7 @@ reader:
     ingestion_rate:
       expr: 'sum(rate(vm_rows_inserted_total[5m])) by (type) > 0'
       step: '2m' # overrides global `sampling_period` of 1m
+      data_range: [10, 'inf'] # meaning only positive values > 10 are expected, i.e. a value `y` < 10 will trigger anomaly score > 1
 ```

 ### Config parameters
@@ -91,182 +97,166 @@ reader:
 `class` - `reader.vm.VmReader` (or `vm` starting from [v1.13.0](../CHANGELOG.md#v1130)) - Name of the class needed to enable reading from VictoriaMetrics or Prometheus. VmReader is the default option, if not specified.
- `queries` - See [per-query config example](#per-query-config-example) above
+See [per-query config example](#per-query-config-example) above
-
-
-
+ See [per-query config section](#per-query-parameters) above
- `datasource_url` - `http://localhost:8481/` - Datasource URL address
- `tenant_id` - `0:0` - For VictoriaMetrics Cluster version only, tenants are identified by accountID or accountID:projectID. See VictoriaMetrics Cluster [multitenancy docs](../../Cluster-VictoriaMetrics.md#multitenancy)
- `sampling_period` - `1h` - Frequency of the points returned. Will be converted to `/query_range?step=%s` param (in seconds). **Required** since [v1.9.0](../CHANGELOG.md#v190).
- `query_range_path` - `/api/v1/query_range` - Performs PromQL/MetricsQL range query
- `health_path` - `health` - Absolute or relative URL address where to check availability of the datasource.
- `user` - `USERNAME` - BasicAuth username
- `password` - `PASSWORD` - BasicAuth password
- `timeout` - `30s` - Timeout for the requests, passed as a string
- `verify_tls` - `false` - Allows disabling TLS verification of the remote certificate.
- `bearer_token` - `token` - Token is passed in the standard format with header: `Authorization: bearer {token}`
- `extra_filters` - `[]` - List of strings with series selector. See: [Prometheus querying API enhancements](../../README.md#prometheus-querying-api-enhancements)
+
+
+`query_from_last_seen_timestamp`
+
+
+`True`
+
+
+If `True`, the query is performed from the last seen timestamp for a given series. If `False`, the query is performed from the start timestamp, based on the schedule period. Defaults to `True` (`False` prior to [v1.15.1](https://docs.victoriametrics.com/anomaly-detection/changelog/#v1151)). Useful for `infer` stages when earlier `infer` calls were skipped.
+
+
+
+
+`latency_offset`
+
+
+`1ms`
+
+
+Introduced in [v1.15.1](https://docs.victoriametrics.com/anomaly-detection/changelog/#v1151), it allows overriding the default `-search.latencyOffset` [flag of VictoriaMetrics](https://docs.victoriametrics.com/?highlight=search.latencyOffset#list-of-command-line-flags) (30s). The default value is set to 1ms, which should help in cases where `sampling_period` is low (10-60s) and `sampling_period` equals `infer_every` in the [PeriodicScheduler](https://docs.victoriametrics.com/anomaly-detection/components/scheduler/?highlight=infer_every#periodic-scheduler). This prevents users from receiving `service - WARNING - [Scheduler [scheduler_alias]] No data available for inference.` warnings in logs and allows for consecutive `infer` calls without gaps. To restore the old behavior, set it equal to your `-search.latencyOffset` [flag value](https://docs.victoriametrics.com/?highlight=search.latencyOffset#list-of-command-line-flags).
+
+
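
Reviewer note: the `data_range` semantics documented in this patch can be sketched roughly as below. This is an illustration only and not vmanomaly's implementation: the function name `apply_data_range`, the constant `OUT_OF_RANGE_SCORE`, and the order of the two checks are all assumptions made for the example. Only the two documented outcomes (score above 1 when the data is out of range, score 0 when `yhat` is out of range) come from the docs.

```python
# Hypothetical constant: any value above 1 signals a data constraint violation.
OUT_OF_RANGE_SCORE = 1.01


def apply_data_range(y, yhat, raw_score, data_range):
    """Adjust a raw anomaly score using a [low, high] valid data range.

    Illustrative only: names and the order of checks are assumptions.
    """
    # Bounds may be numbers or strings such as 'inf'; float() handles both.
    low, high = (float(bound) for bound in data_range)
    if not (low <= y <= high):
        # Observed value violates the range: force a score above 1.
        return max(raw_score, OUT_OF_RANGE_SCORE)
    if not (low <= yhat <= high):
        # Prediction is outside the valid range: treat it as uncertain.
        return 0.0
    return raw_score


print(apply_data_range(5.0, 12.0, 0.2, [10, "inf"]))   # 1.01
print(apply_data_range(15.0, 5.0, 0.8, [10, "inf"]))   # 0.0
print(apply_data_range(15.0, 14.0, 0.3, [10, "inf"]))  # 0.3
```

With `data_range: [10, 'inf']` from the per-query example, an observed value of 5 yields a score above 1, while an in-range observation with an out-of-range prediction yields 0.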