docs/vmanomaly - typos fix & clarity (#6793)

### Describe Your Changes

typos fix & clarity improvement of vmanomaly docs after v1.15.1 release

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

(cherry picked from commit a99fcfbf1a)
This commit is contained in:
Fred Navruzov 2024-08-10 16:43:05 +02:00 committed by hagen1778
parent b673fe28e9
commit b084b4fb0f
No known key found for this signature in database
GPG key ID: 3BF75F3741CA9640
4 changed files with 31 additions and 15 deletions

View file

@ -14,7 +14,7 @@ Please find the changelog for VictoriaMetrics Anomaly Detection below.
> **Important note: Users are strongly encouraged to upgrade to `vmanomaly` [v1.9.2](https://hub.docker.com/repository/docker/victoriametrics/vmanomaly/tags?page=1&ordering=name) or newer for optimal performance and accuracy. <br><br> This recommendation is crucial for configurations with a low `infer_every` parameter [in your scheduler](./components/scheduler.md#parameters-1), and in scenarios where data exhibits significant high-order seasonality patterns (such as hourly or daily cycles). Previous versions from v1.5.1 to v1.8.0 were identified to contain a critical issue impacting model training, where models were inadvertently trained on limited data subsets, leading to suboptimal fits, affecting the accuracy of anomaly detection. <br><br> Upgrading to v1.9.2 addresses this issue, ensuring proper model training and enhanced reliability. For users utilizing Helm charts, it is recommended to upgrade to version [1.0.0](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-anomaly/CHANGELOG.md#100) or newer.**
## v1.15.1
Released: 2024-08-08
Released: 2024-08-10
- FEATURE: Introduced backward-compatible `data_range` [query-specific parameter](https://docs.victoriametrics.com/anomaly-detection/components/reader/#per-query-parameters) to the [VmReader](https://docs.victoriametrics.com/anomaly-detection/components/reader/#vm-reader). It enables the definition of **valid** data ranges for input per individual query in `queries`, resulting in:
- **High anomaly scores** (>1) when the *data falls outside the expected range*, indicating a data constraint violation.
- **Lowest anomaly scores** (=0) when the *model's predictions (`yhat`) fall outside the expected range*, signaling uncertain predictions.
@ -22,8 +22,7 @@ Released: 2024-08-08
- IMPROVEMENT: Added `latency_offset` argument to the [VmReader](https://docs.victoriametrics.com/anomaly-detection/components/reader/#vm-reader) to override the default `-search.latencyOffset` [flag of VictoriaMetrics](https://docs.victoriametrics.com/?highlight=search.latencyOffset#list-of-command-line-flags) (30s). The default value is set to 1ms, which should help in cases where `sampling_frequency` is low (10-60s) and `sampling_frequency` equals `infer_every` in the [PeriodicScheduler](https://docs.victoriametrics.com/anomaly-detection/components/scheduler/?highlight=infer_every#periodic-scheduler). This prevents users from receiving `service - WARNING - [Scheduler [scheduler_alias]] No data available for inference.` warnings in logs and allows for consecutive `infer` calls without gaps. To restore the backward compatible behavior, set it equal to your `-search.latencyOffset` value in [VmReader](https://docs.victoriametrics.com/anomaly-detection/components/reader/#vm-reader) config section.
- FIX: Ensure the `use_transform` argument of the [`OnlineQuantileModel`](https://docs.victoriametrics.com/anomaly-detection/components/models/
#online-seasonal-quantile) functions as intended.
- FIX: Ensure the `use_transform` argument of the [`OnlineQuantileModel`](https://docs.victoriametrics.com/anomaly-detection/components/models/#online-seasonal-quantile) functions as intended.
- FIX: Add a docstring for `query_from_last_seen_timestamp` arg of [VmReader](https://docs.victoriametrics.com/anomaly-detection/components/reader/#vm-reader).

View file

@ -38,30 +38,42 @@ schedulers:
# what model types and with what hyperparams to run on your data
# https://docs.victoriametrics.com/anomaly-detection/components/models/
models:
zscore: # alias
zscore: # we can set up alias for model
class: 'zscore' # model class
z_threshold: 3.5
provide_series: ['anomaly_score'] # what series to produce
queries: ['host_network_receive_errors'] # what queries to run particular model on
schedulers: ['periodic_1d'] # will be attached to 1-day schedule, fit every 10m and infer every 30s
prophet: # alias
min_dev_from_expected: 0.0 # turned off. if |y - yhat| < min_dev_from_expected, anomaly score will be 0
detection_direction: 'above_expected' # detect anomalies only when y > yhat, "peaks"
prophet: # we can set up alias for model
class: 'prophet'
provide_series: ['anomaly_score', 'yhat', 'yhat_lower', 'yhat_upper']
queries: ['cpu_seconds_total']
schedulers: ['periodic_1w'] # will be attached to 1-week schedule, fit every 1h and infer every 15m
min_dev_from_expected: 0.01 # if |y - yhat| < 0.01, anomaly score will be 0
detection_direction: 'above_expected'
args: # model-specific arguments
interval_width: 0.98
# where to read data from
# https://docs.victoriametrics.com/anomaly-detection/components/reader/
# https://docs.victoriametrics.com/anomaly-detection/components/reader/#vm-reader
reader:
datasource_url: "http://victoriametrics:8428/"
datasource_url: "https://play.victoriametrics.com/"
tenant_id: "0:0"
class: 'vm'
sampling_period: "30s" # what data resolution of your data to have
sampling_period: "30s" # what data resolution to fetch from VictoriaMetrics' /query_range endpoint
latency_offset: '1ms'
query_from_last_seen_timestamp: False
queries: # aliases to MetricsQL expressions
cpu_seconds_total: 'avg(rate(node_cpu_seconds_total[5m])) by (mode)'
host_network_receive_errors: 'rate(node_network_receive_errs_total[3m]) / rate(node_network_receive_packets_total[3m])'
cpu_seconds_total:
expr: 'avg(rate(node_cpu_seconds_total[5m])) by (mode)'
# step: '30s' # if not set, will be equal to sampling_period
data_range: [0, 'inf'] # expected value range, anomaly_score > 1 if y (real value) is outside
host_network_receive_errors:
expr: 'rate(node_network_receive_errs_total[3m]) / rate(node_network_receive_packets_total[3m])'
step: '15m' # here we override per-query `sampling_period` to request way less data from VM TSDB
data_range: [0, 'inf']
# where to write data to
# https://docs.victoriametrics.com/anomaly-detection/components/writer/

View file

@ -623,7 +623,7 @@ Resulting metrics of the model are described [here](#vmanomaly-output).
### Online Seasonal Quantile
> **Note**: `OnlineSeasonalQuantile` is [univariate](#univariate-models), [non-rolling](#non-rolling-models), [online](#online-models) model.
> **Note**: `OnlineQuantileModel` is [univariate](#univariate-models), [non-rolling](#non-rolling-models), [online](#online-models) model.
Online (seasonal) quantile utilizes a set of approximate distributions, based on [t-digests](https://www.sciencedirect.com/science/article/pii/S2665963820300403) for online quantile estimation. Introduced in [v1.15.0](https://docs.victoriametrics.com/anomaly-detection/changelog/#v1150).

View file

@ -240,10 +240,10 @@ List of strings with series selector. See: [Prometheus querying API enhancements
`query_from_last_seen_timestamp`
</td>
<td>
`True`
`False`
</td>
<td>
If True, then query will be performed from the last seen timestamp for a given series. If False, then query will be performed from the start timestamp, based on a schedule period. Defaults to `True`. (`False` prior to [v1.15.1](https://docs.victoriametrics.com/anomaly-detection/changelog/#v1151)). Useful for `infer` stages in case there were skipped `infer` calls prior to given.
If True, then query will be performed from the last seen timestamp for a given series. If False, then query will be performed from the start timestamp, based on a schedule period. Defaults to `False`. Useful for `infer` stages in case there were skipped `infer` calls prior to given.
</td>
</tr>
<tr>
@ -265,11 +265,16 @@ Config file example:
```yaml
reader:
class: "vm" # or "reader.vm.VmReader" until v1.13.0
datasource_url: "http://localhost:8428/"
datasource_url: "https://play.victoriametrics.com/"
tenant_id: "0:0"
queries:
ingestion_rate: 'sum(rate(vm_rows_inserted_total[5m])) by (type) > 0'
ingestion_rate:
expr: 'sum(rate(vm_rows_inserted_total[5m])) by (type) > 0'
step: '1m' # can override global `sampling_period` on per-query level
data_range: [0, 'inf']
sampling_period: '1m'
query_from_last_seen_timestamp: True # false by default
latency_offset: '1ms'
```
### Healthcheck metrics