diff --git a/deployment/docker/vmanomaly/vmanomaly-integration/docker-compose.yml b/deployment/docker/vmanomaly/vmanomaly-integration/docker-compose.yml index 3993ebf4a..3a9671819 100644 --- a/deployment/docker/vmanomaly/vmanomaly-integration/docker-compose.yml +++ b/deployment/docker/vmanomaly/vmanomaly-integration/docker-compose.yml @@ -73,7 +73,7 @@ services: restart: always vmanomaly: container_name: vmanomaly - image: victoriametrics/vmanomaly:v1.11.0 + image: victoriametrics/vmanomaly:latest depends_on: - "victoriametrics" ports: @@ -87,7 +87,7 @@ services: platform: "linux/amd64" command: - "/config.yaml" - - "--license-file=/license" + - "--licenseFile=/license" alertmanager: container_name: alertmanager image: prom/alertmanager:v0.27.0 diff --git a/docs/anomaly-detection/CHANGELOG.md b/docs/anomaly-detection/CHANGELOG.md index e58bef0dc..13ab40cc1 100644 --- a/docs/anomaly-detection/CHANGELOG.md +++ b/docs/anomaly-detection/CHANGELOG.md @@ -11,7 +11,18 @@ aliases: --- Please find the changelog for VictoriaMetrics Anomaly Detection below. -> **Important note: Users are strongly encouraged to upgrade to `vmanomaly` [v1.9.2](https://hub.docker.com/repository/docker/victoriametrics/vmanomaly/tags?page=1&ordering=name) or newer for optimal performance and accuracy.

This recommendation is crucial for configurations with a low `infer_every` parameter [in your scheduler](https://docs.victoriametrics.com/anomaly-detection/components/scheduler/#parameters-1), and in scenarios where data exhibits significant high-order seasonality patterns (such as hourly or daily cycles). Previous versions from v1.5.1 to v1.8.0 were identified to contain a critical issue impacting model training, where models were inadvertently trained on limited data subsets, leading to suboptimal fits, affecting the accuracy of anomaly detection.

Upgrading to v1.9.2 addresses this issue, ensuring proper model training and enhanced reliability. For users utilizing Helm charts, it is recommended to upgrade to version [1.0.0](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-anomaly/CHANGELOG.md#100) or newer.** +## v1.16.0 +Released: 2024-10-01 +- FEATURE: Introduced data dumps to a host filesystem for [VmReader](https://docs.victoriametrics.com/anomaly-detection/components/reader#vm-reader). Resource-intensive setups (multiple queries returning many metrics, bigger `fit_window` arg) will have RAM consumption reduced during fit calls. +- IMPROVEMENT: Added a `groupby` argument for logical grouping in [multivariate models](https://docs.victoriametrics.com/anomaly-detection/components/models#multivariate-models). When specified, a separate multivariate model is trained for each unique combination of label values in the `groupby` columns. For example, to perform multivariate anomaly detection on metrics at the machine level without cross-entity interference, you can use `groupby: [host]` or `groupby: [instance]`, ensuring one model per entity being trained (e.g., per host). Please find more details [here](https://docs.victoriametrics.com/anomaly-detection/components/models/#group-by). +- IMPROVEMENT: Improved performance of [VmReader](https://docs.victoriametrics.com/anomaly-detection/components/reader#vm-reader) on multicore instances for reading and data processing. +- IMPROVEMENT: Introduced new CLI argument aliases to enhance compatibility with [Helm charts](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-anomaly/README.md) (i.e. 
using secrets) and better align with [VictoriaMetrics flags](https://docs.victoriametrics.com/#list-of-command-line-flags): + - `--licenseFile` as an alias for `--license-file` + - `--license.forceOffline` as an alias for `--license-verify-offline` + - `--loggerLevel` as an alias for `--log-level` + - The previous argument format is retained for backward compatibility. + +- FIX: The `provide_series` [common argument](https://docs.victoriametrics.com/anomaly-detection/components/models/#provide-series) now correctly filters the written time series in the [IsolationForestMultivariate](https://docs.victoriametrics.com/anomaly-detection/components/models/#isolation-forest-multivariate) model. ## v1.15.9 Released: 2024-08-27 diff --git a/docs/anomaly-detection/FAQ.md b/docs/anomaly-detection/FAQ.md index b6aa386fd..87753ce54 100644 --- a/docs/anomaly-detection/FAQ.md +++ b/docs/anomaly-detection/FAQ.md @@ -120,7 +120,9 @@ Configuration above will produce N intervals of full length (`fit_window`=14d + ## Resource consumption of vmanomaly `vmanomaly` itself is a lightweight service, resource usage is primarily dependent on [scheduling](https://docs.victoriametrics.com/anomaly-detection/components/scheduler) (how often and on what data to fit/infer your models), [# and size of timeseries returned by your queries](https://docs.victoriametrics.com/anomaly-detection/components/reader/#vm-reader), and the complexity of the employed [models](https://docs.victoriametrics.com/anomaly-detection/components/models). Its resource usage is directly related to these factors, making it adaptable to various operational scales. -> **Note**: Starting from [v1.13.0](https://docs.victoriametrics.com/anomaly-detection/changelog/#v1130), there is a mode to save anomaly detection models on host filesystem after `fit` stage (instead of keeping them in-memory by default). 
**Resource-intensive setups** (many models, many metrics, bigger [`fit_window` arg](https://docs.victoriametrics.com/anomaly-detection/components/scheduler#periodic-scheduler-config-example)) and/or 3rd-party models that store fit data (like [ProphetModel](https://docs.victoriametrics.com/anomaly-detection/components/models#prophet) or [HoltWinters](https://docs.victoriametrics.com/anomaly-detection/components/models#holt-winters)) will have RAM consumption greatly reduced at a cost of slightly slower `infer` stage. To enable it, you need to set environment variable `VMANOMALY_MODEL_DUMPS_DIR` to desired location. [Helm charts](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-anomaly/README.md) are being updated accordingly ([`StatefulSet`](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/) for persistent storage starting from chart version `1.3.0`). +> **Note**: Starting from [v1.13.0](https://docs.victoriametrics.com/anomaly-detection/changelog/#v1130), there is an option to save anomaly detection models to the host filesystem after the `fit` stage (instead of keeping them in memory by default). This is particularly useful for **resource-intensive setups** (e.g., many models, many metrics, or larger [`fit_window` argument](https://docs.victoriametrics.com/anomaly-detection/components/scheduler#periodic-scheduler-config-example)) and for 3rd-party models that store fit data (such as [ProphetModel](https://docs.victoriametrics.com/anomaly-detection/components/models#prophet) or [HoltWinters](https://docs.victoriametrics.com/anomaly-detection/components/models#holt-winters)). This reduces RAM consumption significantly, though at the cost of slightly slower `infer` stages. To enable this, set the environment variable `VMANOMALY_MODEL_DUMPS_DIR` to the desired location. 
If using [Helm charts](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-anomaly/README.md), set `.persistentVolume.enabled` to `true` in [values.yaml](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-anomaly/values.yaml) (supported starting from chart version `1.3.0`). + +> **Note**: Starting from [v1.16.0](https://docs.victoriametrics.com/anomaly-detection/changelog/#v1160), a similar optimization is available for data read from VictoriaMetrics TSDB. To use this, set the environment variable `VMANOMALY_DATA_DUMPS_DIR` to the desired location. Here's an example of how to set it up in docker-compose using volumes: ```yaml @@ -138,17 +140,20 @@ services: - ./vmanomaly_license:/license # map the host directory to the container directory - vmanomaly_model_dump_dir:/vmanomaly/tmp/models + - vmanomaly_data_dump_dir:/vmanomaly/tmp/data environment: # set the environment variable for the model dump directory - VMANOMALY_MODEL_DUMPS_DIR=/vmanomaly/tmp/models/ + - VMANOMALY_DATA_DUMPS_DIR=/vmanomaly/tmp/data/ platform: "linux/amd64" command: - "/config.yaml" - - "--license-file=/license" + - "--licenseFile=/license" volumes: # ... vmanomaly_model_dump_dir: {} + vmanomaly_data_dump_dir: {} ``` > **Note**: Starting from [v1.15.0](https://docs.victoriametrics.com/anomaly-detection/changelog#v1150) with the introduction of [online models](https://docs.victoriametrics.com/anomaly-detection/components/models/#online-models), you can additionally reduce resource consumption (e.g., flatten `fit` stage peaks by querying less data from VictoriaMetrics at once). 
diff --git a/docs/anomaly-detection/Overview.md b/docs/anomaly-detection/Overview.md index 916ce4565..193a9bbc6 100644 --- a/docs/anomaly-detection/Overview.md +++ b/docs/anomaly-detection/Overview.md @@ -250,26 +250,25 @@ docker run -it --net [YOUR_NETWORK] \ -v YOUR_LICENSE_FILE_PATH:/license \ -v YOUR_CONFIG_FILE_PATH:/config.yml \ vmanomaly /config.yml \ - --license-file=/license + --licenseFile=/license ``` ### Licensing The license key can be passed via the following command-line flags: ``` - --license LICENSE See https://victoriametrics.com/products/enterprise/ - for trial license - --license-file LICENSE_FILE - See https://victoriametrics.com/products/enterprise/ - for trial license - --license-verify-offline {true,false} - Force offline verification of license code. License is - verified online by default. This flag runs license - verification offline. + --license STRING License key for VictoriaMetrics Enterprise. + See https://victoriametrics.com/products/enterprise/trial/ to obtain a trial license. + --licenseFile STRING Path to file with license key for VictoriaMetrics Enterprise. + See https://victoriametrics.com/products/enterprise/trial/ to obtain a trial license. + --license.forceOffline + Whether to force offline verification for VictoriaMetrics Enterprise license key, + which has been passed either via --license or via --licenseFile command-line flag. + The issued license key must support offline verification feature. + Contact info@victoriametrics.com if you need offline license verification. 
``` - In order to make it easier to monitor the license expiration date, the following metrics are exposed(see [Monitoring](#monitoring) section for details on how to scrape them): diff --git a/docs/anomaly-detection/QuickStart.md b/docs/anomaly-detection/QuickStart.md index b0ed85e95..05d6b88b4 100644 --- a/docs/anomaly-detection/QuickStart.md +++ b/docs/anomaly-detection/QuickStart.md @@ -50,7 +50,7 @@ export YOUR_CONFIG_FILE_PATH=path/to/config/file docker run -it -v $YOUR_LICENSE_FILE_PATH:/license \ -v $YOUR_CONFIG_FILE_PATH:/config.yml \ vmanomaly /config.yml \ - --license-file=/license + --licenseFile=/license ``` In case you found `PermissionError: [Errno 13] Permission denied:` in `vmanomaly` logs, set user/user group to 1000 in the run command above / in a docker-compose file: @@ -62,7 +62,7 @@ docker run -it --user 1000:1000 \ -v $YOUR_LICENSE_FILE_PATH:/license \ -v $YOUR_CONFIG_FILE_PATH:/config.yml \ vmanomaly /config.yml \ - --license-file=/license + --licenseFile=/license ``` ```yaml @@ -76,7 +76,7 @@ services: $YOUR_CONFIG_FILE_PATH:/config.yml command: - "/config.yml" - - "--license-file=/license" + - "--licenseFile=/license" # ... ``` diff --git a/docs/anomaly-detection/components/models.md b/docs/anomaly-detection/components/models.md index ffc2f0a22..af16f18af 100644 --- a/docs/anomaly-detection/components/models.md +++ b/docs/anomaly-detection/components/models.md @@ -126,7 +126,7 @@ models: provide_series: ['anomaly_score'] # only `anomaly_score` metric will be available for writing back to the database ``` -**Note** If `provide_series` is not specified in model config, the model will produce its default [model-dependent output](#vmanomaly-output). The output can't be less than `['anomaly_score']`. Even if `timestamp` column is omitted, it will be implicitly added to `provide_series` list, as it's required for metrics to be properly written. 
+> **Note**: If `provide_series` is not specified in model config, the model will produce its default [model-dependent output](#vmanomaly-output). The output can't be less than `['anomaly_score']`. Even if `timestamp` column is omitted, it will be implicitly added to `provide_series` list, as it's required for metrics to be properly written. ### Detection direction Introduced in [1.13.0](https://docs.victoriametrics.com/anomaly-detection/changelog/#1130), `detection_direction` arg can help in reducing the number of [false positives](https://victoriametrics.com/blog/victoriametrics-anomaly-detection-handbook-chapter-1/#false-positive) and increasing the accuracy, when domain knowledge suggest to identify anomalies occurring when actual values (`y`) are *above, below, or in both directions* relative to the expected values (`yhat`). Available choices are: `both`, `above_expected`, `below_expected`. @@ -224,7 +224,46 @@ models: z_threshold: 3 # if not set, equals to setting min_dev_from_expected == 0 queries: ['normal_behavior'] # use the default where it's not needed -``` +``` + +### Group By + +> **Note**: The `groupby` argument works only in combination with [multivariate models](#multivariate-models). + +Introduced in [v1.16.0](https://docs.victoriametrics.com/anomaly-detection/changelog#v1160), the `groupby` argument (`list[string]`) enables logical grouping within [multivariate models](#multivariate-models). When specified, **a separate multivariate model is trained for each unique combination of label values present in the `groupby` columns**. + +For example, to perform multivariate anomaly detection at the machine level while avoiding interference between different entities, you can set `groupby: [host]` or `groupby: [instance]`. This ensures that a **separate multivariate** model is trained for each individual entity (e.g., per host). Below is a simplified example illustrating how to track multivariate anomalies using CPU, RAM, and network data for each host. 
+ +```yaml +# other config sections ... +reader: + # other reader params ... + # assume there are M unique hosts identified by the `host` label + queries: + # return one timeseries for each of N CPU modes per host, total = N*M timeseries + cpu: sum(rate(node_cpu_seconds_total[5m])) by (host, mode) + # return one timeseries per host, total = 1*M timeseries + ram: | + avg( + (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) + / node_memory_MemTotal_bytes + * 100) by (host) + # return one timeseries per host for both network receive and transmit data, total = 1*M timeseries + network: | + sum(rate(node_network_receive_bytes_total[5m])) by (host) + + sum(rate(node_network_transmit_bytes_total[5m])) by (host) + +models: + iforest: # alias for the model + class: isolation_forest_multivariate + contamination: 0.01 + # the multivariate model can be trained on 2+ timeseries returned by 1+ queries + queries: [cpu, ram, network] + # train a distinct multivariate model for each unique value found in the `host` label + # each multivariate model is trained on (N + 1 + 1) timeseries, total = M models + groupby: [host] +``` + ## Model types @@ -260,6 +299,8 @@ For a multivariate type, **one shared model** is fit/used for inference on **all For example, if you have some **multivariate** model to use 3 [MetricQL queries](https://docs.victoriametrics.com/metricsql/), each returning 5 time series, there will be one shared model created in total. Once fit, this model will expect **exactly 15 time series with exact same labelsets as an input**. This model will produce **one shared [output](#vmanomaly-output)**. +> **Note**: Starting from [v1.16.0](https://docs.victoriametrics.com/anomaly-detection/changelog#v1160), N models (one for each unique combination of label values specified in the `groupby` [common argument](#group-by)) can be trained. 
This allows for context separation (e.g., one model per host, region, or other relevant grouping label), leading to improved accuracy and faster training. See an example [here](#group-by). + If during an inference, you got a **different amount of series** or some series having a **new labelset** (not present in any of fitted models), the inference will be skipped until you get a model, trained particularly for such labelset during forthcoming re-fit step. **Implications:** Multivariate models are a go-to default, when your queries returns **fixed** amount of **individual** time series (say, some aggregations), to be used for adding cross-series (and cross-query) context, useful for catching [collective anomalies](https://victoriametrics.com/blog/victoriametrics-anomaly-detection-handbook-chapter-2/#collective-anomalies) or [novelties](https://victoriametrics.com/blog/victoriametrics-anomaly-detection-handbook-chapter-2/#novelties) (expanded to multi-input scenario). For example, you may set it up for anomaly detection of CPU usage in different modes (`idle`, `user`, `system`, etc.) and use its cross-dependencies to detect **unseen (in fit data)** behavior. @@ -935,7 +976,7 @@ docker run -it \ -v $(PWD)/custom_model.py:/vmanomaly/model/custom.py \ -v $(PWD)/custom.yaml:/config.yaml \ victoriametrics/vmanomaly:latest /config.yaml \ ---license-file=/license +--licenseFile=/license ``` Please find more detailed instructions (license, etc.) 
[here](https://docs.victoriametrics.com/anomaly-detection/overview/#run-vmanomaly-docker-container) diff --git a/docs/anomaly-detection/guides/guide-vmanomaly-vmalert/README.md b/docs/anomaly-detection/guides/guide-vmanomaly-vmalert/README.md index 375eb4c23..85e9b4caa 100644 --- a/docs/anomaly-detection/guides/guide-vmanomaly-vmalert/README.md +++ b/docs/anomaly-detection/guides/guide-vmanomaly-vmalert/README.md @@ -385,7 +385,7 @@ services: restart: always vmanomaly: container_name: vmanomaly - image: victoriametrics/vmanomaly:v1.11.0 + image: victoriametrics/vmanomaly:latest depends_on: - "victoriametrics" ports: @@ -399,7 +399,7 @@ services: platform: "linux/amd64" command: - "/config.yaml" - - "--license-file=/license" + - "--licenseFile=/license" alertmanager: container_name: alertmanager image: prom/alertmanager:v0.25.0
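
Reviewer note on the `groupby` hunk in `models.md`: the partitioning rule ("a separate multivariate model is trained for each unique combination of label values") can be illustrated with a minimal Python sketch. This is not vmanomaly's implementation. The helper `partition_by_labels`, the `query` label, and the sample label sets are hypothetical, and the sketch assumes every series carries all `groupby` labels.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# A series is represented only by its label set; sample values are omitted.
Labels = Dict[str, str]


def partition_by_labels(
    series: List[Labels], groupby: List[str]
) -> Dict[Tuple[str, ...], List[Labels]]:
    """Group series by the unique combination of their `groupby` label values.

    Hypothetical helper: each resulting group would feed one separate
    multivariate model (for example, one per host).
    """
    groups: Dict[Tuple[str, ...], List[Labels]] = defaultdict(list)
    for labels in series:
        # Assumption: every series carries all `groupby` labels.
        key = tuple(labels[name] for name in groupby)
        groups[key].append(labels)
    return dict(groups)


# Two hosts (M = 2) and two CPU modes (N = 2), plus one ram and one
# network series per host. The `query` label is illustrative only.
series = [
    {"host": h, "query": "cpu", "mode": m}
    for h in ("a", "b")
    for m in ("idle", "user")
] + [{"host": h, "query": q} for h in ("a", "b") for q in ("ram", "network")]

groups = partition_by_labels(series, groupby=["host"])
for key, members in sorted(groups.items()):
    print(key, len(members))
```

With two hosts and two CPU modes, the sketch yields two groups of N + 1 + 1 = 4 series each, which matches the per-model series count stated in the YAML comments of the `groupby` example.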