mirror of
https://github.com/VictoriaMetrics/VictoriaMetrics.git
synced 2024-12-01 14:47:38 +00:00
docs: vmanomaly - updates of v1.10.0 and model type section (#5813)
* - apply v1.10 changes
- chapter on model types (uni/multivariate and rolling)
* - update self-monitoring labels description
- fix typos
* fix duplicated text and link rendering
(cherry picked from commit 172e196ac9)
parent
5e4732cc2d
commit
d8de87aeb0
7 changed files with 264 additions and 91 deletions
|
@ -73,7 +73,7 @@ services:
|
||||||
restart: always
|
restart: always
|
||||||
vmanomaly:
|
vmanomaly:
|
||||||
container_name: vmanomaly
|
container_name: vmanomaly
|
||||||
image: victoriametrics/vmanomaly:v1.9.2
|
image: victoriametrics/vmanomaly:v1.10.0
|
||||||
depends_on:
|
depends_on:
|
||||||
- "victoriametrics"
|
- "victoriametrics"
|
||||||
ports:
|
ports:
|
||||||
|
|
|
@ -3,7 +3,8 @@ scheduler:
|
||||||
fit_every: "2m"
|
fit_every: "2m"
|
||||||
fit_window: "3h"
|
fit_window: "3h"
|
||||||
|
|
||||||
model:
|
models:
|
||||||
|
prophet:
|
||||||
class: "model.prophet.ProphetModel"
|
class: "model.prophet.ProphetModel"
|
||||||
args:
|
args:
|
||||||
interval_width: 0.98
|
interval_width: 0.98
|
||||||
|
|
|
@ -17,6 +17,23 @@ Please find the changelog for VictoriaMetrics Anomaly Detection below.
|
||||||
|
|
||||||
> **Important note: Users are strongly encouraged to upgrade to `vmanomaly` [v1.9.2](https://hub.docker.com/repository/docker/victoriametrics/vmanomaly/tags?page=1&ordering=name) or later versions for optimal performance and accuracy. <br><br> This recommendation is crucial for configurations with a low `infer_every` parameter [in your scheduler](https://docs.victoriametrics.com/anomaly-detection/components/scheduler/#parameters-1), and in scenarios where data exhibits significant high-order seasonality patterns (such as hourly or daily cycles). Previous versions from v1.5.1 to v1.8.0 were identified to contain a critical issue impacting model training, where models were inadvertently trained on limited data subsets, leading to suboptimal fits, affecting the accuracy of anomaly detection. <br><br> Upgrading to v1.9.2 addresses this issue, ensuring proper model training and enhanced reliability. For users utilizing Helm charts, it is recommended to upgrade to version [1.0.0](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-anomaly/CHANGELOG.md#100).**
|
> **Important note: Users are strongly encouraged to upgrade to `vmanomaly` [v1.9.2](https://hub.docker.com/repository/docker/victoriametrics/vmanomaly/tags?page=1&ordering=name) or later versions for optimal performance and accuracy. <br><br> This recommendation is crucial for configurations with a low `infer_every` parameter [in your scheduler](https://docs.victoriametrics.com/anomaly-detection/components/scheduler/#parameters-1), and in scenarios where data exhibits significant high-order seasonality patterns (such as hourly or daily cycles). Previous versions from v1.5.1 to v1.8.0 were identified to contain a critical issue impacting model training, where models were inadvertently trained on limited data subsets, leading to suboptimal fits, affecting the accuracy of anomaly detection. <br><br> Upgrading to v1.9.2 addresses this issue, ensuring proper model training and enhanced reliability. For users utilizing Helm charts, it is recommended to upgrade to version [1.0.0](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-anomaly/CHANGELOG.md#100).**
|
||||||
|
|
||||||
|
## v1.10.0
|
||||||
|
Released: 2024-02-15
|
||||||
|
- FEATURE: Multi-model support. Users can now specify multiple [model specs](https://docs.victoriametrics.com/anomaly-detection/components/models/) in a single config (via aliasing) and reference which [queries from VmReader](https://docs.victoriametrics.com/anomaly-detection/components/reader/?highlight=queries#config-parameters) each model should be run on.
|
||||||
|
- Introduction of `queries` arg in model spec:
|
||||||
|
- It allows the model to be executed only on a particular query subset from `reader` section.
|
||||||
|
- Passing an empty list or not specifying this param implies that each model is run on results from **all** queries, which is a backward-compatible behavior.
|
||||||
|
- Please find more details in docs on [Model section](https://docs.victoriametrics.com/anomaly-detection/components/models/#queries)
|
||||||
|
|
||||||
|
- DEPRECATION: slight refactor of a model config section
|
||||||
|
- Models are now passed as a mapping of `model_alias: model_spec` under the [`models`](https://docs.victoriametrics.com/anomaly-detection/components/models/) section. Using the old format (<= [1.9.2](https://docs.victoriametrics.com/anomaly-detection/changelog/#v192)) produces warnings for now; support for it will be removed in future versions.
|
||||||
|
- Please find more details in docs on [Model section](https://docs.victoriametrics.com/anomaly-detection/components/models/)
|
||||||
|
- IMPROVEMENT: now logs from [`monitoring.pull`](https://docs.victoriametrics.com/anomaly-detection/components/monitoring/#monitoring-section-config-example) GET requests to `/metrics` endpoint are shown only in DEBUG mode
|
||||||
|
- IMPROVEMENT: labelset for multivariate models is deduplicated and cleaned, resulting in better UX
|
||||||
|
|
||||||
|
> **Note**: These updates support a more flexible setup and more effective resource management in the service, as it is no longer necessary to spawn several instances of `vmanomaly` to split the queries/models context across them.
|
||||||
|
|
||||||
|
|
||||||
## v1.9.2
|
## v1.9.2
|
||||||
Released: 2024-01-29
|
Released: 2024-01-29
|
||||||
- BUGFIX: now multivariate models (like [`IsolationForestMultivariateModel`](https://docs.victoriametrics.com/anomaly-detection/components/models/#isolation-foresthttpsenwikipediaorgwikiisolation_forest-multivariate)) are properly handled throughout fit/infer phases.
|
- BUGFIX: now multivariate models (like [`IsolationForestMultivariateModel`](https://docs.victoriametrics.com/anomaly-detection/components/models/#isolation-foresthttpsenwikipediaorgwikiisolation_forest-multivariate)) are properly handled throughout fit/infer phases.
|
||||||
|
|
|
@ -162,7 +162,8 @@ scheduler:
|
||||||
fit_every: "2h"
|
fit_every: "2h"
|
||||||
fit_window: "14d"
|
fit_window: "14d"
|
||||||
|
|
||||||
model:
|
models:
|
||||||
|
prophet: # or use a model alias of your choice here
|
||||||
class: "model.prophet.ProphetModel"
|
class: "model.prophet.ProphetModel"
|
||||||
args:
|
args:
|
||||||
interval_width: 0.98
|
interval_width: 0.98
|
||||||
|
@ -217,7 +218,7 @@ This will expose metrics at `http://0.0.0.0:8080/metrics` page.
|
||||||
To use *vmanomaly* you need to pull docker image:
|
To use *vmanomaly* you need to pull docker image:
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
docker pull victoriametrics/vmanomaly:v1.9.2
|
docker pull victoriametrics/vmanomaly:v1.10.0
|
||||||
```
|
```
|
||||||
|
|
||||||
> Note: please check the [CHANGELOG](/anomaly-detection/CHANGELOG.html) for the latest release
|
> Note: please check the [CHANGELOG](/anomaly-detection/CHANGELOG.html) for the latest release
|
||||||
|
@ -227,7 +228,7 @@ docker pull victoriametrics/vmanomaly:v1.9.2
|
||||||
You can put a tag on it for your convenience:
|
You can put a tag on it for your convenience:
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
docker image tag victoriametrics/vmanomaly:v1.9.2 vmanomaly
|
docker image tag victoriametrics/vmanomaly:v1.10.0 vmanomaly
|
||||||
```
|
```
|
||||||
Here is an example of how to run *vmanomaly* docker container with [license file](#licensing):
|
Here is an example of how to run *vmanomaly* docker container with [license file](#licensing):
|
||||||
|
|
||||||
|
|
|
@ -16,10 +16,149 @@ aliases:
|
||||||
|
|
||||||
# Models
|
# Models
|
||||||
|
|
||||||
This section describes `Model` component of VictoriaMetrics Anomaly Detection (or simply [`vmanomaly`](/anomaly-detection/overview.html)) and the guide of how to define a respective section of a config to launch the service.
|
This section describes `Models` component of VictoriaMetrics Anomaly Detection (or simply [`vmanomaly`](/anomaly-detection/overview.html)) and the guide of how to define a respective section of a config to launch the service.
|
||||||
vmanomaly includes various [built-in models](#built-in-models), and you can also integrate your own model with vmanomaly (see [custom model](#custom-model-guide)).
|
vmanomaly includes various [built-in models](#built-in-models), and you can also integrate your own model with vmanomaly (see [custom model](#custom-model-guide)).
|
||||||
|
|
||||||
|
|
||||||
|
> **Note: Starting from [v1.10.0](/anomaly-detection/changelog#v1100) model section in config supports multiple models via aliasing. <br>Also, `vmanomaly` expects model section to be named `models`. Using old (flat) format with `model` key is deprecated and will be removed in future versions. Having `model` and `models` sections simultaneously in a config will result in only `models` being used:**
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
models:
|
||||||
|
model_univariate_1:
|
||||||
|
class: "model.zscore.ZscoreModel"
|
||||||
|
z_threshold: 2.5
|
||||||
|
queries: ["query_alias2"] # referencing queries defined in `reader` section
|
||||||
|
model_multivariate_1:
|
||||||
|
class: "model.isolation_forest.IsolationForestMultivariateModel"
|
||||||
|
contamination: "auto"
|
||||||
|
args:
|
||||||
|
n_estimators: 100
|
||||||
|
# i.e. to assure reproducibility of produced results each time model is fit on the same input
|
||||||
|
random_state: 42
|
||||||
|
# if there is no explicit `queries` arg, then the model will be run on ALL queries found in reader section
|
||||||
|
```
|
||||||
|
|
||||||
|
Old-style configs (< [1.10.0](/anomaly-detection/changelog#v1100) )
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
model:
|
||||||
|
class: "model.zscore.ZscoreModel"
|
||||||
|
z_threshold: 2.5
|
||||||
|
# no explicit `queries` arg is provided
|
||||||
|
```
|
||||||
|
|
||||||
|
will be **implicitly** converted to
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
models:
|
||||||
|
default_model: # default model alias, backward compatibility
|
||||||
|
class: "model.zscore.ZscoreModel"
|
||||||
|
z_threshold: 2.5
|
||||||
|
# queries arg is created and propagated with all query aliases found in `queries` arg of `reader` section
|
||||||
|
queries: ["q1", "q2", "q3"] # i.e., if your `queries` in `reader` section has exactly q1, q2, q3 aliases
|
||||||
|
```
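The implicit conversion described above can be pictured with a short Python sketch. This is only an illustration of the documented behavior, not the actual `vmanomaly` code: the function name `normalize_models` and the use of plain dictionaries for the parsed config are assumptions made for the example.

```python
from copy import deepcopy

# Alias the docs mention for converted legacy configs.
DEFAULT_ALIAS = "default_model"


def normalize_models(config: dict) -> dict:
    """Return the `models` mapping for a parsed config (hypothetical helper)."""
    if "models" in config:
        # If both `model` and `models` are present, only `models` is used.
        return deepcopy(config["models"])
    if "model" in config:
        # Legacy flat spec: wrap it under the default alias.
        return {DEFAULT_ALIAS: deepcopy(config["model"])}
    return {}


legacy = {"model": {"class": "model.zscore.ZscoreModel", "z_threshold": 2.5}}
both = {
    "model": {"class": "model.zscore.ZscoreModel"},
    "models": {"my_mad": {"class": "model.mad.MADModel", "threshold": 2.5}},
}

print(normalize_models(legacy))
print(normalize_models(both))
```

The second call shows the precedence rule from the note: the legacy `model` key is ignored once `models` is defined.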
|
||||||
|
|
||||||
|
|
||||||
|
## Common args
|
||||||
|
|
||||||
|
Starting from [1.10.0](/anomaly-detection/changelog#v1100), **common args** supported by *every model (and model type)* were introduced.
|
||||||
|
|
||||||
|
### Queries
|
||||||
|
|
||||||
|
Introduced in [1.10.0](/anomaly-detection/changelog#v1100) as part of multi-model config support, the `queries` arg defines which [queries from VmReader](https://docs.victoriametrics.com/anomaly-detection/components/reader/?highlight=queries#config-parameters) a particular model should be run on (meaning that all the series returned by each of these queries will be used by that model for fitting and inference).
|
||||||
|
|
||||||
|
`queries` arg is supported for all [the built-in](/anomaly-detection/components/models/#built-in-models) (as well as for [custom](/anomaly-detection/components/models/#custom-model-guide)) models.
|
||||||
|
|
||||||
|
This arg is **backward compatible** - if there is no explicit `queries` arg, then the model, defined in a config, will be run on ALL queries found in reader section:
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
models:
|
||||||
|
model_alias_1:
|
||||||
|
...
|
||||||
|
# no explicit `queries` arg is provided
|
||||||
|
```
|
||||||
|
|
||||||
|
will be implicitly converted to
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
models:
|
||||||
|
model_alias_1:
|
||||||
|
...
|
||||||
|
# queries arg is created and propagated with all query aliases found in `queries` arg of `reader` section
|
||||||
|
queries: ["q1", "q2", "q3"] # i.e., if your `queries` in `reader` section has exactly q1, q2, q3 aliases
|
||||||
|
```
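A minimal Python sketch of this defaulting rule, assuming the config has already been parsed into dictionaries. The helper name `resolve_queries` is hypothetical and the real service may implement the rule differently; the sketch only mirrors the documented outcome that a missing or empty `queries` arg means "all reader queries".

```python
def resolve_queries(models: dict, reader_queries: dict) -> dict:
    """Fill in `queries` for every model spec (hypothetical helper)."""
    all_aliases = list(reader_queries)
    resolved = {}
    for alias, spec in models.items():
        spec = dict(spec)  # shallow copy so the input stays untouched
        # A missing or empty `queries` arg falls back to all reader aliases.
        spec["queries"] = list(spec.get("queries") or all_aliases)
        resolved[alias] = spec
    return resolved


reader_queries = {"q1": "expr_1", "q2": "expr_2", "q3": "expr_3"}  # placeholder expressions
models = {
    "model_alias_1": {"class": "model.zscore.ZscoreModel"},
    "model_alias_2": {"class": "model.mad.MADModel", "queries": ["q2"]},
}

for alias, spec in resolve_queries(models, reader_queries).items():
    print(alias, spec["queries"])
```

Here `model_alias_1` ends up with all three aliases, while `model_alias_2` keeps its explicit subset.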
|
||||||
|
|
||||||
|
## Model types
|
||||||
|
|
||||||
|
There are **2 model types**, supported in `vmanomaly`, resulting in **4 possible combinations**:
|
||||||
|
|
||||||
|
- [Univariate models](#univariate-models)
|
||||||
|
- [Multivariate models](#multivariate-models)
|
||||||
|
|
||||||
|
Each of these models can be
|
||||||
|
- [rolling](#rolling-models)
|
||||||
|
- [non-rolling](#non-rolling-models)
|
||||||
|
|
||||||
|
### Univariate Models
|
||||||
|
|
||||||
|
For a univariate type, **one separate model** is fit and used for inference for **each time series** returned by the queries defined in its [queries](#queries) arg.
|
||||||
|
|
||||||
|
For example, if you have a **univariate** model defined to use 3 [MetricsQL queries](https://docs.victoriametrics.com/metricsql/), each returning 5 time series, there will be 3*5=15 models created in total. Each such model produces an **individual [output](#vmanomaly-output)** for its time series.
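The 3*5=15 count can be reproduced with a small Python sketch. It assumes, for illustration only, that fitted univariate models are keyed by the pair (query alias, labelset); the query aliases and the `instance` label are made up for the example.

```python
# Three hypothetical queries, each returning five series that differ by `instance`.
series_by_query = {
    query: [frozenset({("instance", f"host{i}")}) for i in range(5)]
    for query in ("q1", "q2", "q3")
}

# One separate model per (query, labelset) pair; a string stands in for a fitted model.
fitted_models = {
    (query, labelset): f"model fitted on {query}"
    for query, labelsets in series_by_query.items()
    for labelset in labelsets
}

print(len(fitted_models))
```

A series whose (query, labelset) key is missing from `fitted_models` has no model to infer with, which matches the skip-until-re-fit behavior for new labelsets.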
|
||||||
|
|
||||||
|
If, during inference, a series arrives with a **new labelset** (not present in any of the fitted models), inference for it is skipped until a model is trained for that labelset during the next re-fit step.
|
||||||
|
|
||||||
|
**Implications:** Univariate models are the go-to default when your queries return a **changing** number of **individual** time series of **different** magnitude, [trend](https://victoriametrics.com/blog/victoriametrics-anomaly-detection-handbook-chapter-1/#trend) or [seasonality](https://victoriametrics.com/blog/victoriametrics-anomaly-detection-handbook-chapter-1/#seasonality), so you won't be mixing incompatible data with different behavior within a single fit model (context isolation).
|
||||||
|
|
||||||
|
**Examples:** [Prophet](#prophet), [Holt-Winters](#holt-winters)
|
||||||
|
|
||||||
|
<!-- TODO: add schema -->
|
||||||
|
|
||||||
|
### Multivariate Models
|
||||||
|
|
||||||
|
For a multivariate type, **one shared model** is fit/used for inference on **all time series** simultaneously, defined in its [queries](#queries) arg.
|
||||||
|
|
||||||
|
For example, if you have a **multivariate** model defined to use 3 [MetricsQL queries](https://docs.victoriametrics.com/metricsql/), each returning 5 time series, there will be one shared model created in total. Once fit, this model will expect **exactly 15 time series with exactly the same labelsets as input**. This model will produce **one shared [output](#vmanomaly-output)**.
|
||||||
|
|
||||||
|
If, during inference, you get a **different number of series** or some series with a **new labelset** (not seen when the model was fit), inference is skipped until a model is trained for that set of series during the next re-fit step.
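The input requirement of a multivariate model can be sketched as a simple set comparison. This is a hedged illustration rather than the service's actual check: the function `can_infer` and the representation of a labelset as a `frozenset` of label pairs are assumptions.

```python
def can_infer(fitted_labelsets: set, incoming_labelsets: set) -> bool:
    """A shared model accepts input only if it has exactly the labelsets seen at fit time."""
    return fitted_labelsets == incoming_labelsets


def labelset(**labels: str) -> frozenset:
    # Hashable representation of a series labelset.
    return frozenset(labels.items())


fitted = {labelset(mode="idle"), labelset(mode="user"), labelset(mode="system")}

same = {labelset(mode="system"), labelset(mode="idle"), labelset(mode="user")}
fewer = {labelset(mode="idle"), labelset(mode="user")}
changed = {labelset(mode="idle"), labelset(mode="user"), labelset(mode="iowait")}

for name, incoming in (("same", same), ("fewer", fewer), ("changed", changed)):
    print(name, "infer" if can_infer(fitted, incoming) else "skip until re-fit")
```

Both a missing series and a replaced labelset lead to a skip, since either one changes the shape of the shared model's input.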
|
||||||
|
|
||||||
|
**Implications:** Multivariate models are the go-to default when your queries return a **fixed** number of **individual** time series (say, some aggregations), to be used for adding cross-series (and cross-query) context, useful for catching [collective anomalies](https://victoriametrics.com/blog/victoriametrics-anomaly-detection-handbook-chapter-2/index.html#collective-anomalies) or [novelties](https://victoriametrics.com/blog/victoriametrics-anomaly-detection-handbook-chapter-2/index.html#novelties) (expanded to a multi-input scenario). For example, you may set it up for anomaly detection of CPU usage in different modes (`idle`, `user`, `system`, etc.) and use its cross-dependencies to detect **unseen (in fit data)** behavior.
|
||||||
|
|
||||||
|
**Examples:** [IsolationForest](#isolation-forest-multivariate)
|
||||||
|
|
||||||
|
<!-- TODO: add schema -->
|
||||||
|
|
||||||
|
### Rolling Models
|
||||||
|
|
||||||
|
A rolling model is a model that, once trained, **cannot be (naturally) used to make inference on data, not seen during its fit phase**.
|
||||||
|
|
||||||
|
An instance of rolling model is **simultaneously fit and used for inference** during its `infer` method call.
|
||||||
|
|
||||||
|
As a result, such model instances are **not stored** between consecutive re-fit calls (defined by `fit_every` [arg](/anomaly-detection/components/scheduler/?highlight=fit_every#periodic-scheduler) in `PeriodicScheduler`), leading to **lower RAM** consumption.
|
||||||
|
|
||||||
|
Such models put **more pressure** on your reader's source. For example, if your model should be fit on a large amount of data (say, 14 days with 1-minute resolution) and at the same time you have **frequent inference** (say, once per minute) on new chunks of data, every inference call has to read the whole (fit + infer) window of data, because the model must be fit on it before it can be used for that inference.
|
||||||
|
|
||||||
|
> **Note**: Rolling models require `fit_every` to be set equal to `infer_every` in your [PeriodicScheduler](/anomaly-detection/components/scheduler/?highlight=fit_every#periodic-scheduler).
|
||||||
|
|
||||||
|
**Examples:** [RollingQuantile](#rolling-quantile)
|
||||||
|
|
||||||
|
<!-- TODO: add schema -->
|
||||||
|
|
||||||
|
### Non-Rolling Models
|
||||||
|
|
||||||
|
Everything that is not classified as [rolling](#rolling-models).
|
||||||
|
|
||||||
|
Produced models can be explicitly used to **infer on data not seen during the fit phase**, so they **don't require a re-fit** before each inference.
|
||||||
|
|
||||||
|
Such models put **less pressure** on your reader's source. For example, you can fit on a large amount of data (say, 14 days with 1-minute resolution) only occasionally (say, once per day) while still running **frequent inference** (say, once per minute) on new chunks of data.
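To make the difference in reader load concrete, here is a back-of-the-envelope Python estimate using the numbers from the examples in this section. It is a simplified sketch under stated assumptions: one series, one new datapoint read per inference call, and no caching on the reader side. The function name and parameters are made up for the illustration.

```python
MINUTES_PER_DAY = 24 * 60


def points_read_per_day(
    fit_window_points: int,
    infer_calls_per_day: int,
    fits_per_day: int,
    rolling: bool,
    infer_points: int = 1,
) -> int:
    """Datapoints read from the reader's source per day for a single series."""
    if rolling:
        # Every inference call reads the whole (fit + infer) window.
        return infer_calls_per_day * (fit_window_points + infer_points)
    # Non-rolling: the fit window is read only on re-fit, inference reads new points only.
    return fits_per_day * fit_window_points + infer_calls_per_day * infer_points


fit_window_points = 14 * MINUTES_PER_DAY  # 14 days at 1-minute resolution

rolling_load = points_read_per_day(fit_window_points, MINUTES_PER_DAY, fits_per_day=1, rolling=True)
non_rolling_load = points_read_per_day(fit_window_points, MINUTES_PER_DAY, fits_per_day=1, rolling=False)

print(f"rolling: {rolling_load}, non-rolling: {non_rolling_load}")
```

Under these assumptions the rolling setup reads roughly three orders of magnitude more datapoints per day, which is the trade-off against its lower RAM consumption.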
|
||||||
|
|
||||||
|
> **Note**: However, it's still highly recommended to keep your model up-to-date with the tendencies found in your data as it evolves over time.
|
||||||
|
|
||||||
|
Produced model instances are **stored in-memory** between consecutive re-fit calls (defined by `fit_every` [arg](/anomaly-detection/components/scheduler/?highlight=fit_every#periodic-scheduler) in `PeriodicScheduler`), leading to **higher RAM** consumption.
|
||||||
|
|
||||||
|
**Examples:** [Prophet](#prophet)
|
||||||
|
|
||||||
|
<!-- TODO: add schema -->
|
||||||
|
|
||||||
## Built-in Models
|
## Built-in Models
|
||||||
|
|
||||||
### Overview
|
### Overview
|
||||||
|
@ -68,7 +207,8 @@ Depending on chosen `seasonality` parameter FB Prophet can return additional met
|
||||||
*Config Example*
|
*Config Example*
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
model:
|
models:
|
||||||
|
your_desired_alias_for_a_model:
|
||||||
class: "model.prophet.ProphetModel"
|
class: "model.prophet.ProphetModel"
|
||||||
seasonalities:
|
seasonalities:
|
||||||
- name: 'hourly'
|
- name: 'hourly'
|
||||||
|
@ -95,7 +235,8 @@ Resulting metrics of the model are described [here](#vmanomaly-output)
|
||||||
|
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
model:
|
models:
|
||||||
|
your_desired_alias_for_a_model:
|
||||||
class: "model.zscore.ZscoreModel"
|
class: "model.zscore.ZscoreModel"
|
||||||
z_threshold: 2.5
|
z_threshold: 2.5
|
||||||
```
|
```
|
||||||
|
@ -131,7 +272,8 @@ Used to compute "seasonal_periods" param for the model (e.g. '1D' or '1W').
|
||||||
*Config Example*
|
*Config Example*
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
model:
|
models:
|
||||||
|
your_desired_alias_for_a_model:
|
||||||
class: "model.holtwinters.HoltWinters"
|
class: "model.holtwinters.HoltWinters"
|
||||||
seasonality: '1d'
|
seasonality: '1d'
|
||||||
frequency: '1h'
|
frequency: '1h'
|
||||||
|
@ -156,7 +298,8 @@ The MAD model is a robust method for anomaly detection that is *less sensitive*
|
||||||
|
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
model:
|
models:
|
||||||
|
your_desired_alias_for_a_model:
|
||||||
class: "model.mad.MADModel"
|
class: "model.mad.MADModel"
|
||||||
threshold: 2.5
|
threshold: 2.5
|
||||||
```
|
```
|
||||||
|
@ -175,13 +318,13 @@ Resulting metrics of the model are described [here](#vmanomaly-output).
|
||||||
*Config Example*
|
*Config Example*
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
model:
|
models:
|
||||||
|
your_desired_alias_for_a_model:
|
||||||
class: "model.rolling_quantile.RollingQuantileModel"
|
class: "model.rolling_quantile.RollingQuantileModel"
|
||||||
quantile: 0.9
|
quantile: 0.9
|
||||||
window_steps: 96
|
window_steps: 96
|
||||||
```
|
```
|
||||||
|
|
||||||
|
|
||||||
Resulting metrics of the model are described [here](#vmanomaly-output).
|
Resulting metrics of the model are described [here](#vmanomaly-output).
|
||||||
|
|
||||||
### [Seasonal Trend Decomposition](https://en.wikipedia.org/wiki/Seasonal_adjustment)
|
### [Seasonal Trend Decomposition](https://en.wikipedia.org/wiki/Seasonal_adjustment)
|
||||||
|
@ -198,7 +341,8 @@ Here we use Seasonal Decompose implementation from `statsmodels` [library](https
|
||||||
|
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
model:
|
models:
|
||||||
|
your_desired_alias_for_a_model:
|
||||||
class: "model.std.StdModel"
|
class: "model.std.StdModel"
|
||||||
period: 2
|
period: 2
|
||||||
```
|
```
|
||||||
|
@ -233,7 +377,8 @@ Here we use ARIMA implementation from `statsmodels` [library](https://www.statsm
|
||||||
*Config Example*
|
*Config Example*
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
model:
|
models:
|
||||||
|
your_desired_alias_for_a_model:
|
||||||
class: "model.arima.ArimaModel"
|
class: "model.arima.ArimaModel"
|
||||||
# ARIMA's (p,d,q) order
|
# ARIMA's (p,d,q) order
|
||||||
order: [1, 1, 0]
|
order: [1, 1, 0]
|
||||||
|
@ -264,7 +409,8 @@ Here we use Isolation Forest implementation from `scikit-learn` [library](https:
|
||||||
|
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
model:
|
models:
|
||||||
|
your_desired_alias_for_a_model:
|
||||||
# To use univariate model, substitute class argument with "model.isolation_forest.IsolationForestModel".
|
# To use univariate model, substitute class argument with "model.isolation_forest.IsolationForestModel".
|
||||||
class: "model.isolation_forest.IsolationForestMultivariateModel"
|
class: "model.isolation_forest.IsolationForestMultivariateModel"
|
||||||
contamination: "auto"
|
contamination: "auto"
|
||||||
|
@ -318,6 +464,8 @@ Here in this guide, we will
|
||||||
|
|
||||||
### 1. Custom model
|
### 1. Custom model
|
||||||
|
|
||||||
|
> **Note**: By default, each custom model is created as [**univariate**](#univariate-models) / [**non-rolling**](#non-rolling-models) model. If you want to override this behavior, define models inherited from `RollingModel` (to get a rolling model), or having `is_multivariate` class arg set to `True` (please refer to the code example below).
|
||||||
|
|
||||||
We'll create `custom_model.py` file with `CustomModel` class that will inherit from vmanomaly `Model` base class.
|
We'll create `custom_model.py` file with `CustomModel` class that will inherit from vmanomaly `Model` base class.
|
||||||
In the `CustomModel` class there should be three required methods - `__init__`, `fit` and `infer`:
|
In the `CustomModel` class there should be three required methods - `__init__`, `fit` and `infer`:
|
||||||
* `__init__` method should initiate parameters for the model.
|
* `__init__` method should initiate parameters for the model.
|
||||||
|
@ -327,7 +475,7 @@ In the `CustomModel` class there should be three required methods - `__init__`,
|
||||||
super().__init__(**kwargs)
|
super().__init__(**kwargs)
|
||||||
```
|
```
|
||||||
to initialize the base class each model derives from
|
to initialize the base class each model derives from
|
||||||
* `fit` method should contain the model training process.
|
* `fit` method should contain the model training process. Please be aware that for `RollingModel` defining `fit` method is not needed, as the whole fit/infer process should be defined completely in `infer` method.
|
||||||
* `infer` should return Pandas.DataFrame object with model's inferences.
|
* `infer` should return Pandas.DataFrame object with model's inferences.
|
||||||
|
|
||||||
For the sake of simplicity, the model in this example will return one of two values of `anomaly_score` - 0 or 1 depending on input parameter `percentage`.
|
For the sake of simplicity, the model in this example will return one of two values of `anomaly_score` - 0 or 1 depending on input parameter `percentage`.
|
||||||
|
@ -340,6 +488,7 @@ import scipy.stats as st
|
||||||
import logging
|
import logging
|
||||||
|
|
||||||
from model.model import Model
|
from model.model import Model
|
||||||
|
# from model.model import RollingModel # inherit from it for your model to be of rolling type
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
@ -348,6 +497,10 @@ class CustomModel(Model):
|
||||||
Custom model implementation.
|
Custom model implementation.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
|
# by default, each `Model` will be created as a univariate one
|
||||||
|
# uncomment line below for it to be of multivariate type
|
||||||
|
# is_multivariate = True
|
||||||
|
|
||||||
def __init__(self, percentage: float = 0.95, **kwargs):
|
def __init__(self, percentage: float = 0.95, **kwargs):
|
||||||
super().__init__(**kwargs)
|
super().__init__(**kwargs)
|
||||||
self.percentage = percentage
|
self.percentage = percentage
|
||||||
|
@ -362,7 +515,6 @@ class CustomModel(Model):
|
||||||
if self._std == 0.0:
|
if self._std == 0.0:
|
||||||
self._std = 1 / 65536
|
self._std = 1 / 65536
|
||||||
|
|
||||||
|
|
||||||
def infer(self, df: pd.DataFrame) -> pd.DataFrame:
|
def infer(self, df: pd.DataFrame) -> pd.DataFrame:
|
||||||
# Inference process:
|
# Inference process:
|
||||||
y = df['y']
|
y = df['y']
|
||||||
|
@ -373,7 +525,6 @@ class CustomModel(Model):
|
||||||
df_pred['anomaly_score'] = df_pred['anomaly_score'].astype('int32', errors='ignore')
|
df_pred['anomaly_score'] = df_pred['anomaly_score'].astype('int32', errors='ignore')
|
||||||
|
|
||||||
return df_pred
|
return df_pred
|
||||||
|
|
||||||
```
|
```
|
||||||
|
|
||||||
|
|
||||||
|
@ -381,8 +532,8 @@ class CustomModel(Model):
|
||||||
### 2. Configuration file
|
### 2. Configuration file
|
||||||
|
|
||||||
Next, we need to create `config.yaml` file with VM Anomaly Detection configuration and model input parameters.
|
Next, we need to create `config.yaml` file with VM Anomaly Detection configuration and model input parameters.
|
||||||
In the config file `model` section we need to put our model class `model.custom.CustomModel` and all parameters used in `__init__` method.
|
In the config file `models` section we need to put our model class `model.custom.CustomModel` and all parameters used in `__init__` method.
|
||||||
You can find out more about configuration parameters in vmanomaly docs.
|
You can find out more about configuration parameters in [vmanomaly config docs](/anomaly-detection/components/).
|
||||||
|
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
|
@ -391,7 +542,8 @@ scheduler:
|
||||||
fit_every: "1m"
|
fit_every: "1m"
|
||||||
fit_window: "1d"
|
fit_window: "1d"
|
||||||
|
|
||||||
model:
|
models:
|
||||||
|
your_desired_alias_for_a_model:
|
||||||
# note: every custom model should implement this exact path, specified in `class` field
|
# note: every custom model should implement this exact path, specified in `class` field
|
||||||
class: "model.custom.CustomModel"
|
class: "model.custom.CustomModel"
|
||||||
# custom model params are defined here
|
# custom model params are defined here
|
||||||
|
@ -426,15 +578,13 @@ monitoring:
|
||||||
### 3. Running custom model
|
### 3. Running custom model
|
||||||
Let's pull the docker image for vmanomaly:
|
Let's pull the docker image for vmanomaly:
|
||||||
|
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
docker pull us-docker.pkg.dev/victoriametrics-test/public/vmanomaly-trial:latest
|
docker pull victoriametrics/vmanomaly:latest
|
||||||
```
|
```
|
||||||
|
|
||||||
|
|
||||||
Now we can run the docker container putting as volumes both config and model file:
|
Now we can run the docker container putting as volumes both config and model file:
|
||||||
|
|
||||||
**Note**: place the model file to `/model/custom.py` path when copying
|
> **Note**: place the model file to `/model/custom.py` path when copying
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
docker run -it \
|
docker run -it \
|
||||||
|
@ -442,11 +592,10 @@ docker run -it \
|
||||||
-v [YOUR_LICENSE_FILE_PATH]:/license.txt \
|
-v [YOUR_LICENSE_FILE_PATH]:/license.txt \
|
||||||
-v $(pwd)/custom_model.py:/vmanomaly/src/model/custom.py \
|
-v $(pwd)/custom_model.py:/vmanomaly/src/model/custom.py \
|
||||||
-v $(pwd)/custom.yaml:/config.yaml \
|
-v $(pwd)/custom.yaml:/config.yaml \
|
||||||
us-docker.pkg.dev/victoriametrics-test/public/vmanomaly-trial:latest /config.yaml \
|
victoriametrics/vmanomaly:latest /config.yaml \
|
||||||
--license-file=/license.txt
|
--license-file=/license.txt
|
||||||
```
|
```
|
||||||
|
|
||||||
|
|
||||||
Please find more detailed instructions (license, etc.) [here](/anomaly-detection/overview.html#run-vmanomaly-docker-container)
|
Please find more detailed instructions (license, etc.) [here](/anomaly-detection/overview.html#run-vmanomaly-docker-container)
|
||||||
|
|
||||||
|
|
||||||
|
|
|
@ -128,6 +128,8 @@ monitoring:
|
||||||
### Models Behaviour Metrics
|
### Models Behaviour Metrics
|
||||||
Label names [description](#labelnames)
|
Label names [description](#labelnames)
|
||||||
|
|
||||||
|
> **Note**: A new label key `model_alias` was introduced with multi-model support in [v1.10.0](/anomaly-detection/changelog/#v1100). It keeps the label sets of produced metrics unique when they are written back to VictoriaMetrics.
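A short Python sketch of why the extra label matters once several models run on the same query. The model aliases below are made up, and label sets are modeled as plain tuples purely for illustration.

```python
# Two hypothetical models running inference on the same query.
runs = [
    {"stage": "infer", "query_key": "q1", "model_alias": "zscore_fast"},
    {"stage": "infer", "query_key": "q1", "model_alias": "prophet_daily"},
]


def label_key(labels: dict, keys: tuple) -> tuple:
    """Project a label dict onto the given label names, in a hashable form."""
    return tuple((k, labels[k]) for k in keys)


# Label names before v1.10.0 vs. after it.
old_keys = {label_key(r, ("stage", "query_key")) for r in runs}
new_keys = {label_key(r, ("stage", "query_key", "model_alias")) for r in runs}

print(len(old_keys), len(new_keys))
```

With only `stage` and `query_key`, both runs collapse into one label set; adding `model_alias` keeps them distinct.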
|
||||||
|
|
||||||
<table>
|
<table>
|
||||||
<thead>
|
<thead>
|
||||||
<tr>
|
<tr>
|
||||||
|
@ -142,25 +144,25 @@ Label names [description](#labelnames)
|
||||||
<td><code>vmanomaly_model_runs</code></td>
|
<td><code>vmanomaly_model_runs</code></td>
|
||||||
<td>Counter</td>
|
<td>Counter</td>
|
||||||
<td>How many times models ran (per model)</td>
|
<td>How many times models ran (per model)</td>
|
||||||
<td><code>stage, query_key</code></td>
|
<td><code>stage, query_key, model_alias</code></td>
|
||||||
</tr>
|
</tr>
|
||||||
<tr>
|
<tr>
|
||||||
<td><code>vmanomaly_model_run_duration_seconds</code></td>
|
<td><code>vmanomaly_model_run_duration_seconds</code></td>
|
||||||
<td>Summary</td>
|
<td>Summary</td>
|
||||||
<td>How much time (in seconds) model invocations took</td>
|
<td>How much time (in seconds) model invocations took</td>
|
||||||
<td><code>stage, query_key</code></td>
|
<td><code>stage, query_key, model_alias</code></td>
|
||||||
</tr>
|
</tr>
|
||||||
<tr>
|
<tr>
|
||||||
<td><code>vmanomaly_model_datapoints_accepted</code></td>
|
<td><code>vmanomaly_model_datapoints_accepted</code></td>
|
||||||
<td>Counter</td>
|
<td>Counter</td>
|
||||||
<td>How many datapoints did models accept</td>
|
<td>How many datapoints did models accept</td>
|
||||||
<td><code>stage, query_key</code></td>
|
<td><code>stage, query_key, model_alias</code></td>
|
||||||
</tr>
|
</tr>
|
||||||
<tr>
|
<tr>
|
||||||
<td><code>vmanomaly_model_datapoints_produced</code></td>
|
<td><code>vmanomaly_model_datapoints_produced</code></td>
|
||||||
<td>Counter</td>
|
<td>Counter</td>
|
||||||
<td>How many datapoints were generated by models</td>
|
<td>How many datapoints were generated by models</td>
|
||||||
<td><code>stage, query_key</code></td>
|
<td><code>stage, query_key, model_alias</code></td>
|
||||||
</tr>
|
</tr>
|
||||||
<tr>
|
<tr>
|
||||||
<td><code>vmanomaly_models_active</code></td>
|
<td><code>vmanomaly_models_active</code></td>
|
||||||
|
@ -172,7 +174,7 @@ Label names [description](#labelnames)
|
||||||
<td><code>vmanomaly_model_runs_skipped</code></td>
|
<td><code>vmanomaly_model_runs_skipped</code></td>
|
||||||
<td>Counter</td>
|
<td>Counter</td>
|
||||||
<td>How many times a run was skipped (per model)</td>
|
<td>How many times a run was skipped (per model)</td>
|
||||||
<td><code>stage, query_key</code></td>
|
<td><code>stage, query_key, model_alias</code></td>
|
||||||
</tr>
|
</tr>
|
||||||
</tbody>
|
</tbody>
|
||||||
</table>
|
</table>
|
||||||
|
@ -286,6 +288,8 @@ Label names [description](#labelnames)
|
||||||
|
|
||||||
<code>query_key</code> - query alias from [`reader`](/anomaly-detection/components/reader.html) config section.
|
<code>query_key</code> - query alias from [`reader`](/anomaly-detection/components/reader.html) config section.
|
||||||
|
|
||||||
|
<code>model_alias</code> - model alias from [`models`](/anomaly-detection/components/models.html) config section. **Introduced in [v1.10.0](/anomaly-detection/changelog/#v1100).**
|
||||||
|
|
||||||
<code>url</code> - writer or reader url endpoint.
|
<code>url</code> - writer or reader url endpoint.
|
||||||
|
|
||||||
<code>code</code> - response status code or `connection_error`, `timeout`.
|
<code>code</code> - response status code or `connection_error`, `timeout`.
|
||||||
|
|
|
@ -36,7 +36,7 @@ aliases:
|
||||||
|
|
||||||
All the service parameters are defined in a config file.
|
All the service parameters are defined in a config file.
|
||||||
|
|
||||||
> **Note**: As of the time of writing, in the [1.9.2](https://docs.victoriametrics.com/anomaly-detection/changelog/#v192) release and earlier versions, each `vmanomaly` configuration file is limited to supporting only one model type. To utilize *different models* on your data, it is necessary to run multiple instances of the `vmanomaly` process. Each instance should operate with its own configuration file, differing in the `model` section.
|
> **Note**: Starting from [1.10.0](https://docs.victoriametrics.com/anomaly-detection/changelog/#v1100), each `vmanomaly` configuration file can support more than one model type. To utilize *different models* on your data, it is no longer necessary to run multiple instances of the `vmanomaly` process. Please refer to the [model](/anomaly-detection/models/) config section for more details.
|
||||||
|
|
||||||
|
|
||||||
**vmanomaly** does the following:
|
**vmanomaly** does the following:
|
||||||
|
@ -117,7 +117,7 @@ The configuration file for `vmanomaly` comprises 4 essential sections:
|
||||||
|
|
||||||
1. [`scheduler`](/anomaly-detection/components/scheduler.html) - This section determines the frequency of model inferences and training, including the time range for model training.
|
1. [`scheduler`](/anomaly-detection/components/scheduler.html) - This section determines the frequency of model inferences and training, including the time range for model training.
|
||||||
|
|
||||||
2. [`model`](/anomaly-detection/components/models.html) - Here, you define specific parameters and configurations for the model being used for anomaly detection.
|
2. [`models`](/anomaly-detection/components/models.html) - Here, you define specific parameters and configurations for the models being used for anomaly detection.
|
||||||
|
|
||||||
3. [`reader`](/anomaly-detection/components/reader.html) - This section outlines the methodology for data reading, including the data source location.
|
3. [`reader`](/anomaly-detection/components/reader.html) - This section outlines the methodology for data reading, including the data source location.
|
||||||
|
|
||||||
|
@ -132,7 +132,7 @@ Detailed parameters in each section:
|
||||||
* `fit_every` - Sets the frequency for retraining the models. A higher frequency ensures more updated models but requires more CPU resources. If omitted, models are retrained in each `infer_every` cycle. Format is similar to `infer_every`.
|
* `fit_every` - Sets the frequency for retraining the models. A higher frequency ensures more updated models but requires more CPU resources. If omitted, models are retrained in each `infer_every` cycle. Format is similar to `infer_every`.
|
||||||
* `fit_window` - Defines the data interval for training the models. Longer intervals allow for capturing extensive historical behavior and better seasonal pattern detection but may slow down the model's response to permanent metric changes and increase resource consumption. A minimum of two full seasonal cycles is recommended. Example format: 3h for three hours of data.
|
* `fit_window` - Defines the data interval for training the models. Longer intervals allow for capturing extensive historical behavior and better seasonal pattern detection but may slow down the model's response to permanent metric changes and increase resource consumption. A minimum of two full seasonal cycles is recommended. Example format: 3h for three hours of data.
|
||||||
|
|
||||||
* `model`
|
* `models`
|
||||||
* `class` - Specifies the model to be used. Options include custom models ([guide here](/anomaly-detection/components/models.html#custom-model-guide)) or a selection from [built-in models](/anomaly-detection/components/models.html#built-in-models), such as the [Facebook Prophet](/anomaly-detection/components/models.html#prophet) (`model.prophet.ProphetModel`).
|
* `class` - Specifies the model to be used. Options include custom models ([guide here](/anomaly-detection/components/models.html#custom-model-guide)) or a selection from [built-in models](/anomaly-detection/components/models.html#built-in-models), such as the [Facebook Prophet](/anomaly-detection/components/models.html#prophet) (`model.prophet.ProphetModel`).
|
||||||
* `args` - Model-specific parameters, formatted as a YAML dictionary in the `key: value` structure. Parameters available in [FB Prophet](https://facebook.github.io/prophet/docs/quick_start.html) can be used as an example.
|
* `args` - Model-specific parameters, formatted as a YAML dictionary in the `key: value` structure. Parameters available in [FB Prophet](https://facebook.github.io/prophet/docs/quick_start.html) can be used as an example.
|
||||||
|
|
||||||
|
@ -152,7 +152,8 @@ scheduler:
|
||||||
fit_every: "2m"
|
fit_every: "2m"
|
||||||
fit_window: "3h"
|
fit_window: "3h"
|
||||||
|
|
||||||
model:
|
models:
|
||||||
|
prophet:
|
||||||
class: "model.prophet.ProphetModel"
|
class: "model.prophet.ProphetModel"
|
||||||
args:
|
args:
|
||||||
interval_width: 0.98
|
interval_width: 0.98
|
||||||
|
|
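For illustration only, the renamed `models` section shown in this diff could hold more than one aliased model in a single config file. The sketch below reuses the `prophet` entry from the diff. The second alias, `zscore`, its class `model.zscore.ZscoreModel`, and the `z_threshold` argument are assumptions that do not appear in this commit and should be checked against the models documentation before use.

```yaml
# Sketch of a multi-model `models` section (v1.10.0+ layout).
# Keys directly under `models` are model aliases.
models:
  # Taken from the diff above.
  prophet:
    class: "model.prophet.ProphetModel"
    args:
      interval_width: 0.98
  # Hypothetical second model; class name and argument are assumptions.
  zscore:
    class: "model.zscore.ZscoreModel"
    args:
      z_threshold: 2.5
```

If the aliases map to the `model_alias` label as the label description in this diff suggests, self-monitoring metrics such as `vmanomaly_model_runs` would then be reported separately for `prophet` and `zscore`.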