docs/vmanomaly - release 1.18.5 (#7684)

### Describe Your Changes

docs/vmanomaly - release 1.18.5 doc updates

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).
2025-03-11 15:34:56 +00:00 · 2024-11-28 12:34:45 +01:00 · 2024-11-28 12:34:45 +01:00 · ae673e8b34
commit ae673e8b34
parent 879bba11ba
8 changed files with 121 additions and 28 deletions
--- a/deployment/docker/vmanomaly/vmanomaly-integration/docker-compose.yml
+++ b/deployment/docker/vmanomaly/vmanomaly-integration/docker-compose.yml
@@ -72,7 +72,7 @@ services:
    restart: always
  vmanomaly:
    container_name: vmanomaly
-    image: victoriametrics/vmanomaly:v1.18.4
+    image: victoriametrics/vmanomaly:v1.18.5
    depends_on:
      - "victoriametrics"
    ports:
--- a/docs/anomaly-detection/CHANGELOG.md
+++ b/docs/anomaly-detection/CHANGELOG.md
@@ -11,6 +11,13 @@ aliases:
 ---
 Please find the changelog for VictoriaMetrics Anomaly Detection below.

+## v1.18.5
+Released: 2024-11-27
+
+- IMPROVEMENT: Introduced the ability to run `vmanomaly` using a configuration directory. This enhancement allows users to recursively merge multiple full configuration files (previously limited to merging specific sections, such as `reader`) and execute a single instance of the service with the combined configuration.
+- IMPROVEMENT: Added a new utility, `config_splitter.py`, to streamline the process of splitting a single configuration file into multiple standalone configurations. The configurations are split by specified entities like `schedulers`, `models`, `queries` or `extra_filters`. The split configurations can be saved to a designated directory. It simplifies scaling `vmanomaly` and enhances user experience by automating the process of separating config files so they can be run on separate instances of vmanomaly. For more details, refer to [this section](https://docs.victoriametrics.com/anomaly-detection/faq/#splitting-the-config).
+- IMPROVEMENT: Introduced the ability to configure the [`PeriodicScheduler`](https://docs.victoriametrics.com/anomaly-detection/components/scheduler/?highlight=start_from#periodic-scheduler) to start at a specific time using the `start_from` and `tz` parameters. The `start_from` parameter accepts either `HH:MM` or [ISO 8601 formats](https://en.wikipedia.org/wiki/ISO_8601), with `tz` defaulting to `UTC`. If `start_from` is in the past, the next valid start time is automatically calculated based on the `fit_every` interval.
+
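The recursive merge of full configuration files mentioned for v1.18.5 can be pictured with the minimal sketch below. It is an illustration only, not the actual `vmanomaly` implementation: the `deep_merge` helper and the sample dictionaries are hypothetical, and the conflict rule (for non-mapping values, the later file wins) is an assumption.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge `override` into a copy of `base` (hypothetical helper)."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: descend instead of replacing the section.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Assumption: for scalars and lists the later config wins.
            merged[key] = deepcopy(value)
    return merged


# Placeholder configs standing in for two parsed YAML files.
config_a = {
    "reader": {"queries": {"cpu": "expr_a"}},
    "schedulers": {"s1": {"class": "periodic"}},
}
config_b = {
    "reader": {"queries": {"ram": "expr_b"}},
    "models": {"m1": {"class": "zscore"}},
}

combined = deep_merge(config_a, config_b)
```

With a shallow merge, the `reader` section of `config_b` would replace the one from `config_a`; with the recursive variant, both queries end up under `reader.queries`.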
 ## v1.18.4
 Released: 2024-11-18

--- a/docs/anomaly-detection/FAQ.md
+++ b/docs/anomaly-detection/FAQ.md
@@ -159,7 +159,7 @@ services:
  # ...
  vmanomaly:
    container_name: vmanomaly
-    image: victoriametrics/vmanomaly:v1.18.4
+    image: victoriametrics/vmanomaly:v1.18.5
    # ...
    ports:
      - "8490:8490"
@@ -323,18 +323,67 @@ Please note that this approach may not fully resolve the issue if subqueries are


 ## Scaling vmanomaly
+
 > **Note:** As of the latest release, a cluster or auto-scaled version is not supported yet (it is on our roadmap, along with better backends, more parallelization, etc.), so the proposed workarounds should be applied *manually*.

-`vmanomaly` can be scaled horizontally by launching multiple independent instances, each with its own [MetricsQL](https://docs.victoriametrics.com/metricsql/) queries and [configurations](https://docs.victoriametrics.com/anomaly-detection/components/):
+`vmanomaly` supports **vertical** scalability, benefiting from additional CPU cores (resulting in faster processing times) and increased RAM (allowing more models to be trained and larger volumes of timeseries data to be processed efficiently).

- By splitting **queries**, [defined in reader section](https://docs.victoriametrics.com/anomaly-detection/components/reader#vm-reader) and spawn separate service around it. Also in case you have *only 1 query returning huge amount of timeseries*, you can further split it by applying MetricsQL filters, i.e. using "extra_filters" [param in reader](https://docs.victoriametrics.com/anomaly-detection/components/reader?highlight=extra_filters#vm-reader). See the example below.
+For **horizontal** scalability, `vmanomaly` can be deployed as multiple independent instances, each configured with its own [MetricsQL](https://docs.victoriametrics.com/metricsql/) queries and [configurations](https://docs.victoriametrics.com/anomaly-detection/components/):

- or **models** (in case you decide to run several models for each timeseries received i.e. for averaging anomaly scores in your alerting rules of `vmalert` or using a vote approach to reduce false positives) - see `queries` arg in [model config](https://docs.victoriametrics.com/anomaly-detection/components/models#queries)
+- Splitting by **queries** [defined in the reader section](https://docs.victoriametrics.com/anomaly-detection/components/reader#vm-reader) assigns each subset of queries to a separate service instance. When *a single query returns a large number of timeseries*, it can be further split by applying global MetricsQL filters using the `extra_filters` [parameter in the reader](https://docs.victoriametrics.com/anomaly-detection/components/reader?highlight=extra_filters#vm-reader). See the example below.

- or **schedulers** (in case you want the same models to be trained under several schedules) - see `schedulers` arg [model section](https://docs.victoriametrics.com/anomaly-detection/components/models#schedulers) and `scheduler` [component itself](https://docs.victoriametrics.com/anomaly-detection/components/scheduler)
+- Splitting by **models** should be used when running multiple models on the same query. This is commonly done to reduce false positives by alerting only if multiple models detect an anomaly. See the `queries` argument in the [model configuration](https://docs.victoriametrics.com/anomaly-detection/components/models#queries). Additionally, this approach is useful when you have a large set of resource-intensive independent models.

+- Splitting by **schedulers** should be used when the same models need to be trained or inferred under different schedules. Refer to the `schedulers` argument in the [model section](https://docs.victoriametrics.com/anomaly-detection/components/models#schedulers) and the `scheduler` [component documentation](https://docs.victoriametrics.com/anomaly-detection/components/scheduler).

-Here's an example of how to split on `extra_filters`, based on `extra_filters` reader's arg:
+### Splitting the config
+
+Starting from [v1.18.5](https://docs.victoriametrics.com/anomaly-detection/changelog/#v1185), a CLI utility named `config_splitter.py` is available in vmanomaly. The config splitter tool splits a parent vmanomaly YAML configuration file into multiple sub-configurations based on logical entities such as `schedulers`, `queries`, `models`, or `extra_filters`. The resulting sub-configurations are fully validated and functional, and they account for many-to-many relationships between models, their associated queries, and the schedulers they are linked to. These sub-configurations can then be saved to a specified directory for further use:
+
+```shellhelp
+usage: config_splitter.py [-h] --splitBy {schedulers,models,queries,extra_filters} --outputDir OUTPUT_DIR [--fileNameFormat {raw,hash,int}] [--loggerLevel {WARNING,INFO,ERROR,FATAL,DEBUG}]
+                          config [config ...]
+
+Splits the configuration of VictoriaMetrics Anomaly Detection service by a logical entity.
+
+positional arguments:
+  config                YAML config files to be combined into a single configuration.
+
+options:
+  -h                    show this help message and exit
+  --splitBy {schedulers,models,queries,extra_filters}
+                        The logical entity to split by. Choices: ['schedulers', 'models', 'queries', 'extra_filters'].
+  --outputDir OUTPUT_DIR
+                        Directory where the split configuration files will be saved.
+  --fileNameFormat {raw,hash,int}
+                        The naming format for the output configuration files. Choices: raw (use the entity alias), hash (use hashed alias), int (use a sequential integer from 0 to N for N
+                        produced sub-configs). Default: raw.
+  --loggerLevel {WARNING,INFO,ERROR,FATAL,DEBUG}
+                        Minimum level to log. Default: INFO
+```
+
+Here’s an example of using the config splitter to divide configurations based on the `extra_filters` argument from the reader section:
+
+```sh
+docker pull victoriametrics/vmanomaly:v1.18.5 && docker image tag victoriametrics/vmanomaly:v1.18.5 vmanomaly
+```
+
+```sh
+# host paths for `-v` bind mounts should be absolute
+export YOUR_INPUT_CONFIG_PATH=/path/to/input/config.yml
+export YOUR_OUTPUT_DIR_PATH=/path/to/output/directory
+
+docker run -it --rm \
+    -v "$YOUR_INPUT_CONFIG_PATH":/input_config.yml \
+    -v "$YOUR_OUTPUT_DIR_PATH":/output_dir \
+    vmanomaly python3 /vmanomaly/config_splitter.py \
+    /input_config.yml \
+    --splitBy=extra_filters \
+    --outputDir=/output_dir \
+    --fileNameFormat=raw \
+    --loggerLevel=INFO
+```
+
+After running the command, the output directory (specified by `YOUR_OUTPUT_DIR_PATH`) will contain one or more split configuration files like the examples below. Each file can be used to launch a separate vmanomaly instance. Use a similar approach to split by other entities, such as `models` or `schedulers`.

 ```yaml
 # config file #1, for 1st vmanomaly instance
@@ -345,7 +394,7 @@ reader:
    extra_big_query: metricsql_expression_returning_too_many_timeseries
    extra_filters:
      # suppose you have a label `region` with values to deterministically define such subsets
-      - '{region="region_name_1"}'
+      - '{env="region_name_1"}'
      # ...
 ```

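The idea of splitting by `extra_filters` shown in the hunk above can be sketched as follows. This is a simplified illustration and not the code of `config_splitter.py`: the `split_by_extra_filters` function is hypothetical, it ignores validation and the other split modes, and it assumes that the `region` label mentioned in the example comment is the one used in the filters.

```python
from copy import deepcopy


def split_by_extra_filters(config: dict) -> list[dict]:
    """Return one sub-config per `reader.extra_filters` entry (illustrative only)."""
    filters = config.get("reader", {}).get("extra_filters", [])
    if not filters:
        # Nothing to split on: return an unchanged copy of the parent config.
        return [deepcopy(config)]
    parts = []
    for flt in filters:
        part = deepcopy(config)
        # Each sub-config keeps the full parent config but a single filter.
        part["reader"]["extra_filters"] = [flt]
        parts.append(part)
    return parts


# Placeholder parent config, as if loaded from YAML.
parent = {
    "reader": {
        "queries": {
            "extra_big_query": "metricsql_expression_returning_too_many_timeseries"
        },
        "extra_filters": ['{region="region_name_1"}', '{region="region_name_2"}'],
    }
}

parts = split_by_extra_filters(parent)
```

Each element of `parts` could then be written to its own YAML file and passed to a separate `vmanomaly` instance, so that every instance processes only the timeseries matching its filter.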
--- a/docs/anomaly-detection/Overview.md
+++ b/docs/anomaly-detection/Overview.md
@@ -229,7 +229,7 @@ This will expose metrics at `http://0.0.0.0:8080/metrics` page.
 To use *vmanomaly* you need to pull docker image:

 ```sh
-docker pull victoriametrics/vmanomaly:v1.18.4
+docker pull victoriametrics/vmanomaly:v1.18.5
 ```

 > Note: please check what is latest release in [CHANGELOG](https://docs.victoriametrics.com/anomaly-detection/changelog/)
@@ -239,7 +239,7 @@ docker pull victoriametrics/vmanomaly:v1.18.4
 You can put a tag on it for your convenience:

 ```sh
-docker image tag victoriametrics/vmanomaly:v1.18.4 vmanomaly
+docker image tag victoriametrics/vmanomaly:v1.18.5 vmanomaly
 ```
 Here is an example of how to run *vmanomaly* docker container with [license file](#licensing):

--- a/docs/anomaly-detection/QuickStart.md
+++ b/docs/anomaly-detection/QuickStart.md
@@ -29,22 +29,27 @@ The following options are available:

 The `vmanomaly` service supports several command-line arguments to configure its behavior, including options for licensing, logging levels, and more. These arguments can be passed when starting the service via Docker or any other setup. Below is the list of available options:

+> **Note**: Starting from [v1.18.5](https://docs.victoriametrics.com/anomaly-detection/changelog/#v1185), `vmanomaly` supports running on config *directories*; see the `config` positional arg description in the help message below.

 ```shellhelp
+usage: vmanomaly.py [-h] [--license STRING | --licenseFile PATH] [--license.forceOffline] [--loggerLevel {INFO,DEBUG,ERROR,WARNING,FATAL}] [--watch] config [config ...]
+
 VictoriaMetrics Anomaly Detection Service

 positional arguments:
-  config                YAML config file. Multiple files will override each other's top level values (aka shallow merge), so multiple configs can be combined.
+  config                YAML config file(s) or directories containing YAML files. Multiple files will recursively merge each other values so multiple configs can be combined. If a directory
+                        is provided, all `.yaml` files inside will be merged, without recursion. Default: vmanomaly.yaml is expected in the current directory.

 options:
  -h                    show this help message and exit
  --license STRING      License key for VictoriaMetrics Enterprise. See https://victoriametrics.com/products/enterprise/trial/ to obtain a trial license.
  --licenseFile PATH    Path to file with license key for VictoriaMetrics Enterprise. See https://victoriametrics.com/products/enterprise/trial/ to obtain a trial license.
  --license.forceOffline 
-                        Whether to force offline verification for VictoriaMetrics Enterprise license key, which has been passed either via -license or via -licenseFile command-line flag.
-                        The issued license key must support offline verification feature. Contact info@victoriametrics.com if you need offline license verification.
-  --loggerLevel {FATAL,WARNING,ERROR,DEBUG,INFO}
+                        Whether to force offline verification for VictoriaMetrics Enterprise license key, which has been passed either via -license or via -licenseFile command-line flag. The
+                        issued license key must support offline verification feature. Contact info@victoriametrics.com if you need offline license verification.
+  --loggerLevel {INFO,DEBUG,ERROR,WARNING,FATAL}
                        Minimum level to log. Possible values: DEBUG, INFO, WARNING, ERROR, FATAL.
+  --watch               [DEPRECATED SINCE v1.11.0] Watch config files for changes. This option is no longer supported and will be ignored.
 ```

 You can specify these options when running `vmanomaly` to fine-tune logging levels or handle licensing configurations, as per your requirements.
@@ -58,13 +63,13 @@ Below are the steps to get `vmanomaly` up and running inside a Docker container:
 1. Pull Docker image:

 ```sh
-docker pull victoriametrics/vmanomaly:v1.18.4
+docker pull victoriametrics/vmanomaly:v1.18.5
 ```

 2. (Optional step) tag the `vmanomaly` Docker image:

 ```sh
-docker image tag victoriametrics/vmanomaly:v1.18.4 vmanomaly
+docker image tag victoriametrics/vmanomaly:v1.18.5 vmanomaly
 ```

 3. Start the `vmanomaly` Docker container with a *license file*, use the command below.
@@ -98,7 +103,7 @@ docker run -it --user 1000:1000 \
 services:
  # ...
  vmanomaly:
-    image: victoriametrics/vmanomaly:v1.18.4
+    image: victoriametrics/vmanomaly:v1.18.5
    volumes:
        $YOUR_LICENSE_FILE_PATH:/license
        $YOUR_CONFIG_FILE_PATH:/config.yml
--- a/docs/anomaly-detection/components/models.md
+++ b/docs/anomaly-detection/components/models.md
@@ -993,7 +993,7 @@ monitoring:
 Let's pull the docker image for `vmanomaly`:

 ```sh
-docker pull victoriametrics/vmanomaly:v1.18.4
+docker pull victoriametrics/vmanomaly:v1.18.5
 ```

 Now we can run the docker container putting as volumes both config and model file:
@@ -1007,7 +1007,7 @@ docker run -it \
 -v "$PWD/license":/license \
 -v "$PWD/custom_model.py":/vmanomaly/model/custom.py \
 -v "$PWD/custom.yaml":/config.yaml \
-victoriametrics/vmanomaly:v1.18.4 /config.yaml \
+victoriametrics/vmanomaly:v1.18.5 /config.yaml \
 --licenseFile=/license
 ```

--- a/docs/anomaly-detection/components/scheduler.md
+++ b/docs/anomaly-detection/components/scheduler.md
@@ -121,7 +121,7 @@ Examples: `"50s"`, `"4m"`, `"3h"`, `"2d"`, `"1w"`.
            <td>str</td>
            <td>

-`"14d"`
+`14d`
            </td>
            <td>What time range to use for training the models. Must be at least 1 second.</td>
        </tr>
@@ -133,9 +133,9 @@ Examples: `"50s"`, `"4m"`, `"3h"`, `"2d"`, `"1w"`.
            <td>str</td>
            <td>

-`"1m"`
+`1m`
            </td>
-            <td>How often a model will write its conclusions on newly added data. Must be at least 1 second.</td>
+            <td>How often a model produces and writes its anomaly scores on new datapoints. Must be at least 1 second.</td>
        </tr>
        <tr>
            <td>
@@ -145,11 +145,41 @@ Examples: `"50s"`, `"4m"`, `"3h"`, `"2d"`, `"1w"`.
            <td>str, Optional</td>
            <td>

-`"1h"`
+`1h`
            </td>
            <td>

-How often to completely retrain the models. If missing value of `infer_every` is used and retrain on every inference run.
+How often to completely retrain the models. If not set, the value of `infer_every` is used and retraining happens on every inference run.
+            </td>
+        </tr>
+        <tr>
+            <td>
+
+`start_from`
+            </td>
+            <td>str, Optional</td>
+            <td>
+
+`2024-11-26T01:00:00Z`, `01:00`
+            </td>
+            <td>
+
+Available since [v1.18.5](https://docs.victoriametrics.com/anomaly-detection/changelog/#v1185). Specifies when to initiate the first `fit_every` call. Accepts either an ISO 8601 datetime or a time in HH:MM format. If the specified time is in the past, the next suitable time is calculated based on the `fit_every` interval. For the HH:MM format, if the time is in the past, it will be scheduled for the same time on the following day, respecting the `tz` argument if provided. The timezone defaults to `UTC`.
+            </td>
+        </tr>
+        <tr>
+            <td>
+
+`tz`
+            </td>
+            <td>str, Optional</td>
+            <td>
+
+`America/New_York`
+            </td>
+            <td>
+
+Available since [v1.18.5](https://docs.victoriametrics.com/anomaly-detection/changelog/#v1185). Defines the local timezone for the `start_from` parameter, if specified. Defaults to `UTC` if no timezone is provided.
            </td>
        </tr>
    </tbody>
@@ -161,13 +191,15 @@ How often to completely retrain the models. If missing value of `infer_every` is
 schedulers:
  periodic_scheduler_alias:
    class: "periodic"
-    # (or class: "scheduler.periodic.PeriodicScheduler" until v1.13.0 with class alias support)
+    # (or class: "scheduler.periodic.PeriodicScheduler" for versions before v1.13.0, without class alias support)
    fit_window: "14d" 
    infer_every: "1m" 
-    fit_every: "1h" 
+    fit_every: "1h"
+    start_from: "20:00"  # If launched before 20:00 (local Kyiv time), the first run starts today at 20:00. Otherwise, it starts tomorrow at 20:00.
+    tz: "Europe/Kyiv"  # Defaults to 'UTC' if not specified.
 ```

-This part of the config means that `vmanomaly` will calculate the time window of the previous 14 days and use it to train a model. Every hour model will be retrained again on 14 days’ data, which will include + 1 hour of new data. The time window is strictly the same 14 days and doesn't extend for the next retrains. Every minute `vmanomaly` will produce model inferences for newly added data points by using the model that is kept in memory at that time.
+This configuration specifies that `vmanomaly` will calculate a 14-day time window, counted back from the time of each `fit_every` call, to train the model. Starting at 20:00 Kyiv local time today (or tomorrow if launched after 20:00), the model will be retrained every hour using the most recent 14-day window, which always includes an additional hour of new data. The time window remains strictly 14 days and does not extend with subsequent retrains. Additionally, `vmanomaly` will perform model inference every minute, processing newly added data points using the most recent model.

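The `start_from` rules described for the periodic scheduler can be approximated with the sketch below. It is a hedged illustration, not the scheduler's real code: the `next_start` function is hypothetical, it takes a `tzinfo` object rather than a timezone name (a name such as `Europe/Kyiv` could be converted with `zoneinfo.ZoneInfo`), and the rounding rule for past ISO 8601 datetimes (jump to the first `fit_every` multiple strictly after now) is an assumption.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def next_start(start_from: str, fit_every: timedelta, now: datetime,
               tz: tzinfo = timezone.utc) -> datetime:
    """Hypothetical helper: when the first fit would run for a given `start_from`."""
    now = now.astimezone(tz)  # `now` is expected to be timezone-aware
    if "T" in start_from:
        # ISO 8601 datetime; "Z" is rewritten so older Python versions can parse it.
        start = datetime.fromisoformat(start_from.replace("Z", "+00:00"))
        if start.tzinfo is None:
            start = start.replace(tzinfo=tz)
        if start >= now:
            return start
        # Start is in the past: skip ahead by whole `fit_every` intervals.
        missed = (now - start) // fit_every + 1
        return start + missed * fit_every
    # HH:MM format: today at that time in `tz`, or tomorrow if already passed.
    hours, minutes = map(int, start_from.split(":"))
    candidate = now.replace(hour=hours, minute=minutes, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate
```

For instance, with `start_from="20:00"`, a UTC+2 timezone and a launch at 19:00 local time, the helper returns 20:00 of the same day; a launch at 21:00 local time yields 20:00 of the next day.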
 ## Oneoff scheduler 

--- a/docs/anomaly-detection/guides/guide-vmanomaly-vmalert/README.md
+++ b/docs/anomaly-detection/guides/guide-vmanomaly-vmalert/README.md
@@ -385,7 +385,7 @@ services:
    restart: always
  vmanomaly:
    container_name: vmanomaly
-    image: victoriametrics/vmanomaly:v1.18.4
+    image: victoriametrics/vmanomaly:v1.18.5
    depends_on:
      - "victoriametrics"
    ports: