docs: document IndexDB (#6840)

### Describe Your Changes

This is an attempt to document IndexDB. I tried to cover the points most likely to interest end users without going into too much detail (for example, I did not enumerate and describe all the specific record types). Please take a look; any suggestions are very welcome.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to the [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: Artem Fetishev <wwctrsrx@gmail.com>
2024-11-21 14:44:00 +00:00 · 2024-08-27 14:52:46 +03:00 · 2024-08-27 14:52:46 +03:00 · e7f1297517
commit e7f1297517
parent e97e966f82
1 changed files with 44 additions and 5 deletions
--- a/docs/README.md
+++ b/docs/README.md
@@ -1202,8 +1202,8 @@ Using the delete API is not recommended in the following cases, since it brings
 * Reducing disk space usage by deleting unneeded time series. This doesn't work as expected, since the deleted
  time series occupy disk space until the next merge operation, which can never occur when deleting too old data.
  [Forced merge](#forced-merge) may be used for freeing up disk space occupied by old data.
-  Note that VictoriaMetrics doesn't delete entries from inverted index (aka `indexdb`) for the deleted time series.
+  Note that VictoriaMetrics doesn't delete entries from [IndexDB](#indexdb) for the deleted time series.
-  Inverted index is cleaned up once per the configured [retention](#retention).
+  IndexDB is cleaned up once per the configured [retention](#retention).
 It's better to use the `-retentionPeriod` command-line flag for efficient pruning of old data.
@@ -1905,7 +1905,46 @@ See more details in [monitoring docs](#monitoring).
 See [this article](https://valyala.medium.com/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282) for more details.
-See also [how to work with snapshots](#how-to-work-with-snapshots).
+See also [how to work with snapshots](#how-to-work-with-snapshots) and [IndexDB](#indexdb).
 ## IndexDB
 VictoriaMetrics identifies
 [time series](https://docs.victoriametrics.com/keyconcepts/#time-series) by
 `TSID` (time series ID) and stores
 [raw samples](https://docs.victoriametrics.com/keyconcepts/#raw-samples) sorted
 by TSID (see [Storage](#storage)). Thus, the TSID is a primary index and could
 be used for searching and retrieving raw samples. However, the TSID is never
 exposed to the clients, i.e. it is for internal use only.
 Instead, VictoriaMetrics maintains an **inverted index** that enables searching
 the raw samples by metric name, label name, and label value by mapping these
 values to the corresponding TSIDs.
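
As a rough illustration of the idea (not the actual IndexDB implementation), an inverted index can be sketched as a map from label pairs to sets of TSIDs. The `index`, `add_series`, and `search` names below are hypothetical, and the sketch assumes that the metric name is treated as the `__name__` label:

```python
from collections import defaultdict

# Hypothetical in-memory inverted index: (label name, label value) -> set of TSIDs.
index = defaultdict(set)


def add_series(tsid, labels):
    """Register every label pair of a series under its TSID."""
    for name, value in labels.items():
        index[(name, value)].add(tsid)


def search(filters):
    """Return the sorted TSIDs of the series that match all label filters."""
    sets = [index.get((name, value), set()) for name, value in filters.items()]
    return sorted(set.intersection(*sets)) if sets else []


add_series(1, {"__name__": "http_requests_total", "job": "api"})
add_series(2, {"__name__": "http_requests_total", "job": "web"})
add_series(3, {"__name__": "cpu_usage", "job": "api"})

print(search({"__name__": "http_requests_total", "job": "api"}))  # [1]
```

The TSIDs returned by such a lookup would then be used to fetch the raw samples, so the client never needs to know them.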
 VictoriaMetrics uses two types of inverted indexes:
 -   Global index. Searches using this index are performed across the entire
     retention period.
 -   Per-day index. This index stores mappings similar to the ones in the
     global index but also includes the date in each mapping. This speeds up
     data retrieval for queries within a shorter time range (which is often
     just the last day).
 When the search query is executed, VictoriaMetrics decides which index to use
 based on the time range of the query:
 -   Per-day index is used if the search time range is 40 days or less.
 -   Global index is used for search queries with a time range greater than 40
    days.
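
The selection rule above can be sketched as a small function. This is only an illustration: the `choose_index` name and the constant are hypothetical, and measuring the range as the difference between the end and start timestamps is an assumption rather than a description of the real code.

```python
from datetime import datetime, timedelta

# Threshold taken from the text above; the constant name is hypothetical.
MAX_DAYS_FOR_PER_DAY_INDEX = 40


def choose_index(start, end):
    """Pick the index type for a query covering the range [start, end]."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    if end - start <= timedelta(days=MAX_DAYS_FOR_PER_DAY_INDEX):
        return "per-day"
    return "global"


print(choose_index(datetime(2024, 8, 26), datetime(2024, 8, 27)))  # per-day
print(choose_index(datetime(2024, 1, 1), datetime(2024, 8, 27)))   # global
```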
 Mappings are added to the indexes during the data ingestion:
 -   In the global index, each mapping is created only once per retention
     period.
 -   In the per-day index, each mapping is created for each unique date that
     has been seen in the samples for the corresponding time series.
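
To make the difference concrete, here is a toy model that only counts which mappings exist after ingestion. The `ToyIndexDB` class is hypothetical and ignores retention; it assumes that a sample's date is derived from its UTC timestamp.

```python
from datetime import datetime, timezone


class ToyIndexDB:
    """Hypothetical stand-in for IndexDB that only tracks which mappings exist."""

    def __init__(self):
        self.global_index = set()   # one entry per series
        self.per_day_index = set()  # one entry per (date, series)

    def ingest(self, series, timestamp):
        # Set semantics make repeated registrations a no-op.
        self.global_index.add(series)
        self.per_day_index.add((timestamp.date(), series))


db = ToyIndexDB()
series = 'cpu_usage{job="api"}'
for ts in (
    datetime(2024, 8, 26, 10, 0, tzinfo=timezone.utc),
    datetime(2024, 8, 26, 18, 0, tzinfo=timezone.utc),
    datetime(2024, 8, 27, 9, 0, tzinfo=timezone.utc),
):
    db.ingest(series, ts)

print(len(db.global_index))   # 1
print(len(db.per_day_index))  # 2
```

Three samples spread over two dates produce a single global mapping but two per-day mappings, which is why the per-day index grows with both the number of series and the number of days they are active.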
 IndexDB respects [retention period](#retention) and once it is over, the indexes
 are dropped. For the new retention period, the indexes are gradually populated
 again as the new samples arrive.
 ## Retention
@@ -1969,8 +2008,8 @@ For example, the following config sets 3 days retention for time series with `te
 Important notes:
 - The data outside the configured retention isn't deleted instantly - it is deleted eventually during [background merges](https://docs.victoriametrics.com/#storage).
-- The `-retentionFilter` doesn't remove old data from `indexdb` (aka inverted index) until the configured [-retentionPeriod](#retention).
+- The `-retentionFilter` doesn't remove old data from [IndexDB](#indexdb) until the configured [-retentionPeriod](#retention).
-  So the `indexdb` size can grow big under [high churn rate](https://docs.victoriametrics.com/faq/#what-is-high-churn-rate)
+  So the IndexDB size can grow big under [high churn rate](https://docs.victoriametrics.com/faq/#what-is-high-churn-rate)
  even for small retentions configured via `-retentionFilter`.
 It is safe updating `-retentionFilter` during VictoriaMetrics restarts - the updated retention filters are applied eventually