mirror of
https://github.com/VictoriaMetrics/VictoriaMetrics.git
synced 2024-11-21 14:44:00 +00:00
29b7c0b4a6
Suggest `-search.maxStalenessInterval=10s` instead of `-search.maxStalenessInterval=1ms`, since `1ms` would result in empty graphs in most cases, since the interval between data points on the graph is usually much higher than 1ms. For example, if the graph shows time range of one hour and it contains 1000 points, then the interval between points on the graph would equal to 3600s/1000=3.6 seconds.
275 lines
15 KiB
Markdown
275 lines
15 KiB
Markdown
# Migrate from InfluxDB to VictoriaMetrics
|
|
|
|
InfluxDB is a well-known time series database built for
|
|
[IoT](https://en.wikipedia.org/wiki/Internet_of_things) monitoring, Application Performance Monitoring (APM) and
|
|
analytics. It has its query language, unique data model, and rich tooling for collecting and processing metrics.
|
|
|
|
Nowadays, the volume of time series data grows constantly, as well as requirements for durable time series storage. And
|
|
sometimes old known solutions just can't keep up with the new expectations.
|
|
|
|
VictoriaMetrics is a high-performance opensource time series database specifically designed to deal with huge volumes of
|
|
monitoring data while remaining cost-efficient at the same time. Many companies are choosing to migrate from InfluxDB to
|
|
VictoriaMetrics specifically for performance and scalability reasons. Along them see case studies provided by
|
|
[ARNES](https://docs.victoriametrics.com/CaseStudies.html#arnes)
|
|
and [Brandwatch](https://docs.victoriametrics.com/CaseStudies.html#brandwatch).
|
|
|
|
This guide will cover the differences between two solutions, most commonly asked questions, and approaches for migrating
|
|
from InfluxDB to VictoriaMetrics.
|
|
|
|
## Data model differences
|
|
|
|
While readers are likely familiar
|
|
with [InfluxDB key concepts](https://docs.influxdata.com/influxdb/v2.2/reference/key-concepts/), the data model of
|
|
VictoriaMetrics is something [new to explore](https://docs.victoriametrics.com/keyConcepts.html#data-model). Let's start
|
|
with similarities and differences:
|
|
|
|
* both solutions are **schemaless**, which means there is no need to define metrics or their tags in advance;
|
|
* multi-dimensional data support is implemented
|
|
via [tags](https://docs.influxdata.com/influxdb/v2.2/reference/key-concepts/data-elements/#tags)
|
|
in InfluxDB and via [labels](https://docs.victoriametrics.com/keyConcepts.html#structure-of-a-metric) in
|
|
VictoriaMetrics. However, labels in VictoriaMetrics are always `strings`, while InfluxDB supports multiple data types;
|
|
* timestamps are stored with nanosecond resolution in InfluxDB, while in VictoriaMetrics it is **milliseconds**;
|
|
* in VictoriaMetrics metric's value is always `float64`, while InfluxDB supports multiple data types.
|
|
* there are
|
|
no [measurements](https://docs.influxdata.com/influxdb/v2.2/reference/key-concepts/data-elements/#measurement)
|
|
or [fields](https://docs.influxdata.com/influxdb/v2.2/reference/key-concepts/data-elements/#field-key) in
|
|
VictoriaMetrics, metric name contains it all. If measurement contains more than 1 field, then for VictoriaMetrics
|
|
it will be multiple metrics;
|
|
* there are no [buckets](https://docs.influxdata.com/influxdb/v2.2/reference/key-concepts/data-elements/#bucket)
|
|
or [organizations](https://docs.influxdata.com/influxdb/v2.2/reference/key-concepts/data-elements/#organization), all
|
|
data in VictoriaMetrics is stored in a global namespace or within
|
|
a [tenant](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy).
|
|
|
|
Let's consider the
|
|
following [sample data](https://docs.influxdata.com/influxdb/v2.2/reference/key-concepts/data-elements/#sample-data)
|
|
borrowed from InfluxDB docs as an example:
|
|
|
|
| _measurement | _field | location | scientist | _value | _time |
|
|
|--------------|--------|----------|-------------|--------|----------------------|
|
|
| census | bees | klamath | anderson | 23 | 2019-08-18T00:00:00Z |
|
|
| census | ants | portland | mullen | 30 | 2019-08-18T00:00:00Z |
|
|
| census | bees | klamath | anderson | 28 | 2019-08-18T00:06:00Z |
|
|
| census | ants | portland | mullen | 32 | 2019-08-18T00:06:00Z |
|
|
|
|
In VictoriaMetrics data model this sample will have the following form:
|
|
|
|
| metric name | labels | value | time |
|
|
|-------------|:---------------------------------------------|-------|----------------------|
|
|
| census_bees | {location="klamath", scientist="anderson"} | 23 | 2019-08-18T00:00:00Z |
|
|
| census_ants | {location="portland", scientist="mullen"} | 30 | 2019-08-18T00:00:00Z |
|
|
| census_bees | {location="klamath", scientist="anderson"} | 28 | 2019-08-18T00:06:00Z |
|
|
| census_ants | {location="portland", scientist="mullen"} | 32 | 2019-08-18T00:06:00Z |
|
|
|
|
Actually, metric name for VictoriaMetrics is also a label with static name `__name__`, and example above can be
|
|
converted to `{__name__="census_bees", location="klamath", scientist="anderson"}`. All labels are indexed by
|
|
VictoriaMetrics, so lookups by names or labels have the same query speed.
|
|
|
|
|
|
## Write data
|
|
|
|
VictoriaMetrics
|
|
supports [InfluxDB line protocol](https://docs.victoriametrics.com/#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf)
|
|
for data ingestion. For example, to write a measurement to VictoriaMetrics we need to send an HTTP POST request with
|
|
payload in a line protocol format:
|
|
|
|
```bash
|
|
curl -d 'census,location=klamath,scientist=anderson bees=23 1566079200000' -X POST 'http://<victoriametric-addr>:8428/write'
|
|
```
|
|
|
|
_hint: timestamp in the example might be out of configured retention for VictoriaMetrics. Consider increasing the
|
|
retention period or changing the timestamp, if that is the case._
|
|
|
|
Please note, an arbitrary number of lines delimited by `\n` (aka newline char) can be sent in a single request.
|
|
|
|
To get the written data back let's export all series matching the `location="klamath"` filter:
|
|
|
|
```bash
|
|
curl -G 'http://<victoriametric-addr>:8428/api/v1/export' -d 'match={location="klamath"}'
|
|
```
|
|
|
|
The expected response is the following:
|
|
|
|
```json
|
|
{
|
|
"metric": {
|
|
"__name__": "census_bees",
|
|
"location": "klamath",
|
|
"scientist": "anderson"
|
|
},
|
|
"values": [
|
|
23
|
|
],
|
|
"timestamps": [
|
|
1566079200000
|
|
]
|
|
}
|
|
```
|
|
|
|
Please note, VictoriaMetrics performed additional
|
|
[data mapping](https://docs.victoriametrics.com/#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf)
|
|
to the data ingested via InfluxDB line protocol.
|
|
|
|
Support of InfluxDB line protocol also means VictoriaMetrics is compatible with
|
|
[Telegraf](https://github.com/influxdata/telegraf). To configure Telegraf, simply
|
|
add `http://<victoriametric-addr>:8428` URL to Telegraf configs:
|
|
|
|
```
|
|
[[outputs.influxdb]]
|
|
urls = ["http://<victoriametrics-addr>:8428"]
|
|
```
|
|
|
|
In addition to InfluxDB line protocol, VictoriaMetrics supports many other ways for
|
|
[metrics collection](https://docs.victoriametrics.com/keyConcepts.html#write-data).
|
|
|
|
## Query data
|
|
|
|
VictoriaMetrics does not have a com\mand-line interface (CLI). Instead, it provides
|
|
an [HTTP API](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#prometheus-querying-api-usage)
|
|
for serving read queries. This API is used in various integrations such as
|
|
[Grafana](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#grafana-setup). The same API is also used
|
|
by [VMUI](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#vmui) - a graphical User Interface for
|
|
querying and visualizing metrics:
|
|
|
|
{% include img.html href="migrate-from-influx-vmui.png" %}
|
|
|
|
See more about [how to query data in VictoriaMetrics](https://docs.victoriametrics.com/keyConcepts.html#query-data).
|
|
|
|
### Basic concepts
|
|
|
|
Let's take a closer look at querying specific with the following data sample:
|
|
|
|
```sql
|
|
foo
|
|
,instance=localhost bar=1.00 1652169600000000000
|
|
foo,instance=localhost bar=2.00 1652169660000000000
|
|
foo,instance=localhost bar=3.00 1652169720000000000
|
|
foo,instance=localhost bar=5.00 1652169840000000000
|
|
foo,instance=localhost bar=5.50 1652169960000000000
|
|
foo,instance=localhost bar=5.50 1652170020000000000
|
|
foo,instance=localhost bar=4.00 1652170080000000000
|
|
foo,instance=localhost bar=3.50 1652170260000000000
|
|
foo,instance=localhost bar=3.25 1652170320000000000
|
|
foo,instance=localhost bar=3.00 1652170380000000000
|
|
foo,instance=localhost bar=2.00 1652170440000000000
|
|
foo,instance=localhost bar=1.00 1652170500000000000
|
|
foo,instance=localhost bar=4.00 1652170560000000000
|
|
```
|
|
|
|
The data sample consists data points for a measurement `foo`
|
|
and a field `bar` with additional tag `instance=localhost`. If we would like plot this data as a time series in Grafana
|
|
it might have the following look:
|
|
|
|
{% include img.html href="migrate-from-influx-data-sample-in-influx.png" %}
|
|
|
|
The query used for this panel is written in
|
|
[InfluxQL](https://docs.influxdata.com/influxdb/v1.8/query_language/):
|
|
|
|
```sql
|
|
SELECT last ("bar")
|
|
FROM "foo"
|
|
WHERE ("instance" = 'localhost')
|
|
AND $timeFilter
|
|
GROUP BY time (1m)
|
|
```
|
|
|
|
Having this, let's import the same data sample in VictoriaMetrics and plot it in Grafana as well. To understand how the
|
|
InfluxQL query might be translated to MetricsQL let's break it into components first:
|
|
|
|
* `SELECT last("bar") FROM "foo"` - all requests
|
|
to [instant](https://docs.victoriametrics.com/keyConcepts.html#instant-query)
|
|
or [range](https://docs.victoriametrics.com/keyConcepts.html#range-query) VictoriaMetrics APIs are reads, so no need
|
|
to specify the `SELECT` statement. There are no `measurements` or `fields` in VictoriaMetrics, so the whole expression
|
|
can be replaced with `foo_bar` in MetricsQL;
|
|
* `WHERE ("instance" = 'localhost')`- [filtering by labels](https://docs.victoriametrics.com/keyConcepts.html#filtering)
|
|
in MetricsQL requires specifying the filter in curly braces next to the metric name. So in MetricsQL filter expression
|
|
will be translated to `{instance="localhost"}`;
|
|
* `WHERE $timeFilter` - filtering by time is done via request params sent along with query, so in MetricsQL no need to
|
|
specify this filter;
|
|
* `GROUP BY time(1m)` - grouping by time is done by default
|
|
in [range](https://docs.victoriametrics.com/keyConcepts.html#range-query) API according to specified `step` param.
|
|
This param is also a part of params sent along with request. See how to perform additional
|
|
[aggregations and grouping via MetricsQL](https://docs.victoriametrics.com/keyConcepts.html#aggregation-and-grouping-functions)
|
|
.
|
|
|
|
In result, executing the `foo_bar{instance="localhost"}` MetricsQL expression with `step=1m` for the same set of data in
|
|
Grafana will have the following form:
|
|
|
|
{% include img.html href="migrate-from-influx-data-sample-in-vm.png" %}
|
|
|
|
Visualizations from both databases are a bit different - VictoriaMetrics shows some extra points
|
|
filling the gaps in the graph. This behavior is described in more
|
|
detail [here](https://docs.victoriametrics.com/keyConcepts.html#range-query). In InfluxDB, we can achieve a similar
|
|
behavior by adding `fill(previous)` to the query.
|
|
|
|
VictoriaMetrics fills the gaps on the graph assuming time series are always continious and not discrete.
|
|
To limit the interval on which VictoriaMetrics will try to fill the gaps try setting `-search.maxStalenessInterval`
|
|
command-line flag to the value equal to actual resolution between data points (for example, to `10s`).
|
|
|
|
|
|
### Advanced usage
|
|
|
|
The good thing is that knowing the basics and some aggregation functions is often enough for using MetricsQL or PromQL.
|
|
Let's consider one of the most popular Grafana
|
|
dashboards [Node Exporter Full](https://grafana.com/grafana/dashboards/1860). It has almost 15 million downloads and
|
|
about 230 PromQL queries in it! But a closer look at those queries shows the following:
|
|
|
|
* ~120 queries are just selecting a metric with label filters,
|
|
e.g. `node_textfile_scrape_error{instance="$node",job="$job"}`;
|
|
* ~80 queries are using [rate](https://docs.victoriametrics.com/MetricsQL.html#rate) function for selected metric,
|
|
e.g. `rate(node_netstat_Tcp_InSegs{instance=\"$node\",job=\"$job\"})`
|
|
* and the rest
|
|
are [aggregation functions](https://docs.victoriametrics.com/keyConcepts.html#aggregation-and-grouping-functions)
|
|
like [sum](https://docs.victoriametrics.com/MetricsQL.html#sum)
|
|
or [count](https://docs.victoriametrics.com/MetricsQL.html#count).
|
|
|
|
To get a better understanding of how MetricsQL works, see the following resources:
|
|
|
|
* [MetricsQL concepts](https://docs.victoriametrics.com/keyConcepts.html#metricsql);
|
|
* [MetricsQL functions](https://docs.victoriametrics.com/MetricsQL.html);
|
|
* [PromQL tutorial for beginners](https://valyala.medium.com/promql-tutorial-for-beginners-9ab455142085).
|
|
|
|
## How to migrate current data from InfluxDB to VictoriaMetrics
|
|
|
|
Migrating data from other TSDBs to VictoriaMetrics is as simple as importing data via any of
|
|
[supported formats](https://docs.victoriametrics.com/keyConcepts.html#push-model).
|
|
|
|
But migration from InfluxDB might get easier when using [vmctl](https://docs.victoriametrics.com/vmctl.html) -
|
|
VictoriaMetrics command-line tool. See more about
|
|
migrating [from InfluxDB v1.x versions](https://docs.victoriametrics.com/vmctl.html#migrating-data-from-influxdb-1x).
|
|
Migrating data from InfluxDB v2.x is not supported yet. But there is
|
|
useful [3rd party solution]((https://docs.victoriametrics.com/vmctl.html#migrating-data-from-influxdb-2x)) for this.
|
|
|
|
Please note, that data migration is a backfilling process. So, please
|
|
consider [backfilling tips](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#backfilling).
|
|
|
|
## Frequently asked questions
|
|
|
|
* How does VictoriaMetrics compare to InfluxDB?
|
|
* _[Answer](https://docs.victoriametrics.com/FAQ.html#how-does-victoriametrics-compare-to-influxdb)_
|
|
* Why don't VictoriaMetrics support Remote Read API, so I don't need to learn MetricsQL?
|
|
* _[Answer](https://docs.victoriametrics.com/FAQ.html#why-doesnt-victoriametrics-support-the-prometheus-remote-read-api)_
|
|
* The PromQL and MetricsQL are often mentioned together - why is that?
|
|
* _MetricsQL - query language inspired by PromQL. MetricsQL is backward-compatible with PromQL, so Grafana
|
|
dashboards backed by Prometheus datasource should work the same after switching from Prometheus to
|
|
VictoriaMetrics. Both languages mostly share the same concepts with slight differences._
|
|
* Query returns more data points than expected - why?
|
|
* _VictoriaMetrics may return non-existing data points if `step` param is lower than the actual data resolution. See
|
|
more about this [here](https://docs.victoriametrics.com/keyConcepts.html#range-query)._
|
|
* How do I get the `real` last data point, not `ephemeral`?
|
|
* _[last_over_time](https://docs.victoriametrics.com/MetricsQL.html#last_over_time) function can be used for
|
|
limiting the lookbehind window for calculated data. For example, `last_over_time(metric[10s])` would return
|
|
calculated samples only if the real samples are located closer than 10 seconds to the calculated timestamps
|
|
according to
|
|
`start`, `end` and `step` query args passed
|
|
to [range query](https://docs.victoriametrics.com/keyConcepts.html#range-query)._
|
|
* How do I get raw data points with MetricsQL?
|
|
* _For getting raw data points specify the interval at which you want them in square brackets and send
|
|
as [instant query](https://docs.victoriametrics.com/keyConcepts.html#instant-query). For
|
|
example, `GET api/v1/query?query="my_metric[5m]"&time=<time>` will return raw samples for `my_metric` in interval
|
|
from `<time>` to `<time>-5m`._
|
|
* Can you have multiple aggregators in a MetricsQL query, e.g. `SELECT MAX(field), MIN(field) ...`?
|
|
* _Yes, try the following query `( alias(max(field), "max"), alias(min(field), "min") )`._
|
|
* How to translate Influx `percentile` function to MetricsQL?
|
|
* _[Answer](https://stackoverflow.com/questions/66431990/translate-influx-percentile-function-to-promqlb)_
|
|
* How to translate Influx `stddev` function to MetricsQL?
|
|
* _[Answer](https://stackoverflow.com/questions/66433143/translate-influx-stddev-to-promql)_
|