---
sort: 3
title: Data ingestion
weight: 3
menu:
  docs:
    identifier: victorialogs-data-ingestion
    parent: "victorialogs"
    weight: 3
aliases:
- /VictoriaLogs/data-ingestion/
- /VictoriaLogs/data-ingestion/index.html
---
# Data ingestion

[VictoriaLogs](https://docs.victoriametrics.com/victorialogs/) can accept logs from the following log collectors:

- Syslog - see [these docs](https://docs.victoriametrics.com/victorialogs/data-ingestion/syslog/).
- Filebeat - see [these docs](https://docs.victoriametrics.com/victorialogs/data-ingestion/filebeat/).
- Fluentbit - see [these docs](https://docs.victoriametrics.com/victorialogs/data-ingestion/fluentbit/).
- Logstash - see [these docs](https://docs.victoriametrics.com/victorialogs/data-ingestion/logstash/).
- Vector - see [these docs](https://docs.victoriametrics.com/victorialogs/data-ingestion/vector/).
- Promtail (the log collector for Grafana Loki) - see [these docs](https://docs.victoriametrics.com/victorialogs/data-ingestion/promtail/).

The ingested logs can be queried according to [these docs](https://docs.victoriametrics.com/victorialogs/querying/).

See also:

- [Log collectors and data ingestion formats](#log-collectors-and-data-ingestion-formats).
- [Data ingestion troubleshooting](#troubleshooting).

## HTTP APIs

VictoriaLogs supports the following data ingestion HTTP APIs:

- Elasticsearch bulk API. See [these docs](#elasticsearch-bulk-api).
- JSON stream API aka [ndjson](https://jsonlines.org/). See [these docs](#json-stream-api).
- Loki JSON API. See [these docs](#loki-json-api).

VictoriaLogs accepts optional [HTTP parameters](#http-parameters) at data ingestion HTTP APIs.

### Elasticsearch bulk API

VictoriaLogs accepts logs in [Elasticsearch bulk API](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html)
/ [OpenSearch Bulk API](http://opensearch.org/docs/1.2/opensearch/rest-api/document-apis/bulk/) format
at `http://localhost:9428/insert/elasticsearch/_bulk` endpoint.

The following command pushes a single log line to VictoriaLogs:

```sh
echo '{"create":{}}
{"_msg":"cannot open file","_time":"0","host.name":"host123"}
' | curl -X POST -H 'Content-Type: application/json' --data-binary @- http://localhost:9428/insert/elasticsearch/_bulk
```

It is possible to push thousands of log lines in a single request to this API.

If the [timestamp field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#time-field) is set to `"0"`,
then the current timestamp at VictoriaLogs side is used for each ingested log line.
Otherwise the timestamp field must be in the [ISO8601](https://en.wikipedia.org/wiki/ISO_8601) format. For example, `2023-06-20T15:32:10Z`.
An optional fractional part of seconds can be specified after the dot - `2023-06-20T15:32:10.123Z`.
A timezone offset can be specified instead of the `Z` suffix - `2023-06-20T15:32:10+02:00`.

See [these docs](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model) for details on the fields
that must be present in the ingested log messages.

The API accepts various HTTP parameters, which can change the data ingestion behavior - see [these docs](#http-parameters) for details.

The following command verifies that the data has been successfully ingested into VictoriaLogs by [querying](https://docs.victoriametrics.com/victorialogs/querying/) it:

```sh
curl http://localhost:9428/select/logsql/query -d 'query=host.name:host123'
```

The command should return a response similar to the following; the `_time` value reflects the time of ingestion:

```sh
{"_msg":"cannot open file","_stream":"{}","_time":"2023-06-21T04:24:24Z","host.name":"host123"}
```

The response by default contains all the [log fields](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model).
See [how to query specific fields](https://docs.victoriametrics.com/victorialogs/logsql/#querying-specific-fields).
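
As a variation on the push command above, the following sketch sends a log line with an explicit ISO8601 timestamp instead of `"0"`. The helper names `es_bulk_body` and `es_bulk_push`, the `VL_URL` variable and the sample values are illustrative assumptions and are not part of VictoriaLogs. The helper does not escape JSON, so it is only suitable for values without double quotes or backslashes.

```sh
# Base address of VictoriaLogs; override VL_URL if it listens elsewhere (assumption).
VL_URL="${VL_URL:-http://localhost:9428}"

# Print an Elasticsearch bulk body for a single log line.
# $1 - log message, $2 - ISO8601 timestamp or 0, $3 - value for the host.name field.
# Note: values are inserted as-is, without JSON escaping.
es_bulk_body() {
  printf '{"create":{}}\n{"_msg":"%s","_time":"%s","host.name":"%s"}\n' "$1" "$2" "$3"
}

# Send the generated body to the Elasticsearch bulk endpoint.
es_bulk_push() {
  es_bulk_body "$@" | curl -X POST -H 'Content-Type: application/json' \
    --data-binary @- "$VL_URL/insert/elasticsearch/_bulk"
}
```

For example, `es_bulk_push 'cannot open file' '2023-06-20T15:32:10.123Z' host123` should store the log line with the given timestamp rather than with the ingestion time.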

The duration of requests to `/insert/elasticsearch/_bulk` can be monitored with the `vl_http_request_duration_seconds{path="/insert/elasticsearch/_bulk"}` metric.

See also:

- [How to debug data ingestion](#troubleshooting).
- [HTTP parameters, which can be passed to the API](#http-parameters).
- [How to query VictoriaLogs](https://docs.victoriametrics.com/victorialogs/querying/).

### JSON stream API

VictoriaLogs accepts JSON line stream aka [ndjson](https://jsonlines.org/) at `http://localhost:9428/insert/jsonline` endpoint.

The following command pushes multiple log lines to VictoriaLogs:

```sh
echo '{ "log": { "level": "info", "message": "hello world" }, "date": "0", "stream": "stream1" }
{ "log": { "level": "error", "message": "oh no!" }, "date": "0", "stream": "stream1" }
{ "log": { "level": "info", "message": "hello world" }, "date": "0", "stream": "stream2" }
' | curl -X POST -H 'Content-Type: application/stream+json' --data-binary @- \
 'http://localhost:9428/insert/jsonline?_stream_fields=stream&_time_field=date&_msg_field=log.message'
```

It is possible to push an unlimited number of log lines in a single request to this API.

If the [timestamp field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#time-field) is set to `"0"`,
then the current timestamp at VictoriaLogs side is used for each ingested log line.
Otherwise the timestamp field must be in the [ISO8601](https://en.wikipedia.org/wiki/ISO_8601) format. For example, `2023-06-20T15:32:10Z`.
An optional fractional part of seconds can be specified after the dot - `2023-06-20T15:32:10.123Z`.
A timezone offset can be specified instead of the `Z` suffix - `2023-06-20T15:32:10+02:00`.

See [these docs](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model) for details on the fields
that must be present in the ingested log messages.

The API accepts various HTTP parameters, which can change the data ingestion behavior - see [these docs](#http-parameters) for details.

The following command verifies that the data has been successfully ingested into VictoriaLogs by [querying](https://docs.victoriametrics.com/victorialogs/querying/) it:

```sh
curl http://localhost:9428/select/logsql/query -d 'query=log.level:*'
```

The command should return a response similar to the following; the `_time` values depend on the timestamps of the ingested logs:

```sh
{"_msg":"hello world","_stream":"{stream=\"stream2\"}","_time":"2023-06-20T13:35:11.56789Z","log.level":"info"}
{"_msg":"hello world","_stream":"{stream=\"stream1\"}","_time":"2023-06-20T15:31:23Z","log.level":"info"}
{"_msg":"oh no!","_stream":"{stream=\"stream1\"}","_time":"2023-06-20T15:32:10.567Z","log.level":"error"}
```

The response by default contains all the [log fields](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model).
See [how to query specific fields](https://docs.victoriametrics.com/victorialogs/logsql/#querying-specific-fields).
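
When the log lines already use the default `_msg` and `_time` field names, the `_msg_field` and `_time_field` query args can be left out. The sketch below generates such lines with a client-side ISO8601 timestamp. The helper names `iso8601_now`, `jsonline_record` and `jsonline_push` and the `VL_URL` variable are hypothetical, and the record helper does not escape JSON.

```sh
# Base address of VictoriaLogs; override VL_URL if it listens elsewhere (assumption).
VL_URL="${VL_URL:-http://localhost:9428}"

# Print the current UTC time in ISO8601 format with second precision.
iso8601_now() {
  date -u '+%Y-%m-%dT%H:%M:%SZ'
}

# Print one JSON line that uses the default _msg and _time field names.
# $1 - log message, $2 - stream name, $3 - optional timestamp (defaults to now).
# Note: values are inserted as-is, without JSON escaping.
jsonline_record() {
  printf '{"_msg":"%s","_time":"%s","stream":"%s"}\n' "$1" "${3:-$(iso8601_now)}" "$2"
}

# Read JSON lines from stdin and send them to the JSON stream endpoint.
jsonline_push() {
  curl -X POST -H 'Content-Type: application/stream+json' --data-binary @- \
    "$VL_URL/insert/jsonline?_stream_fields=stream"
}
```

A possible invocation is `{ jsonline_record 'hello world' stream1; jsonline_record 'oh no!' stream1 '2023-06-20T15:32:10.567Z'; } | jsonline_push`, which sends both lines in a single request.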

The duration of requests to `/insert/jsonline` can be monitored with the `vl_http_request_duration_seconds{path="/insert/jsonline"}` metric.

See also:

- [How to debug data ingestion](#troubleshooting).
- [HTTP parameters, which can be passed to the API](#http-parameters).
- [How to query VictoriaLogs](https://docs.victoriametrics.com/victorialogs/querying/).

### Loki JSON API

VictoriaLogs accepts logs in [Loki JSON API](https://grafana.com/docs/loki/latest/api/#push-log-entries-to-loki) format at `http://localhost:9428/insert/loki/api/v1/push` endpoint.

The following command pushes a single log line to the Loki JSON API at VictoriaLogs:

```sh
curl -H "Content-Type: application/json" -XPOST "http://localhost:9428/insert/loki/api/v1/push?_stream_fields=instance,job" --data-raw \
  '{"streams": [{ "stream": { "instance": "host123", "job": "app42" }, "values": [ [ "0", "foo fizzbuzz bar" ] ] }]}'
```

It is possible to push thousands of log streams and log lines in a single request to this API.

The API accepts various HTTP parameters, which can change the data ingestion behavior - see [these docs](#http-parameters) for details.
There is no need to specify the `_msg_field` and `_time_field` query args, since VictoriaLogs automatically extracts the log message and the timestamp from the ingested Loki data.

The following command verifies that the data has been successfully ingested into VictoriaLogs by [querying](https://docs.victoriametrics.com/victorialogs/querying/) it:

```sh
curl http://localhost:9428/select/logsql/query -d 'query=fizzbuzz'
```

The command should return a response similar to the following; the `_time` value reflects the time of ingestion:

```sh
{"_msg":"foo fizzbuzz bar","_stream":"{instance=\"host123\",job=\"app42\"}","_time":"2023-07-20T23:01:19.288676497Z"}
```

The response by default contains all the [log fields](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model).
See [how to query specific fields](https://docs.victoriametrics.com/victorialogs/logsql/#querying-specific-fields).
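
The Loki push format carries each timestamp as a string with Unix time in nanoseconds. The sketch below builds such a payload with a client-side timestamp; it assumes that VictoriaLogs accepts these values in addition to the `"0"` shown above. The helper names `unix_nanos_now`, `loki_push_body` and `loki_push` and the `VL_URL` variable are hypothetical, and no JSON escaping is performed.

```sh
# Base address of VictoriaLogs; override VL_URL if it listens elsewhere (assumption).
VL_URL="${VL_URL:-http://localhost:9428}"

# Print the current Unix time in nanoseconds. Only second precision is used,
# since sub-second output of `date` is not portable across implementations.
unix_nanos_now() {
  printf '%s000000000\n' "$(date +%s)"
}

# Print a Loki push payload with one stream and one log line.
# $1 - instance label, $2 - job label, $3 - timestamp in nanoseconds, $4 - log line.
# Note: values are inserted as-is, without JSON escaping.
loki_push_body() {
  printf '{"streams":[{"stream":{"instance":"%s","job":"%s"},"values":[["%s","%s"]]}]}\n' \
    "$1" "$2" "$3" "$4"
}

# Send the generated payload to the Loki push endpoint.
loki_push() {
  loki_push_body "$@" | curl -X POST -H 'Content-Type: application/json' --data-binary @- \
    "$VL_URL/insert/loki/api/v1/push?_stream_fields=instance,job"
}
```

For example, `loki_push host123 app42 "$(unix_nanos_now)" 'foo fizzbuzz bar'` sends the same log line as the command above, but with a timestamp chosen by the client.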

The duration of requests to `/insert/loki/api/v1/push` can be monitored with the `vl_http_request_duration_seconds{path="/insert/loki/api/v1/push"}` metric.

See also:

- [How to debug data ingestion](#troubleshooting).
- [HTTP parameters, which can be passed to the API](#http-parameters).
- [How to query VictoriaLogs](https://docs.victoriametrics.com/victorialogs/querying/).

### HTTP parameters

VictoriaLogs accepts the following parameters at [data ingestion HTTP APIs](#http-apis):

- `_msg_field` - it must contain the name of the [log field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model)
  with the [log message](https://docs.victoriametrics.com/victorialogs/keyconcepts/#message-field) generated by the log shipper.
  This is usually the `message` field for Filebeat and Logstash.
  If the `_msg_field` parameter isn't set, then VictoriaLogs reads the log message from the `_msg` field.
- `_time_field` - it must contain the name of the [log field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model)
  with the [log timestamp](https://docs.victoriametrics.com/victorialogs/keyconcepts/#time-field) generated by the log shipper.
  This is usually the `@timestamp` field for Filebeat and Logstash.
  If the `_time_field` parameter isn't set, then VictoriaLogs reads the timestamp from the `_time` field.
  If this field doesn't exist, then the current timestamp is used.
- `_stream_fields` - it should contain a comma-separated list of [log field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model) names,
  which uniquely identify every [log stream](https://docs.victoriametrics.com/victorialogs/keyconcepts/#stream-fields) collected by the log shipper.
  If the `_stream_fields` parameter isn't set, then all the ingested logs are written to the default log stream - `{}`.
- `ignore_fields` - this parameter may contain the list of [log field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model) names,
  which must be ignored during data ingestion.
- `debug` - if this parameter is set to `1`, then the ingested logs aren't stored in VictoriaLogs. Instead,
  the ingested data is logged by VictoriaLogs, so it can be investigated later.

See also [HTTP headers](#http-headers).
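
The parameters listed above are passed as query args of the ingestion URL. A minimal sketch of assembling such a URL follows; the `ingest_url` helper and the `VL_URL` variable are hypothetical, and the helper does not URL-encode its arguments.

```sh
# Base address of VictoriaLogs; override VL_URL if it listens elsewhere (assumption).
VL_URL="${VL_URL:-http://localhost:9428}"

# Print an ingestion URL with the given query args.
# $1 - endpoint path, remaining args - key=value pairs (not URL-encoded by this helper).
ingest_url() {
  path="$1"
  shift
  query=""
  for kv in "$@"; do
    # Join the pairs with "&", skipping the separator before the first pair.
    query="${query:+$query&}$kv"
  done
  printf '%s%s%s\n' "$VL_URL" "$path" "${query:+?$query}"
}
```

For example, a dry run for Filebeat-style field names could use `curl -X POST -H 'Content-Type: application/stream+json' --data-binary @logs.ndjson "$(ingest_url /insert/jsonline _msg_field=message _time_field=@timestamp _stream_fields=host.name debug=1)"`, where `logs.ndjson` is a hypothetical input file. With `debug=1` the logs are written to the VictoriaLogs log instead of being stored.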
### HTTP headers

VictoriaLogs accepts optional `AccountID` and `ProjectID` headers at [data ingestion HTTP APIs](#http-apis).
These headers specify the tenant to ingest the data into. See [multitenancy docs](https://docs.victoriametrics.com/victorialogs/#multitenancy) for details.
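
A minimal sketch of passing these headers with `curl` follows. It assumes that tenant ids are numeric, which should be checked against the multitenancy docs. The helper names `is_tenant_id` and `jsonline_push_tenant` and the `VL_URL` variable are hypothetical.

```sh
# Base address of VictoriaLogs; override VL_URL if it listens elsewhere (assumption).
VL_URL="${VL_URL:-http://localhost:9428}"

# Return success if the argument is a non-empty string of decimal digits.
# Assumption: tenant ids are numeric.
is_tenant_id() {
  case "$1" in
    '' | *[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Read JSON lines from stdin and ingest them into the given tenant.
# $1 - AccountID, $2 - ProjectID.
jsonline_push_tenant() {
  if ! is_tenant_id "$1" || ! is_tenant_id "$2"; then
    echo "AccountID and ProjectID must be numeric" >&2
    return 1
  fi
  curl -X POST -H 'Content-Type: application/stream+json' \
    -H "AccountID: $1" -H "ProjectID: $2" \
    --data-binary @- "$VL_URL/insert/jsonline?_stream_fields=stream"
}
```

For example, `echo '{"_msg":"hello","_time":"0","stream":"stream1"}' | jsonline_push_tenant 12 34` sends one log line with `AccountID: 12` and `ProjectID: 34`.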

## Troubleshooting

The following command can be used for verifying whether the data is successfully ingested into VictoriaLogs:

```sh
curl http://localhost:9428/select/logsql/query -d 'query=*' | head
```

This command selects all the data ingested into VictoriaLogs via [HTTP query API](https://docs.victoriametrics.com/victorialogs/querying/#http-api)
using [any value filter](https://docs.victoriametrics.com/victorialogs/logsql/#any-value-filter),
while `head` cancels query execution after reading the first 10 log lines. See [these docs](https://docs.victoriametrics.com/victorialogs/querying/#command-line)
for more details on how `head` integrates with VictoriaLogs.

The response by default contains all the [log fields](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model).
See [how to query specific fields](https://docs.victoriametrics.com/victorialogs/logsql/#querying-specific-fields).

VictoriaLogs provides the following command-line flags, which can help with debugging data ingestion issues:

- `-logNewStreams` - if this flag is passed to VictoriaLogs, then it logs all the newly
  registered [log streams](https://docs.victoriametrics.com/victorialogs/keyconcepts/#stream-fields).
  This may help with debugging [high cardinality issues](https://docs.victoriametrics.com/victorialogs/keyconcepts/#high-cardinality).
- `-logIngestedRows` - if this flag is passed to VictoriaLogs, then it logs all the ingested
  [log entries](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model).
  See also `debug` [parameter](#http-parameters).

VictoriaLogs exposes various [metrics](https://docs.victoriametrics.com/victorialogs/#monitoring), which may help with debugging data ingestion issues:

- `vl_rows_ingested_total` - the number of ingested [log entries](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model)
  since the last VictoriaLogs restart. If this number increases over time, then logs are successfully ingested into VictoriaLogs.
  The ingested logs can be inspected in the following ways:
    - By passing `debug=1` parameter to every request to [data ingestion APIs](#http-apis). The ingested rows aren't stored in VictoriaLogs
      in this case. Instead, they are logged, so they can be investigated later.
      The `vl_rows_dropped_total` [metric](https://docs.victoriametrics.com/victorialogs/#monitoring) is incremented for each logged row.
    - By passing `-logIngestedRows` command-line flag to VictoriaLogs. In this case it logs all the ingested data, so it can be investigated later.
- `vl_streams_created_total` - the number of created [log streams](https://docs.victoriametrics.com/victorialogs/keyconcepts/#stream-fields)
  since the last VictoriaLogs restart. If this metric grows rapidly over extended periods of time, then this may lead
  to [high cardinality issues](https://docs.victoriametrics.com/victorialogs/keyconcepts/#high-cardinality).
  The newly created log streams can be inspected in logs by passing `-logNewStreams` command-line flag to VictoriaLogs.
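
The counters above can be checked from the command line. The sketch below assumes that VictoriaLogs exposes its metrics in Prometheus text format at the `/metrics` path on the same port, and that each sample line ends with the value (no trailing timestamp). The helper names `metric_sum` and `vl_metric` and the `VL_URL` variable are hypothetical.

```sh
# Base address of VictoriaLogs; override VL_URL if it listens elsewhere (assumption).
VL_URL="${VL_URL:-http://localhost:9428}"

# Read metrics in Prometheus text format from stdin and print the sum
# of all series of the metric named $1, regardless of their labels.
metric_sum() {
  awk -v m="$1" '
    index($1, m) == 1 {
      # Accept an exact name match or a name followed by a label set.
      next_char = substr($1, length(m) + 1, 1)
      if (next_char == "" || next_char == "{") sum += $NF
    }
    END { print sum + 0 }
  '
}

# Fetch the metrics page and print the current value of the given metric.
vl_metric() {
  curl -s "$VL_URL/metrics" | metric_sum "$1"
}
```

Running `vl_metric vl_rows_ingested_total` twice with a pause in between shows whether logs are arriving: the second value should be larger than the first one while data is being ingested.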

## Log collectors and data ingestion formats

Here is the list of log collectors and their ingestion formats supported by VictoriaLogs:

| How to set up the collector | Format: Elasticsearch | Format: JSON Stream | Format: Loki |
|-----------------------------|-----------------------|---------------------|--------------|
| [Filebeat](https://docs.victoriametrics.com/victorialogs/data-ingestion/filebeat/) | [Yes](https://www.elastic.co/guide/en/beats/filebeat/current/elasticsearch-output.html) | No | No |
| [Fluentbit](https://docs.victoriametrics.com/victorialogs/data-ingestion/fluentbit/) | No | [Yes](https://docs.fluentbit.io/manual/pipeline/outputs/http) | [Yes](https://docs.fluentbit.io/manual/pipeline/outputs/loki) |
| [Logstash](https://docs.victoriametrics.com/victorialogs/data-ingestion/logstash/) | [Yes](https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html) | No | No |
| [Vector](https://docs.victoriametrics.com/victorialogs/data-ingestion/vector/) | [Yes](https://vector.dev/docs/reference/configuration/sinks/elasticsearch/) | [Yes](https://vector.dev/docs/reference/configuration/sinks/http/) | [Yes](https://vector.dev/docs/reference/configuration/sinks/loki/) |
| [Promtail](https://docs.victoriametrics.com/victorialogs/data-ingestion/promtail/) | No | No | [Yes](https://grafana.com/docs/loki/latest/clients/promtail/configuration/#clients) |