# Vector setup

The [Vector](http://vector.dev) log collector supports the
[Elasticsearch sink](https://vector.dev/docs/reference/configuration/sinks/elasticsearch/), which is compatible with the
[VictoriaLogs Elasticsearch bulk API](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/#elasticsearch-bulk-api).

Specify a [`sinks.vlogs`](https://vector.dev/docs/reference/configuration/sinks/elasticsearch/) section with `type = "elasticsearch"` in `vector.toml`
to send the collected logs to VictoriaLogs:

```toml
[sinks.vlogs]
  inputs = [ "your_input" ]
  type = "elasticsearch"
  endpoints = [ "http://localhost:9428/insert/elasticsearch/" ]
  mode = "bulk"
  api_version = "v8"
  healthcheck.enabled = false

  [sinks.vlogs.query]
    _msg_field = "message"
    _time_field = "timestamp"
    _stream_fields = "host,container_name"
```

Substitute the `localhost:9428` address in the `endpoints` option with the real TCP address of VictoriaLogs.

Replace `your_input` with the name of the source (or transform) section that collects the logs to send. See [these docs](https://vector.dev/docs/reference/configuration/sources/) for details.

The `_msg_field` parameter must contain the name of the field that holds the log message generated by Vector. This is usually the `message` field.
See [these docs](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#message-field) for details.

The `_time_field` parameter must contain the name of the field that holds the log timestamp generated by Vector. This is usually the `timestamp` field.
See [these docs](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#time-field) for details.

It is recommended to specify, in the `_stream_fields` parameter, a comma-separated list of field names that uniquely identify every log stream collected by Vector.
See [these docs](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#stream-fields) for details.

If some [log fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model) aren't needed,
then VictoriaLogs can be instructed to ignore them during data ingestion: pass the `ignore_fields` parameter with a comma-separated list of fields to ignore.
For example, the following config instructs VictoriaLogs to ignore the `log.offset` and `event.original` fields in the ingested logs:

```toml
[sinks.vlogs]
  inputs = [ "your_input" ]
  type = "elasticsearch"
  endpoints = [ "http://localhost:9428/insert/elasticsearch/" ]
  mode = "bulk"
  api_version = "v8"
  healthcheck.enabled = false

  [sinks.vlogs.query]
    _msg_field = "message"
    _time_field = "timestamp"
    _stream_fields = "host,container_name"
    ignore_fields = "log.offset,event.original"
```

More details about `_msg_field`, `_time_field`, `_stream_fields` and `ignore_fields` are
available [here](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/#http-parameters).
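
The entries of the `[sinks.vlogs.query]` table reach VictoriaLogs as URL query args of the bulk request. The following Python snippet is only an illustrative sketch of the resulting URL; it assumes the request goes to the standard `_bulk` path of the Elasticsearch bulk API, and the `bulk_url` helper is hypothetical rather than part of Vector or VictoriaLogs:

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def bulk_url(endpoint: str, query: dict[str, str]) -> str:
    """Join the sink endpoint, the assumed `_bulk` path and the query args."""
    return endpoint.rstrip("/") + "/_bulk?" + urlencode(query)


# Same values as in the `[sinks.vlogs.query]` table above.
url = bulk_url(
    "http://localhost:9428/insert/elasticsearch/",
    {
        "_msg_field": "message",
        "_time_field": "timestamp",
        "_stream_fields": "host,container_name",
        "ignore_fields": "log.offset,event.original",
    },
)
print(url)

# Decoding the query string gives back the original parameter values.
print(parse_qs(urlsplit(url).query))
```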

When Vector ingests logs into VictoriaLogs at a high rate, it may be necessary to tune the `batch.max_events` option.
For example, the following config is optimized for a higher than usual ingestion rate:

```toml
[sinks.vlogs]
  inputs = [ "your_input" ]
  type = "elasticsearch"
  endpoints = [ "http://localhost:9428/insert/elasticsearch/" ]
  mode = "bulk"
  api_version = "v8"
  healthcheck.enabled = false

  [sinks.vlogs.query]
    _msg_field = "message"
    _time_field = "timestamp"
    _stream_fields = "host,container_name"

  [sinks.vlogs.batch]
    max_events = 1000
```
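
To illustrate what `max_events` controls, here is a simplified Python model of batching. It is a sketch under assumptions, not Vector's implementation: Vector may also flush batches on size and timeout limits, and the exact action line it emits may differ from the `{"create":{}}` line used here. The `batches` and `bulk_body` helpers are hypothetical.

```python
import json
from itertools import islice
from typing import Iterable, Iterator


def batches(events: Iterable[dict], max_events: int) -> Iterator[list[dict]]:
    """Split a stream of events into lists of at most `max_events` items."""
    it = iter(events)
    while chunk := list(islice(it, max_events)):
        yield chunk


def bulk_body(batch: list[dict]) -> str:
    """Render a batch as newline-delimited JSON: an action line, then the event."""
    return "".join('{"create":{}}\n' + json.dumps(event) + "\n" for event in batch)


events = [
    {"message": f"event {i}", "timestamp": "2023-06-21T14:58:43Z"}
    for i in range(2500)
]

# With max_events = 1000, 2500 events are sent as three bulk requests.
sizes = [len(batch) for batch in batches(events, max_events=1000)]
print(sizes)  # [1000, 1000, 500]
```

A larger `max_events` means fewer, bigger requests, which usually lowers the per-request overhead at the cost of higher memory usage and delivery latency.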

If Vector sends logs to VictoriaLogs in another datacenter, it may be useful to enable data compression via the `compression` option.
This usually reduces network bandwidth usage and costs by up to 5 times:

```toml
[sinks.vlogs]
  inputs = [ "your_input" ]
  type = "elasticsearch"
  endpoints = [ "http://localhost:9428/insert/elasticsearch/" ]
  mode = "bulk"
  api_version = "v8"
  healthcheck.enabled = false
  compression = "gzip"

  [sinks.vlogs.query]
    _msg_field = "message"
    _time_field = "timestamp"
    _stream_fields = "host,container_name"
```
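
The effect of gzip on a log payload can be estimated offline. The following Python sketch compresses synthetic, highly repetitive log lines (an assumption made for illustration); the ratio for real logs depends on their content and may be lower or higher:

```python
import gzip
import json

# Synthetic log events; field names mirror the config above, values are made up.
lines = [
    json.dumps(
        {
            "timestamp": f"2023-06-21T14:58:{i % 60:02d}Z",
            "host": "host-1",
            "container_name": "app",
            "message": f"GET /api/v1/items/{i} 200",
        }
    )
    for i in range(1000)
]
body = ("\n".join(lines) + "\n").encode("utf-8")

# Compress the payload the same way a gzip-encoded request body would be.
compressed = gzip.compress(body)
ratio = len(body) / len(compressed)
print(f"raw={len(body)} bytes, gzip={len(compressed)} bytes, ratio={ratio:.1f}x")
```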

By default, the ingested logs are stored in the `(AccountID=0, ProjectID=0)` [tenant](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#multitenancy).
If you need to store logs in another tenant, specify it via the `AccountID` and `ProjectID` headers in the `request.headers` section of the sink.
For example, the following `vector.toml` config instructs Vector to store the data in the `(AccountID=12, ProjectID=34)` tenant:

```toml
[sinks.vlogs]
  inputs = [ "your_input" ]
  type = "elasticsearch"
  endpoints = [ "http://localhost:9428/insert/elasticsearch/" ]
  mode = "bulk"
  api_version = "v8"
  healthcheck.enabled = false

  [sinks.vlogs.query]
    _msg_field = "message"
    _time_field = "timestamp"
    _stream_fields = "host,container_name"

  [sinks.vlogs.request.headers]
    AccountID = "12"
    ProjectID = "34"
```
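
Entries under `[sinks.vlogs.request.headers]` are sent as HTTP request headers. The following Python sketch builds (without sending) an equivalent request, to show where the tenant is carried. The `tenant_request` helper and the `_bulk` path are assumptions made for illustration:

```python
import urllib.request


def tenant_request(url: str, body: bytes, account_id: int, project_id: int) -> urllib.request.Request:
    """Build a POST request that selects the tenant via HTTP headers."""
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "AccountID": str(account_id),
            "ProjectID": str(project_id),
        },
    )


req = tenant_request(
    "http://localhost:9428/insert/elasticsearch/_bulk",  # assumed bulk path
    b'{"create":{}}\n{"message":"hello","timestamp":"2023-06-21T14:58:43Z"}\n',
    account_id=12,
    project_id=34,
)

# urllib normalizes header-name case; HTTP header names are case-insensitive.
headers = {name.lower(): value for name, value in req.header_items()}
print(headers["accountid"], headers["projectid"])
```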

More info about sink tuning can be found in [these docs](https://vector.dev/docs/reference/configuration/sinks/elasticsearch/).

[Here is a demo](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/deployment/docker/victorialogs/vector-docker)
for running Vector with VictoriaLogs with docker-compose and collecting logs from docker-containers
to VictoriaLogs (via [Elasticsearch API](https://docs.victoriametrics.com/VictoriaLogs/ingestion/#elasticsearch-bulk-api)).

The ingested log entries can be queried according to [these docs](https://docs.victoriametrics.com/VictoriaLogs/querying/).

See also [data ingestion troubleshooting](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/#troubleshooting) docs.