
# Vector setup

The Vector log collector supports an Elasticsearch sink, which is compatible with the VictoriaLogs Elasticsearch bulk API.

Specify a `sinks.vlogs` section with `type = "elasticsearch"` in `vector.toml` for sending the collected logs to VictoriaLogs:

```toml
[sinks.vlogs]
  inputs = [ "your_input" ]
  type = "elasticsearch"
  endpoints = [ "http://localhost:9428/insert/elasticsearch/" ]
  mode = "bulk"
  api_version = "v8"
  healthcheck.enabled = false

  [sinks.vlogs.query]
    _msg_field = "message"
    _time_field = "timestamp"
    _stream_fields = "host,container_name"
```

Substitute the `localhost:9428` address inside the `endpoints` section with the real TCP address of VictoriaLogs.

Replace `your_input` with the name of the `sources` (or `transforms`) section that collects the needed logs. See these docs for details.
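For illustration, a minimal sketch of a source feeding the sink might look like this (the `docker_input` name is arbitrary and chosen for this example; any Vector source type works the same way):

```toml
# Hypothetical example: a Vector source that collects Docker container logs.
# The section name "docker_input" is arbitrary; reference it in the sink's `inputs`.
[sources.docker_input]
  type = "docker_logs"

[sinks.vlogs]
  inputs = [ "docker_input" ]
  type = "elasticsearch"
  endpoints = [ "http://localhost:9428/insert/elasticsearch/" ]
  mode = "bulk"
  api_version = "v8"
  healthcheck.enabled = false
```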

The `_msg_field` parameter must contain the name of the field with the log message generated by Vector. This is usually the `message` field. See these docs for details.

The `_time_field` parameter must contain the name of the field with the log timestamp generated by Vector. This is usually the `timestamp` field. See these docs for details.

It is recommended to specify a comma-separated list of field names that uniquely identify every log stream collected by Vector in the `_stream_fields` parameter. See these docs for details.

If some log fields aren't needed, then VictoriaLogs can be instructed to ignore them during data ingestion: just pass the `ignore_fields` parameter with a comma-separated list of fields to ignore. For example, the following config instructs VictoriaLogs to ignore the `log.offset` and `event.original` fields in the ingested logs:

```toml
[sinks.vlogs]
  inputs = [ "your_input" ]
  type = "elasticsearch"
  endpoints = [ "http://localhost:9428/insert/elasticsearch/" ]
  mode = "bulk"
  api_version = "v8"
  healthcheck.enabled = false

  [sinks.vlogs.query]
    _msg_field = "message"
    _time_field = "timestamp"
    _stream_fields = "host,container_name"
    ignore_fields = "log.offset,event.original"
```

More details about `_msg_field`, `_time_field`, `_stream_fields` and `ignore_fields` are available here.

When Vector ingests logs into VictoriaLogs at a high rate, it may be necessary to tune the `batch.max_events` option. For example, the following config is optimized for a higher-than-usual ingestion rate:

```toml
[sinks.vlogs]
  inputs = [ "your_input" ]
  type = "elasticsearch"
  endpoints = [ "http://localhost:9428/insert/elasticsearch/" ]
  mode = "bulk"
  api_version = "v8"
  healthcheck.enabled = false

  [sinks.vlogs.query]
    _msg_field = "message"
    _time_field = "timestamp"
    _stream_fields = "host,container_name"

  [sinks.vlogs.batch]
    max_events = 1000
```

If Vector sends logs to VictoriaLogs in another datacenter, it may be useful to enable data compression via the `compression` option. This usually reduces network bandwidth usage and costs by up to 5 times:

```toml
[sinks.vlogs]
  inputs = [ "your_input" ]
  type = "elasticsearch"
  endpoints = [ "http://localhost:9428/insert/elasticsearch/" ]
  mode = "bulk"
  api_version = "v8"
  healthcheck.enabled = false
  compression = "gzip"

  [sinks.vlogs.query]
    _msg_field = "message"
    _time_field = "timestamp"
    _stream_fields = "host,container_name"
```

By default, the ingested logs are stored in the `(AccountID=0, ProjectID=0)` tenant. If you need to store logs in another tenant, then specify the needed tenant via the `[sinks.vlogs.request.headers]` section. For example, the following `vector.toml` config instructs Vector to store the data in the `(AccountID=12, ProjectID=34)` tenant:

```toml
[sinks.vlogs]
  inputs = [ "your_input" ]
  type = "elasticsearch"
  endpoints = [ "http://localhost:9428/insert/elasticsearch/" ]
  mode = "bulk"
  api_version = "v8"
  healthcheck.enabled = false

  [sinks.vlogs.query]
    _msg_field = "message"
    _time_field = "timestamp"
    _stream_fields = "host,container_name"

  [sinks.vlogs.request.headers]
    AccountID = "12"
    ProjectID = "34"
```

More info about sink tuning can be found in these docs.
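As a sketch of such tuning (the values below are illustrative assumptions, not recommendations), Vector's generic sink options for disk buffering and adaptive request concurrency can be combined with the batch settings shown above:

```toml
[sinks.vlogs]
  inputs = [ "your_input" ]
  type = "elasticsearch"
  endpoints = [ "http://localhost:9428/insert/elasticsearch/" ]
  mode = "bulk"
  api_version = "v8"
  healthcheck.enabled = false

  [sinks.vlogs.query]
    _msg_field = "message"
    _time_field = "timestamp"
    _stream_fields = "host,container_name"

  # Buffer events on disk so collected logs survive Vector restarts
  # (the size is an example value).
  [sinks.vlogs.buffer]
    type = "disk"
    max_size = 268435488  # 256 MiB

  # Let Vector adapt the number of concurrent requests to VictoriaLogs.
  [sinks.vlogs.request]
    concurrency = "adaptive"
```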

Here is a demo for running Vector with VictoriaLogs via docker-compose, which collects logs from Docker containers into VictoriaLogs (via the Elasticsearch API).

The ingested log entries can be queried according to these docs.

See also data ingestion troubleshooting docs.