VictoriaMetrics/docs/VictoriaLogs/data-ingestion/Filebeat.md
2023-07-06 21:35:22 -07:00

Filebeat setup

Specify the output.elasticsearch section in filebeat.yml to send the collected logs to VictoriaLogs:

output.elasticsearch:
  hosts: ["http://localhost:9428/insert/elasticsearch/"]
  parameters:
    _msg_field: "message"
    _time_field: "@timestamp"
    _stream_fields: "host.hostname,log.file.path"

Substitute the localhost:9428 address inside the hosts section with the real TCP address of VictoriaLogs.

See these docs for details on the parameters section.
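
For context, a minimal complete filebeat.yml might look like the sketch below. The filestream input, its id, and the /var/log/*.log path are assumptions added for illustration and are not part of the config above; adjust them to the logs you actually collect.

```yaml
# Hypothetical input section: the id and the paths are placeholders.
filebeat.inputs:
  - type: filestream
    id: example-logs
    paths:
      - /var/log/*.log

# Output section as described above.
output.elasticsearch:
  hosts: ["http://localhost:9428/insert/elasticsearch/"]
  parameters:
    _msg_field: "message"
    _time_field: "@timestamp"
    _stream_fields: "host.hostname,log.file.path"
```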

It is recommended to verify whether the initial setup generates the needed log fields and uses the correct stream fields. This can be done by specifying the debug parameter and then inspecting VictoriaLogs logs:

output.elasticsearch:
  hosts: ["http://localhost:9428/insert/elasticsearch/"]
  parameters:
    _msg_field: "message"
    _time_field: "@timestamp"
    _stream_fields: "host.hostname,log.file.path"
    debug: "1"

If some log fields must be skipped during data ingestion, they can be listed in the ignore_fields parameter. For example, the following config instructs VictoriaLogs to ignore the log.offset and event.original fields in the ingested logs:

output.elasticsearch:
  hosts: ["http://localhost:9428/insert/elasticsearch/"]
  parameters:
    _msg_field: "message"
    _time_field: "@timestamp"
    _stream_fields: "host.name,log.file.path"
    ignore_fields: "log.offset,event.original"

When Filebeat ingests logs into VictoriaLogs at a high rate, it may be necessary to tune the worker and bulk_max_size options. For example, the following config is optimized for a higher than usual ingestion rate:

output.elasticsearch:
  hosts: ["http://localhost:9428/insert/elasticsearch/"]
  parameters:
    _msg_field: "message"
    _time_field: "@timestamp"
    _stream_fields: "host.name,log.file.path"
  worker: 8
  bulk_max_size: 1000

If Filebeat sends logs to VictoriaLogs in another datacenter, it may be useful to enable data compression via the compression_level option. This usually reduces network bandwidth usage and costs by up to 5 times:

output.elasticsearch:
  hosts: ["http://localhost:9428/insert/elasticsearch/"]
  parameters:
    _msg_field: "message"
    _time_field: "@timestamp"
    _stream_fields: "host.name,log.file.path"
  compression_level: 1
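
The tuning options shown so far can be combined in a single output section. As a rough sketch, a Filebeat instance that ships a high volume of logs to a remote datacenter might set all three; the values below are simply the ones used in the examples above, not benchmarked recommendations.

```yaml
output.elasticsearch:
  hosts: ["http://localhost:9428/insert/elasticsearch/"]
  parameters:
    _msg_field: "message"
    _time_field: "@timestamp"
    _stream_fields: "host.name,log.file.path"
  # Values copied from the separate examples above; tune them for your workload.
  worker: 8
  bulk_max_size: 1000
  compression_level: 1
```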

By default, the ingested logs are stored in the (AccountID=0, ProjectID=0) tenant. If you need to store logs in another tenant, specify it via headers in the output.elasticsearch section. For example, the following filebeat.yml config instructs Filebeat to store the data in the (AccountID=12, ProjectID=34) tenant:

output.elasticsearch:
  hosts: ["http://localhost:9428/insert/elasticsearch/"]
  headers:
    AccountID: 12
    ProjectID: 34
  parameters:
    _msg_field: "message"
    _time_field: "@timestamp"
    _stream_fields: "host.name,log.file.path"

See also: