
Data ingestion

VictoriaLogs can accept logs from the following log collectors:

  • Filebeat
  • Fluentbit
  • Logstash
  • Vector

See also Log collectors and data ingestion formats in VictoriaMetrics.

The ingested logs can be queried according to these docs.

See also data ingestion troubleshooting docs.

HTTP APIs

VictoriaLogs supports the following data ingestion HTTP APIs:

  • Elasticsearch bulk API at /insert/elasticsearch/_bulk
  • JSON stream API at /insert/jsonline

VictoriaLogs accepts optional HTTP parameters at data ingestion HTTP APIs.

Elasticsearch bulk API

VictoriaLogs accepts logs in Elasticsearch bulk API / OpenSearch Bulk API format at http://localhost:9428/insert/elasticsearch/_bulk endpoint.

The following command pushes a single log line to Elasticsearch bulk API at VictoriaLogs:

echo '{"create":{}}
{"_msg":"cannot open file","_time":"2023-06-21T04:24:24Z","host.name":"host123"}
' | curl -X POST -H 'Content-Type: application/json' --data-binary @- http://localhost:9428/insert/elasticsearch/_bulk

The following command verifies that the data has been successfully pushed to VictoriaLogs by querying it:

curl http://localhost:9428/select/logsql/query -d 'query=host.name:host123'

The command should return the following response:

{"_msg":"cannot open file","_stream":"{}","_time":"2023-06-21T04:24:24Z","host.name":"host123"}

JSON stream API

VictoriaLogs supports an HTTP API at the /insert/jsonline endpoint for data ingestion, where the request body contains a JSON object on each line (lines are delimited by \n).

Here is an example:

POST http://localhost:9428/insert/jsonline/?_stream_fields=stream&_msg_field=log&_time_field=date
Content-Type: application/jsonl
{ "log": { "level": "info", "message": "hello world" }, "date": "20230620T15:31:23Z", "stream": "stream1" }
{ "log": { "level": "error", "message": "oh no!" }, "date": "20230620T15:32:10Z", "stream": "stream1" }
{ "log": { "level": "info", "message": "hello world" }, "date": "20230620T15:35:11Z", "stream": "stream2" }

HTTP parameters

VictoriaLogs accepts the following parameters at data ingestion HTTP APIs:

  • _msg_field - it must contain the name of the log field with the log message generated by the log shipper. This is usually the message field for Filebeat and Logstash. If the _msg_field parameter isn't set, then VictoriaLogs reads the log message from the _msg field.

  • _time_field - it must contain the name of the log field with the log timestamp generated by the log shipper. This is usually the @timestamp field for Filebeat and Logstash. If the _time_field parameter isn't set, then VictoriaLogs reads the timestamp from the _time field. If this field doesn't exist, then the current timestamp is used.

  • _stream_fields - it should contain a comma-separated list of log field names, which uniquely identify every log stream collected by the log shipper. If the _stream_fields parameter isn't set, then all the ingested logs are written to the default log stream - {}.

  • ignore_fields - this parameter may contain the list of log field names, which must be ignored during data ingestion.

  • debug - if this parameter is set to 1, then the ingested logs aren't stored in VictoriaLogs. Instead, the ingested data is logged by VictoriaLogs, so it can be investigated later.
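As a combined illustration, all of the parameters above can be set on a single request. This is a sketch: the field names msg, ts, host and secret are hypothetical, and a local instance on localhost:9428 is assumed:

```shell
# One log line whose message, timestamp and stream fields use custom names,
# plus a "secret" field that should not be stored.
payload='{"msg":"disk full","ts":"2023-06-21T04:24:24Z","host":"host123","secret":"s3cr3t"}'

# _msg_field/_time_field/_stream_fields remap the custom names onto the
# special fields; ignore_fields drops the "secret" field during ingestion.
url='http://localhost:9428/insert/jsonline/?_msg_field=msg&_time_field=ts&_stream_fields=host&ignore_fields=secret'

echo "$url"
```

Against a running instance, the line would be sent with echo "$payload" | curl -X POST --data-binary @- "$url". Appending debug=1 to the query string makes VictoriaLogs log the line instead of storing it.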

See also HTTP headers.

HTTP headers

VictoriaLogs accepts optional AccountID and ProjectID headers at data ingestion HTTP APIs. These headers may contain the needed tenant to ingest data to. See multitenancy docs for details.
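For example, a log line can be ingested into a specific tenant by setting both headers on the request. A sketch with arbitrary tenant IDs, assuming a local instance:

```shell
# Route the ingested line to tenant (AccountID=12, ProjectID=34).
account_id=12
project_id=34

# Against a running instance the request would look like:
#   echo '{"_msg":"test message","_time":"2023-06-21T04:24:24Z"}' | \
#     curl -X POST -H "AccountID: ${account_id}" -H "ProjectID: ${project_id}" \
#       --data-binary @- http://localhost:9428/insert/jsonline/
printf 'AccountID: %s\nProjectID: %s\n' "$account_id" "$project_id"
```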

Troubleshooting

VictoriaLogs provides the following command-line flags, which can help debug data ingestion issues:

  • -logNewStreams - if this flag is passed to VictoriaLogs, then it logs all the newly registered log streams. This may help debugging high cardinality issues.
  • -logIngestedRows - if this flag is passed to VictoriaLogs, then it logs all the ingested log entries. See also debug parameter.
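For instance, both flags can be passed together at startup. This is a sketch; the binary name and the data path are assumptions that depend on your installation:

```shell
# Compose a VictoriaLogs start command with both debugging flags enabled.
# Every newly registered log stream and every ingested row will then appear
# in the VictoriaLogs logs.
cmd='./victoria-logs-prod -storageDataPath=victoria-logs-data -logNewStreams -logIngestedRows'
echo "$cmd"
```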

VictoriaLogs exposes various metrics, which may help debug data ingestion issues:

  • vl_rows_ingested_total - the number of ingested log entries since the last VictoriaLogs restart. If this number increases over time, then logs are successfully ingested into VictoriaLogs. The ingested logs can be inspected in the following ways:
    • By passing debug=1 parameter to every request to data ingestion APIs. The ingested rows aren't stored in VictoriaLogs in this case. Instead, they are logged, so they can be investigated later. The vl_rows_dropped_total metric is incremented for each logged row.
    • By passing -logIngestedRows command-line flag to VictoriaLogs. In this case it logs all the ingested data, so it can be investigated later.
  • vl_streams_created_total - the number of created log streams since the last VictoriaLogs restart. If this metric grows rapidly over extended periods of time, then this may lead to high cardinality issues. The newly created log streams can be inspected in logs by passing the -logNewStreams command-line flag to VictoriaLogs.
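Both counters are exported on the standard /metrics endpoint and can be filtered with grep. The sample scrape below is illustrative; real label sets and values will differ:

```shell
# A toy /metrics excerpt (assumed shape) filtered for the two ingestion counters.
sample='vl_rows_ingested_total{type="elasticsearch_bulk"} 12
vl_streams_created_total 3
vl_http_requests_total 42'

echo "$sample" | grep -E '^vl_(rows_ingested|streams_created)_total'
```

Against a running instance, the same filter applies to a live scrape: curl -s http://localhost:9428/metrics | grep -E '^vl_(rows_ingested|streams_created)_total'.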

Log collectors and data ingestion formats

Here is the list of log collectors and the data ingestion formats they can use with VictoriaLogs:

Collector   Elasticsearch   JSON stream
filebeat    Yes             No
fluentbit   No              Yes
logstash    Yes             No
vector      Yes             No