VictoriaLogs can accept logs from the following log collectors:
- Syslog, Rsyslog and Syslog-ng - see these docs.
- Filebeat - see these docs.
- Fluentbit - see these docs.
- Fluentd - see these docs.
- Logstash - see these docs.
- Vector - see these docs.
- Promtail (aka Grafana Loki, Grafana Agent or Grafana Alloy) - see these docs.
- Telegraf - see these docs.
- OpenTelemetry Collector - see these docs.
- Journald - see these docs.
The ingested logs can be queried according to these docs.
See also the Troubleshooting and Log collectors and data ingestion formats sections.
HTTP APIs
VictoriaLogs supports the following data ingestion HTTP APIs:
- Elasticsearch bulk API. See these docs.
- JSON stream API aka ndjson. See these docs.
- Loki JSON API. See these docs.
- OpenTelemetry API. See these docs.
- Journald export format.
VictoriaLogs accepts optional HTTP parameters at data ingestion HTTP APIs.
Elasticsearch bulk API
VictoriaLogs accepts logs in Elasticsearch bulk API / OpenSearch Bulk API format at the http://localhost:9428/insert/elasticsearch/_bulk endpoint.
The following command pushes a single log line to VictoriaLogs:
echo '{"create":{}}
{"_msg":"cannot open file","_time":"0","host.name":"host123"}
' | curl -X POST -H 'Content-Type: application/json' --data-binary @- http://localhost:9428/insert/elasticsearch/_bulk
It is possible to push thousands of log lines in a single request to this API.
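Since a single request may carry many log lines, a batch can be assembled as alternating action and document lines. The following is a minimal sketch rather than an official client: the `build_bulk_body` helper and the `SEND` and `VL_URL` variables are hypothetical names introduced for illustration, and the messages are assumed to contain no characters that need JSON escaping.

```shell
#!/bin/sh
# Base address of VictoriaLogs; the default matches the examples in this document.
VL_URL="${VL_URL:-http://localhost:9428}"

# build_bulk_body (hypothetical helper) prints one {"create":{}} action line
# followed by one document line per argument.
# Assumption: the messages need no JSON escaping.
build_bulk_body() {
  for msg in "$@"; do
    printf '%s\n' '{"create":{}}'
    printf '{"_msg":"%s","_time":"0","host.name":"host123"}\n' "$msg"
  done
}

if [ "${SEND:-0}" = "1" ]; then
  # Push the whole batch in a single request.
  build_bulk_body "cannot open file" "disk is full" |
    curl -X POST -H 'Content-Type: application/json' --data-binary @- \
      "$VL_URL/insert/elasticsearch/_bulk"
else
  # Dry run: print the request body instead of sending it.
  build_bulk_body "cannot open file" "disk is full"
fi
```

Running the script as is prints the request body. Setting `SEND=1` sends it to the endpoint above.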
If the timestamp field is set to "0", then the current timestamp at VictoriaLogs side is used for each ingested log line.
Otherwise the timestamp field must be in one of the following formats:
- ISO8601 or RFC3339. For example, 2023-06-20T15:32:10Z or 2023-06-20 15:32:10.123456789+02:00. If timezone information is missing (for example, 2023-06-20 15:32:10), then the time is parsed in the local timezone of the host where VictoriaLogs runs.
- Unix timestamp in seconds or in milliseconds. For example, 1686026893 (seconds) or 1686026893735 (milliseconds).
See these docs for details on the fields that must be present in the ingested log messages.
The API accepts various HTTP parameters, which can change the data ingestion behavior. See these docs for details.
The following command verifies that the data has been successfully ingested to VictoriaLogs by querying it:
curl http://localhost:9428/select/logsql/query -d 'query=host.name:host123'
The command should return the following response:
{"_msg":"cannot open file","_stream":"{}","_time":"2023-06-21T04:24:24Z","host.name":"host123"}
The response by default contains all the log fields. See how to query specific fields.
The duration of requests to /insert/elasticsearch/_bulk can be monitored with the vl_http_request_duration_seconds{path="/insert/elasticsearch/_bulk"} metric.
See also:
- How to debug data ingestion.
- HTTP parameters, which can be passed to the API.
- How to query VictoriaLogs.
JSON stream API
VictoriaLogs accepts a JSON line stream (aka ndjson) at the http://localhost:9428/insert/jsonline endpoint.
The following command pushes multiple log lines to VictoriaLogs:
echo '{ "log": { "level": "info", "message": "hello world" }, "date": "0", "stream": "stream1" }
{ "log": { "level": "error", "message": "oh no!" }, "date": "0", "stream": "stream1" }
{ "log": { "level": "info", "message": "hello world" }, "date": "0", "stream": "stream2" }
' | curl -X POST -H 'Content-Type: application/stream+json' --data-binary @- \
'http://localhost:9428/insert/jsonline?_stream_fields=stream&_time_field=date&_msg_field=log.message'
It is possible to push an unlimited number of log lines in a single request to this API.
If the timestamp field is set to "0", then the current timestamp at VictoriaLogs side is used for each ingested log line.
Otherwise the timestamp field must be in one of the following formats:
- ISO8601 or RFC3339. For example, 2023-06-20T15:32:10Z or 2023-06-20 15:32:10.123456789+02:00. If timezone information is missing (for example, 2023-06-20 15:32:10), then the time is parsed in the local timezone of the host where VictoriaLogs runs.
- Unix timestamp in seconds or in milliseconds. For example, 1686026893 (seconds) or 1686026893735 (milliseconds).
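The timestamp formats listed above can be produced with the standard `date` utility. The sketch below is illustrative only: `json_line`, `build_lines`, `SEND` and `VL_URL` are hypothetical names, the field values are assumed to need no JSON escaping, and `date +%s` is assumed to be available (it is supported by GNU and BSD `date`).

```shell
#!/bin/sh
# Base address of VictoriaLogs; the default matches the examples in this document.
VL_URL="${VL_URL:-http://localhost:9428}"

# json_line (hypothetical helper) prints one ndjson record.
# Arguments: timestamp, level, message, stream.
# Assumption: the values need no JSON escaping.
json_line() {
  printf '{"log":{"level":"%s","message":"%s"},"date":"%s","stream":"%s"}\n' \
    "$2" "$3" "$1" "$4"
}

build_lines() {
  # RFC3339 timestamp in UTC, so the server does not fall back to its local timezone.
  json_line "$(date -u +%Y-%m-%dT%H:%M:%SZ)" info "hello world" stream1
  # Unix timestamp in seconds.
  json_line "$(date +%s)" error "oh no!" stream1
  # "0" asks VictoriaLogs to use the current timestamp on its side.
  json_line 0 info "hello world" stream2
}

if [ "${SEND:-0}" = "1" ]; then
  build_lines |
    curl -X POST -H 'Content-Type: application/stream+json' --data-binary @- \
      "$VL_URL/insert/jsonline?_stream_fields=stream&_time_field=date&_msg_field=log.message"
else
  # Dry run: print the generated lines instead of sending them.
  build_lines
fi
```

Using an explicit `Z` suffix sidesteps the local-timezone fallback described above for timestamps without timezone information.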
See these docs for details on the fields that must be present in the ingested log messages.
The API accepts various HTTP parameters, which can change the data ingestion behavior. See these docs for details.
The following command verifies that the data has been successfully ingested into VictoriaLogs by querying it:
curl http://localhost:9428/select/logsql/query -d 'query=log.level:*'
The command should return the following response:
{"_msg":"hello world","_stream":"{stream=\"stream2\"}","_time":"2023-06-20T13:35:11.56789Z","log.level":"info"}
{"_msg":"hello world","_stream":"{stream=\"stream1\"}","_time":"2023-06-20T15:31:23Z","log.level":"info"}
{"_msg":"oh no!","_stream":"{stream=\"stream1\"}","_time":"2023-06-20T15:32:10.567Z","log.level":"error"}
The response by default contains all the log fields. See how to query specific fields.
The duration of requests to /insert/jsonline can be monitored with the vl_http_request_duration_seconds{path="/insert/jsonline"} metric.
See also:
- How to debug data ingestion.
- HTTP parameters, which can be passed to the API.
- How to query VictoriaLogs.
Loki JSON API
VictoriaLogs accepts logs in Loki JSON API format at the http://localhost:9428/insert/loki/api/v1/push endpoint.
The following command pushes a single log line to Loki JSON API at VictoriaLogs:
curl -H "Content-Type: application/json" -XPOST "http://localhost:9428/insert/loki/api/v1/push?_stream_fields=instance,job" --data-raw \
'{"streams": [{ "stream": { "instance": "host123", "job": "app42" }, "values": [ [ "0", "foo fizzbuzz bar" ] ] }]}'
It is possible to push thousands of log streams and log lines in a single request to this API.
The API accepts various HTTP parameters, which can change the data ingestion behavior. See these docs for details.
There is no need to specify the _msg_field and _time_field query args, since VictoriaLogs automatically extracts the log message and timestamp from the ingested Loki data.
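A request with several streams and several lines per stream might look like the sketch below. It only extends the single-line example with the same payload shape. The `loki_payload` helper, the `SEND` and `VL_URL` variables and the second host name are hypothetical and exist for illustration only.

```shell
#!/bin/sh
# Base address of VictoriaLogs; the default matches the examples in this document.
VL_URL="${VL_URL:-http://localhost:9428}"

# loki_payload (hypothetical helper) prints a push request with two streams.
# Every value is a [timestamp, message] pair; "0" is used as in the example above.
loki_payload() {
  cat <<'EOF'
{"streams": [
  {"stream": {"instance": "host123", "job": "app42"}, "values": [["0", "foo fizzbuzz bar"], ["0", "second line"]]},
  {"stream": {"instance": "host456", "job": "app42"}, "values": [["0", "line from another stream"]]}
]}
EOF
}

if [ "${SEND:-0}" = "1" ]; then
  loki_payload |
    curl -X POST -H 'Content-Type: application/json' --data-binary @- \
      "$VL_URL/insert/loki/api/v1/push?_stream_fields=instance,job"
else
  # Dry run: print the payload instead of sending it.
  loki_payload
fi
```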
The following command verifies that the data has been successfully ingested into VictoriaLogs by querying it:
curl http://localhost:9428/select/logsql/query -d 'query=fizzbuzz'
The command should return the following response:
{"_msg":"foo fizzbuzz bar","_stream":"{instance=\"host123\",job=\"app42\"}","_time":"2023-07-20T23:01:19.288676497Z"}
The response by default contains all the log fields. See how to query specific fields.
The duration of requests to /insert/loki/api/v1/push can be monitored with the vl_http_request_duration_seconds{path="/insert/loki/api/v1/push"} metric.
See also:
- How to debug data ingestion.
- HTTP parameters, which can be passed to the API.
- How to query VictoriaLogs.
HTTP parameters
VictoriaLogs accepts the following configuration parameters via HTTP headers or via HTTP query string args at data ingestion HTTP APIs. HTTP query string parameters have priority over HTTP Headers.
HTTP Query string parameters
All the HTTP-based data ingestion protocols support the following HTTP query string args:
- _msg_field - the name of the log field containing the log message. This is usually the message field for Filebeat and Logstash.
  The _msg_field arg may contain a comma-separated list of field names. In this case the first non-empty field from the list is treated as the log message.
  If the _msg_field arg isn't set, then VictoriaLogs reads the log message from the _msg field. If the _msg field is empty, then it is set to the -defaultMsgValue command-line flag value.
- _time_field - the name of the log field containing the log timestamp. This is usually the @timestamp field for Filebeat and Logstash.
  If the _time_field arg isn't set, then VictoriaLogs reads the timestamp from the _time field. If this field doesn't exist, then the current timestamp is used.
- _stream_fields - comma-separated list of log field names, which uniquely identify every log stream.
  If the _stream_fields arg isn't set, then all the ingested logs are written to the default log stream - {}.
- ignore_fields - an optional comma-separated list of log field names, which must be ignored during data ingestion.
- extra_fields - an optional comma-separated list of log fields, which must be added to all the ingested logs. The format of every extra_fields entry is field_name=field_value. If the log entry contains fields from the extra_fields, then they are overwritten by the values specified in extra_fields.
- debug - if this arg is set to 1, then the ingested logs aren't stored in VictoriaLogs. Instead, the ingested data is logged by VictoriaLogs, so it can be investigated later.
See also HTTP headers.
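The query string args above can be combined in a single ingestion URL. The sketch below targets the JSON stream API. The `ingest_url` helper, the `SEND` and `VL_URL` variables and all field names and values (`message`, `host`, `app`, `password`, `env=staging`, `dc=eu1`) are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Base address of VictoriaLogs; the default matches the examples in this document.
VL_URL="${VL_URL:-http://localhost:9428}"

# ingest_url (hypothetical helper) prints a /insert/jsonline URL carrying
# the query string args described above.
ingest_url() {
  printf '%s/insert/jsonline?%s' "$VL_URL" \
    '_msg_field=message&_time_field=@timestamp&_stream_fields=host,app&ignore_fields=password&extra_fields=env=staging,dc=eu1&debug=1'
}

# Sample record (illustrative): "password" is listed in ignore_fields.
record='{"message":"login failed","@timestamp":"0","host":"host123","app":"auth","password":"secret"}'

if [ "${SEND:-0}" = "1" ]; then
  # debug=1 makes VictoriaLogs log the parsed entry instead of storing it.
  printf '%s\n' "$record" |
    curl -X POST -H 'Content-Type: application/stream+json' --data-binary @- "$(ingest_url)"
else
  # Dry run: print the URL that would be used.
  ingest_url
  printf '\n'
fi
```

Dropping `debug=1` from the URL stores the record instead of only logging it.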
HTTP headers
All the HTTP-based data ingestion protocols support the following HTTP headers in addition to HTTP query args:
- AccountID - accountID of the tenant to ingest data to. See multitenancy docs for details.
- ProjectID - projectID of the tenant to ingest data to. See multitenancy docs for details.
- VL-Msg-Field - the name of the log field containing the log message. This is usually the message field for Filebeat and Logstash.
  The VL-Msg-Field header may contain a comma-separated list of field names. In this case the first non-empty field from the list is treated as the log message.
  If the VL-Msg-Field header isn't set, then VictoriaLogs reads the log message from the _msg field. If the _msg field is empty, then it is set to the -defaultMsgValue command-line flag value.
- VL-Time-Field - the name of the log field containing the log timestamp. This is usually the @timestamp field for Filebeat and Logstash.
  If the VL-Time-Field header isn't set, then VictoriaLogs reads the timestamp from the _time field. If this field doesn't exist, then the current timestamp is used.
- VL-Stream-Fields - comma-separated list of log field names, which uniquely identify every log stream.
  If the VL-Stream-Fields header isn't set, then all the ingested logs are written to the default log stream - {}.
- VL-Ignore-Fields - an optional comma-separated list of log field names, which must be ignored during data ingestion.
- VL-Extra-Field - an optional comma-separated list of log fields, which must be added to all the ingested logs. The format of every extra_fields entry is field_name=field_value. If the log entry contains fields from the extra_fields, then they are overwritten by the values specified in extra_fields.
- VL-Debug - if this parameter is set to 1, then the ingested logs aren't stored in VictoriaLogs. Instead, the ingested data is logged by VictoriaLogs, so it can be investigated later.
See also HTTP Query string parameters.
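The same kind of configuration can be passed via headers, which is handy when the URL is fixed by the log shipper. The sketch below is illustrative: `vl_headers`, `SEND` and `VL_URL` are hypothetical names, the tenant IDs `12` and `34` and the field names are made up, and reading headers from stdin with `-H @-` assumes curl 7.55.0 or newer.

```shell
#!/bin/sh
# Base address of VictoriaLogs; the default matches the examples in this document.
VL_URL="${VL_URL:-http://localhost:9428}"

# vl_headers (hypothetical helper) prints one HTTP header per line.
# Tenant IDs and field names are illustrative values.
vl_headers() {
  cat <<'EOF'
AccountID: 12
ProjectID: 34
VL-Msg-Field: message
VL-Time-Field: @timestamp
VL-Stream-Fields: host,app
VL-Ignore-Fields: password
VL-Debug: 1
EOF
}

# Sample record (illustrative).
record='{"message":"login failed","@timestamp":"0","host":"host123","app":"auth","password":"secret"}'

if [ "${SEND:-0}" = "1" ]; then
  # "-H @-" makes curl read additional headers from stdin, one per line.
  vl_headers |
    curl -X POST -H 'Content-Type: application/stream+json' -H @- \
      --data-binary "$record" "$VL_URL/insert/jsonline"
else
  # Dry run: print the headers that would be sent.
  vl_headers
fi
```

Since query string parameters take priority over headers, any of these settings can still be overridden for an individual request by adding the corresponding query arg.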
Troubleshooting
The following command can be used for verifying whether the data is successfully ingested into VictoriaLogs:
curl http://localhost:9428/select/logsql/query -d 'query=*' | head
This command selects all the data ingested into VictoriaLogs via HTTP query API using any value filter, while head cancels query execution after reading the first 10 log lines. See these docs for more details on how head integrates with VictoriaLogs.
The response by default contains all the log fields. See how to query specific fields.
VictoriaLogs provides the following command-line flags, which can help debugging data ingestion issues:
- -logNewStreams - if this flag is passed to VictoriaLogs, then it logs all the newly registered log streams. This may help debugging high cardinality issues.
- -logIngestedRows - if this flag is passed to VictoriaLogs, then it logs all the ingested log entries. See also debug parameter.
VictoriaLogs exposes various metrics, which may help debugging data ingestion issues:
- vl_rows_ingested_total - the number of ingested log entries since the last VictoriaLogs restart. If this number increases over time, then logs are successfully ingested into VictoriaLogs. The ingested logs can be inspected in the following ways:
  - By passing debug=1 parameter to every request to data ingestion APIs. The ingested rows aren't stored in VictoriaLogs in this case. Instead, they are logged, so they can be investigated later. The vl_rows_dropped_total metric is incremented for each logged row.
  - By passing -logIngestedRows command-line flag to VictoriaLogs. In this case it logs all the ingested data, so it can be investigated later.
- vl_streams_created_total - the number of created log streams since the last VictoriaLogs restart. If this metric grows rapidly during extended periods of time, then this may lead to high cardinality issues. The newly created log streams can be inspected in logs by passing -logNewStreams command-line flag to VictoriaLogs.
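The two counters above can be watched from the command line. The sketch below assumes that VictoriaLogs serves its metrics in Prometheus text format at the /metrics path of the same address used elsewhere in this document. The `ingestion_metrics` helper and the `SEND` and `VL_URL` variables are hypothetical names.

```shell
#!/bin/sh
# Base address of VictoriaLogs; the default matches the examples in this document.
VL_URL="${VL_URL:-http://localhost:9428}"

# ingestion_metrics (hypothetical helper) reads metrics text from stdin and keeps
# only the sample lines of the two counters, skipping "# HELP" and "# TYPE" comments.
ingestion_metrics() {
  grep -E '^(vl_rows_ingested_total|vl_streams_created_total)([{ ]|$)'
}

if [ "${SEND:-0}" = "1" ]; then
  # Assumption: metrics are exposed at $VL_URL/metrics.
  curl -s "$VL_URL/metrics" | ingestion_metrics
fi
```

Running it twice with `SEND=1` a few seconds apart and comparing the values shows whether rows are still arriving and whether new streams keep being created.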
Log collectors and data ingestion formats
Here is the list of log collectors and their ingestion formats supported by VictoriaLogs:
How to set up the collector | Format: Elasticsearch | Format: JSON Stream | Format: Loki | Format: syslog | Format: OpenTelemetry | Format: Journald | Format: DataDog |
---|---|---|---|---|---|---|---|
Rsyslog | Yes | No | No | Yes | No | No | No |
Syslog-ng | Yes, v1, v2 | No | No | Yes | No | No | No |
Filebeat | Yes | No | No | No | No | No | No |
Fluentbit | No | Yes | Yes | Yes | Yes | No | Yes |
Logstash | Yes | No | No | Yes | Yes | No | Yes |
Vector | Yes | Yes | Yes | No | Yes | No | Yes |
Promtail | No | No | Yes | No | No | No | No |
OpenTelemetry Collector | Yes | No | Yes | Yes | Yes | No | Yes |
Telegraf | Yes | Yes | Yes | Yes | Yes | No | No |
Fluentd | Yes | Yes | Yes | Yes | No | No | No |
Journald | No | No | No | No | No | Yes | No |
DataDog Agent | No | No | No | No | No | No | Yes |