github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2025-03-11 15:34:56 +00:00

Author	SHA1	Message	Date
Andrii Chubatiuk	67f8fa66ed	app/vlinsert: support floats for Elasticsearch timestamps (#8472) ### Describe Your Changes fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8470 ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2025-03-10 13:48:48 +01:00
Andrii Chubatiuk	26fba57cfa	lib/protoparser/opentelemetry: properly marshal nested attributes into JSON Previously, the opentelemetry attribute parser added extra field names according to the golang JSON parser spec for structs: ``` struct AnyValue{ StringValue string } ``` was serialized into: ``` {"StringValue": "some-string"} ``` while opentelemetry-collector serializes it as ``` "some-string" ``` This commit changes this behaviour and makes the parser compatible with the opentelemetry-collector format. See test cases for examples. Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8384	2025-03-05 16:35:07 +01:00
Andrii Chubatiuk	7a1c84b6ec	vlinsert: accept ES ping requests to endpoint without trailing slash (#8354) ### Describe Your Changes related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8353 ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2025-02-25 15:48:40 +01:00
Zhu Jiekun	ca1d1bc12b	app/vlinsert: properly ingest journald logs with single-character name entity This commit changes journald ingestion validation regex: from `^[A-Z_][A-Z0-9_]+` to `^[A-Z_][A-Z0-9_]*`. It's needed to properly support entities with single-character names. Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8314	2025-02-17 15:49:20 +01:00
Aliaksandr Valialkin	5fdf6df804	app/vlinsert: add a link to the pull request at systemd repository, which enables compression support This should simplify maintenance of this code in the future. While at it, clarify the change at the docs/VictoriaLogs/CHANGELOG.md. This is a follow-up commit for `3c9f9f49b0`. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8264 Updates https://github.com/systemd/systemd/pull/34822	2025-02-12 22:07:35 +01:00
Andrii Chubatiuk	3a27073634	app/vlinsert: add OpenTelemetry ingested logs trace_id and span_id This commit parses additional optional fields from OpenTelemetry logs protocol. Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8255	2025-02-12 10:47:38 +01:00
Andrii Chubatiuk	3c9f9f49b0	app/vlinsert: add journald content negotiation, which enables compression on a client Adding Accept-Encoding response header to support content negotiation, which was introduced in [this PR](https://github.com/systemd/systemd/pull/34822) and enables compression on journald	2025-02-12 10:45:40 +01:00
Aliaksandr Valialkin	ebac07bcf6	app/vlinsert: continue parsing JSON lines in the input stream after parse errors Previously the parsing of the input stream was stopped after the first parse error. This isn't what most users expect when ingesting JSON lines in a stream where some JSON lines may be invalid.	2025-02-10 15:00:58 +01:00
Aliaksandr Valialkin	c26cbf57dd	app/vlinsert: accept timestamps with microsecond and nanosecond precision at _time field	2025-02-09 22:41:38 +01:00
Aliaksandr Valialkin	95f182053b	lib/logstorage: remove unnecessary abstraction - RowsFormatter It is better to use the AppendFieldsToJSON function directly instead of hiding it under RowsFormatter abstraction.	2025-01-28 18:03:18 +01:00
Aliaksandr Valialkin	3c036e0d31	lib/logstorage: ignore logs with too long field names during data ingestion Previously too long field names were silently truncated. This is not what most users expect. It is better to ignore the whole log entry in this case and log it with a WARNING message, so a human operator can notice and fix the ingestion of incorrect logs ASAP. The commit also adds and updates the following entries in the VictoriaLogs FAQ: - https://docs.victoriametrics.com/victorialogs/faq/#how-many-fields-a-single-log-entry-may-contain - https://docs.victoriametrics.com/victorialogs/faq/#what-is-the-maximum-supported-field-name-length - https://docs.victoriametrics.com/victorialogs/faq/#what-length-a-log-record-is-expected-to-have These entries are referenced in the `-insert.maxLineSizeBytes` and `-insert.maxFieldsPerLine` command-line descriptions and in the WARNING messages, which are emitted when log entries are ignored because some of these limits are exceeded.	2025-01-28 16:55:48 +01:00
Aliaksandr Valialkin	e794582f31	app/vlinsert/insertutils: avoid excess copying of lines at LineReader.buf 1. Do not copy every line from LineReader.buf to LineReader.Line - just refer the line at LineReader.buf. 2. Do not copy the next found line to the beginning of LineReader.buf - just track the next line start index with LineReader.bufOffset. This reduces memory copying when many lines are read into LineReader.buf by a single read() syscall.	2025-01-12 03:01:45 +01:00
Andrii Chubatiuk	f9cd408ca9	datadog-serverless: fixed metrics and logs ingestion from Datadog serverless extensions for AWS and GCP (#7769) fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7761 ### Describe Your Changes - the datadog /api/v2/logs API supports a message field in JSON format, which is undocumented and is used by the serverless extension. This PR allows the message field to be either a string or an object. Also added support for the undocumented timestamp field - added `-datadog.streamFields` and `-datadog.ignoreFields` flags to configure default stream fields for datadog logs, where there's no alternative option to pass extra headers and query args - added ingestion of `max` and `min` values of data ingested via the `datadogsketches` API, which is also actively used by serverless extensions - use default `.` separator instead of `_` for sketches metric names until metrics are not sanitized	2024-12-23 09:57:48 +01:00
Andrii Chubatiuk	891ad8f202	app/vlinsert: loki healthcheck endpoint (#7864) ### Describe Your Changes fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7824 ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2024-12-18 14:59:44 +01:00
Aliaksandr Valialkin	17b813ba28	app/vlinsert: use default set of log stream fields for Loki and OpenTelemetry protocols if _stream_fields query arg is empty Loki protocol supports a list of log stream labels - see https://grafana.com/docs/loki/latest/get-started/labels/ OpenTelemetry protocol also supports a list of log stream labels, which are named resource attributes there. See https://opentelemetry.io/docs/concepts/resources/#semantic-attributes-with-sdk-provided-default-value Simplify logs' ingestion into VictoriaLogs for these protocols by allowing the data ingestion without the need to specify _stream_fields query arg or VL-Stream-Fields HTTP header. In this case the upstream log stream fields are used during data ingestion. The set of log stream fields can be overridden via _stream_fields query arg and via VL-Stream-Fields HTTP header if needed. Thanks to @AndrewChubatiuk for the initial idea and implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7554	2024-12-04 13:57:23 +01:00
Aliaksandr Valialkin	6a71921565	lib/logstorage: ignore logs with too many fields instead of trying to store them The storage isn't designed to work efficiently with logs containing too many log fields. It is better to emit a warning to the user and ignore such logs instead of trying to store them. This will allow fixing the issue by the user ASAP, and won't lead to excess resource usage at VictoriaLogs side, such as RAM, CPU, disk IO and disk space. While at it, ignore too long logs with the size exceeding the maximum block size during data ingestion. This should prevent from possible issues when dealing with such long logs if they were stored in the storage. Emit a warning in this case, so the user could identify and fix the issue ASAP. This is a follow-up for `22e6385f56` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7568	2024-12-04 12:18:34 +01:00
Aliaksandr Valialkin	7e924d7ecf	app/vlinsert: properly skip too long lines at Elasticsearch bulk import protocol Previously a too long line in Elasticsearch bulk import protocol resulted in closing the client stream and ignoring the rest of log messages in the stream. Now only the too long message is ignored properly, while the rest of log messages are read successfully. This is a follow-up for 61e7c77ce25967269192ed2e201f67d8c48b972e	2024-12-04 12:18:32 +01:00
Aliaksandr Valialkin	480a8be48f	app/vlinsert: track vl_rows_ingested_total metric in a single place Previously vl_rows_ingested_total metric was tracked individually per each supported data ingestion protocols. It is better from maintainability PoV tracking this metric consistently in a single place - at logMessageProcessor.AddRow() function in the same way as vl_bytes_ingested_total metric is tracked. This is a follow-up for `50bfa689c9`	2024-12-04 12:18:30 +01:00
Aliaksandr Valialkin	c58d0549a8	app/vlinsert: continue parsing lines after too long lines in JSON line stream and Elasticsearch bulk import stream Previously all the lines after the too long line in the stream were ignored. This wasn't expected by most users.	2024-12-04 12:18:28 +01:00
Aliaksandr Valialkin	50bfa689c9	app/vlinsert: expose vl_bytes_ingested_total metric This metric tracks an approximate amounts of bytes processed when parsing the ingested logs. The metric is exposed individually per every supported data ingestion protocol. The protocol name is exposed via "type" label in order to be consistent with vl_rows_ingested_total metric. Thanks to @tenmozes for the initial idea and implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7682 While at it, remove the unneeded "format" label from vl_rows_ingested_total metric. The "type" label must be enough for encoding the data ingestion format.	2024-11-30 17:25:57 +01:00
Aliaksandr Valialkin	342f84c569	app/vlinsert/loki: show the original request body on parse errors This should simplify debugging. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7490	2024-11-08 22:00:58 +01:00
Aliaksandr Valialkin	4f0bec6f03	app/vlinsert/syslog: allow changing the default set of log fields to use as stream fields during syslog data ingestion Thanks to @AndrewChubatiuk for the initial implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7488 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7480 See https://docs.victoriametrics.com/victorialogs/data-ingestion/syslog/#stream-fields	2024-11-08 21:21:08 +01:00
Aliaksandr Valialkin	cd60a4c589	app/vlinsert/syslog: add an ability to drop and add fields during data ingestion via Syslog protocol See https://docs.victoriametrics.com/victorialogs/data-ingestion/syslog/#dropping-fields and https://docs.victoriametrics.com/victorialogs/data-ingestion/syslog/#adding-extra-fields	2024-11-08 20:57:59 +01:00
Aliaksandr Valialkin	3d75c39ff4	app/vlinsert/loki: follow-up for `3aeb1b96a2` - Disallow more than 3 items in Loki line entry, since it must contain two mandatory entries: timestamp and message, plus one optional entry - structured metadata. See https://grafana.com/docs/loki/latest/reference/loki-http-api/#ingest-logs - Update references to structured metadata docs in Loki, in order to simplify further maintenance of the code - Move the change from bugfix to feature at docs/VictoriaLogs/CHANGELOG.md, since VictoriaLogs never supported structured metadata over JSON Loki protocol. The support for structured metadata in protobuf Loki protocol has been added in `ac06569c49` , which has been included in v0.28.0-victorialogs. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7431 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7432	2024-11-06 19:23:38 +01:00
Zhu Jiekun	3aeb1b96a2	app/vlinsert/loki: properly parse json logs with structured metadata Loki protocol supports an optional `metadata` object for each ingested line. It's added as the 3rd field of the (ts,msg,metadata) tuple. Previously, the loki request json parser rejected a log line if tuple size != 2. This commit allows the optional tuple field. It parses it as a json object and adds it as log metadata fields to the log message stream. related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7431 --------- Co-authored-by: f41gh7 <nik@victoriametrics.com>	2024-11-06 17:25:05 +01:00
Andrii Chubatiuk	e0930687f1	vlinsert: support datadog logs This commit adds the following changes: - Added support to push datadog logs with examples of how to ingest data using Vector and Fluentbit - Updated VictoriaLogs examples directory structure to have single container image for victorialogs, agent (fluentbit, vector, etc) but multiple configurations for different protocols Related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6632	2024-11-05 16:52:35 +01:00
Aliaksandr Valialkin	4478e48eb6	app/vlinsert: implement the ability to add extra fields to the ingested logs This can be done via extra_fields query arg or via VL-Extra-Fields HTTP header. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7354#issuecomment-2448671445	2024-11-01 20:06:17 +01:00
Aliaksandr Valialkin	d2dce13df6	app/vlinsert: typo fix after `16ee470da6`	2024-10-30 17:59:49 +01:00
Aliaksandr Valialkin	16ee470da6	app/vlinsert: accept logs with empty _msg field In this case the _msg field is set to the value specified in the -defaultMsgValue command-line flag. This should simplify first-time migration to VictoriaLogs from other systems.	2024-10-30 14:59:38 +01:00
Aliaksandr Valialkin	ed73f8350b	app/vlinsert: allow specifying comma-separated list of fields containing log message via _msg_field query arg and VL-Msg-Field HTTP request header This may be useful when ingesting logs from different sources, which store the log message in different fields. For example, `_msg_field=message,event.data,some_field` will get the log message from the first non-empty field among `message`, `event.data` and `some_field`.	2024-10-30 14:17:33 +01:00
Andrii Chubatiuk	7e60afb6fc	app/vlinsert: adds journald ingestion support This commit allows to ingest logs with journald format. https://www.freedesktop.org/software/systemd/man/latest/systemd-journal-remote.service.html related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4618	2024-10-27 20:36:33 +01:00
Aliaksandr Valialkin	bac193e50b	app/vlselect: do not show empty fields in query results Empty fields are treated as non-existing fields by VictoriaLogs data model. So there is no sense in returning empty fields in query results, since they may mislead and confuse users.	2024-10-14 23:43:58 +02:00
Aliaksandr Valialkin	806bc2ac58	app/vlinsert: support unix timestamps in seconds and milliseconds in JSON stream data ingestion API	2024-09-28 21:56:50 +02:00
Aliaksandr Valialkin	7d7d7c03bc	app/vlinsert: accept unix timestamp in seconds additionally to milliseconds at ElasticSearch bulk API Timestamps in seconds are sometimes used for data ingestion via ElasticSearch bulk API	2024-09-28 21:19:54 +02:00
Aliaksandr Valialkin	037652d5ae	app/vlinsert: support `_time` field without timezone information during data ingestion Use local timezone of the host server in this case. The timezone can be overridden with TZ environment variable if needed. While at it, allow using whitespace instead of T as a delimiter between date and time in the ingested _time field. For example, '2024-09-20 10:20:30' is now accepted during data ingestion. This is valid ISO8601 format, which is used by some log shippers, so it should be supported. This format is also known as SQL datetime format. Also assume local time zone when time without timezone information is passed to querying APIs. Previously such a time was parsed in UTC timezone. Add `Z` to the end of the time string if the old behaviour is preferred. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6721	2024-09-26 12:49:35 +02:00
Aliaksandr Valialkin	6b775ca68c	app/vlinsert/insertutils: add a link to docs why _msg field must be non-empty	2024-09-26 09:53:17 +02:00
Zhu Jiekun	7185fe012b	feature: [victorialogs] drop logs without non-empty _msg field (#7056) ### Describe Your Changes VictoriaLogs allows logs where the `_msg` field is missing or empty. This leads to incorrect search results. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6785 This pull request searches for a non-empty `_msg` field before the log entry is added to `LogRows`. A new counter `vl_rows_dropped_total{reason="msg_not_exist"}` is introduced. Example log output: ``` 2024-09-23T02:33:19.719Z warn app/vlinsert/insertutils/common_params.go:189 dropping log line without _msg field; [{@timestamp 2024-09-18T13:42:16.600000000Z} {Attributes.array.attribute ["many","values"]} {Attributes.boolean.attribute true} {Attributes.double.attribute 637.704} {Attributes.int.attribute 10} {Attributes.map.attribute.some.map.key some value} {Attributes.string.attribute some string} {Body Example ddddddddddlog record} {Resource.service.name my.service} {Scope.my.scope.attribute some scope attribute} {Scope.name my.library} {Scope.version 1.0.0} {SeverityNumber 10} {SeverityText Information} {SpanId eee19b7ec3c1b174} {TraceFlags 0} {TraceId 5b8efff798038103d269b633813fc60c}] ``` ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). - [ ] Benchmark for potential performance loss. --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-09-26 09:35:28 +02:00
f41gh7	7b0aaf1ea2	follow-up after `01430a155c` * properly check SeverityNumber at FormatSeverity function it could be negative, which could cause panic for victorialogs	2024-09-04 15:36:34 +02:00
Andrii Chubatiuk	01430a155c	vlinsert: added opentelemetry logs support Commit adds the following changes: * Adds support of OpenTelemetry logs for Victoria Logs with protobuf encoded messages * json encoding is not supported for the following reasons: - It brings a lot of fragile code, which works inefficiently. - json encoding is impossible to use with language SDK. * splits metrics and logs structures at lib/protoparser/opentelemetry/pb package. * adds docs with examples for opentelemetry logs. --- Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4839 Co-authored-by: AndrewChubatiuk <andrew.chubatiuk@gmail.com> Co-authored-by: f41gh7 <nik@victoriametrics.com>	2024-09-03 20:12:05 +02:00
f41gh7	8b36529b32	follow-up after `1731c0eabf` * updates change log * adds VL-Debug http header * updates doc * extracts only the first value of http headers for VL-Stream-Fields and VL-Ignore-Fields. It makes behaviour the same as Query string args. And allows to easily configure client applications. Since most of the client collectors don't support multi value headers. Signed-off-by: f41gh7 <nik@victoriametrics.com>	2024-09-03 19:16:10 +02:00
Andrii Chubatiuk	1731c0eabf	app/vlinsert: support getting _msg_field, _time_field, _stream_fields and _ignore_fields from headers * Many collectors don't support forwarding url query params to the remote system. This makes it impossible to define stream fields for them. A workaround with a proxy between VictoriaLogs and the log shipper is too complicated a solution. * This commit adds the following changes: * Adds fallback to header params, if the query param is empty, for: _msg_field -> VL-Msg-Field _stream_fields -> VL-Stream-Fields _ignore_fields -> VL-Ignore-Fields _time_field -> VL-Time-Field * removes deprecations from victorialogs compose files, added more output format examples for logstash, telegraf, fluent-bit related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5310	2024-09-03 17:43:26 +02:00
Zakhar Bessarab	58b6c54da2	app/vlinsert/elasticsearch: add fake response for logstash requests (#6742) ### Describe Your Changes This is needed in order to support standard Elasticsearch output in Logstash pipelines. See: https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6660 ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2024-08-06 15:43:33 +02:00
Aliaksandr Valialkin	ac06569c49	app/vlinsert/loki: use easyproto instead for parsing Loki protobuf messages	2024-07-10 03:05:17 +02:00
Aliaksandr Valialkin	08c32232a6	app/vlinsert/loki: remove unused functions from the generated protobuf code	2024-07-10 00:18:48 +02:00
Aliaksandr Valialkin	e11f0aa9ec	app/vlinsert/insertutils: flush the ingested logs from in-memory buffer to storage every second Previously the in-memory buffer could remain unflushed for long periods of time under low ingestion rate. The ingested logs weren't visible for search during this time.	2024-07-02 01:38:19 +02:00
Aliaksandr Valialkin	ba6f82069f	app/vlinsert/syslog: add an ability to use log ingestion time as the _time field	2024-07-02 01:38:19 +02:00
Aliaksandr Valialkin	d7185f1b77	app/vlinsert/syslog: properly skip empty lines in Syslog protocol Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6548	2024-06-28 14:09:28 +02:00
Aliaksandr Valialkin	3eacd43fff	lib/logstorage: parse syslog structured data into separate fields in order to simplify further querying of this data	2024-06-25 14:53:39 +02:00
Aliaksandr Valialkin	9e1c037249	lib/logstorage: properly parse timezone offset at TryParseTimestampRFC3339Nano() The TryParseTimestampRFC3339Nano() must properly parse RFC3339 timestamps with timezone offsets. While at it, make tryParseTimestampISO8601 function private in order to prevent from improper usage of this function from outside the lib/logstorage package. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6508	2024-06-25 14:53:38 +02:00
Aliaksandr Valialkin	3eda4617c0	app/vlinsert: properly parse timestamps with nanosecond precision at /insert/jsonline HTTP endpoint This has been broken in `2b6a634ec0`	2024-06-18 00:23:25 +02:00
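
The journald validation change described in commit `ca1d1bc12b` above can be illustrated with a minimal Go sketch. The two patterns are quoted from that commit message; the variable names and the `acceptedBy` helper are hypothetical, introduced only for this illustration, and are not taken from the VictoriaMetrics code base.

```go
package main

import (
	"fmt"
	"regexp"
)

// Patterns quoted in the commit message for ca1d1bc12b.
// The variable names are hypothetical.
var (
	oldNameRe = regexp.MustCompile(`^[A-Z_][A-Z0-9_]+`) // needs at least two characters
	newNameRe = regexp.MustCompile(`^[A-Z_][A-Z0-9_]*`) // one character is enough
)

// acceptedBy is a hypothetical helper: it reports whether the
// given field name matches the given validation pattern.
func acceptedBy(re *regexp.Regexp, name string) bool {
	return re.MatchString(name)
}

func main() {
	// Compare how both patterns treat a few sample field names.
	for _, name := range []string{"A", "_", "MESSAGE", "PRIORITY", "1BAD"} {
		fmt.Printf("%-10q old=%v new=%v\n",
			name, acceptedBy(oldNameRe, name), acceptedBy(newNameRe, name))
	}
}
```

With `+` the second character class must match at least once, so a one-character name such as `A` is rejected. With `*` it may match zero times, so the same name is accepted.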


83 commits