Commit graph

61 commits

Author SHA1 Message Date
Aliaksandr Valialkin
bab5348b8b
app/vlinsert/syslog: add an ability to drop and add fields during data ingestion via Syslog protocol
See https://docs.victoriametrics.com/victorialogs/data-ingestion/syslog/#dropping-fields
and https://docs.victoriametrics.com/victorialogs/data-ingestion/syslog/#adding-extra-fields
2024-11-14 17:21:27 +01:00
Aliaksandr Valialkin
1c7d8adc65
app/vlinsert/loki: follow-up for 3aeb1b96a2
- Disallow more than 3 items in Loki line entry, since it must contain two mandatory entries: timestamp and message,
  plus one optional entry - structured metadata. See https://grafana.com/docs/loki/latest/reference/loki-http-api/#ingest-logs

- Update references to structured metadata docs in Loki, in order to simplify further maintenance of the code

- Move the change from bugfix to feature at docs/VictoriaLogs/CHANGELOG.md, since VictoriaLogs never supported
  structured metadata over JSON Loki protocol. The support for structured metadata in protobuf Loki protocol
  has been added in ac06569c49 , which has been included in v0.28.0-victorialogs.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7431
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7432

(cherry picked from commit 3d75c39ff4)
2024-11-07 13:00:20 +01:00
Zhu Jiekun
e113a40142
app/vlinisert/loki: properly parse json logs with structured metadata
Loki protocol supports optional `metadata` object for each ingested line. It's added as 3rd field at the (ts,msg,metadata) tuple. Previously,  loki request json parsers rejected log line if tuple size != 2.

This commit allows optional tuple field. It parses it as json object and adds it as log metadata fields to the  log message stream.

related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7431

---------
Co-authored-by: f41gh7 <nik@victoriametrics.com>

(cherry picked from commit 3aeb1b96a2)
2024-11-07 13:00:18 +01:00
Andrii Chubatiuk
23dcec3911
vlinsert: support datadog logs
This commit adds the following changes:

- Added support to push datadog logs with examples of how to ingest data
using Vector and Fluentbit
- Updated VictoriaLogs examples directory structure to have single
container image for victorialogs, agent (fluentbit, vector, etc) but
multiple configurations for different protocols

Related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6632

(cherry picked from commit e0930687f1)
2024-11-06 13:58:16 +01:00
Aliaksandr Valialkin
fced48d540
app/vlinsert: implement the ability to add extra fields to the ingested logs
This can be done via extra_fields query arg or via VL-Extra-Fields HTTP header.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7354#issuecomment-2448671445

(cherry picked from commit 4478e48eb6)
2024-11-04 10:23:16 -03:00
Aliaksandr Valialkin
0eaec9f376
app/vlinsert: typo fix after 16ee470da6
(cherry picked from commit d2dce13df6)
2024-10-31 14:11:07 +01:00
Aliaksandr Valialkin
b4f5e5c1b8
app/vlinsert: accept logs with empty _msg field
In this case the _msg field is set to the value specified in the -defaultMsgValue command-line flag.

This should simplify first-time migration to VictoriaLogs from other systems.

(cherry picked from commit 16ee470da6)
2024-10-30 15:19:52 +01:00
Aliaksandr Valialkin
8baa5177aa
app/vlinsert: allow specifying comma-separated list of fields containing log message via _msg_field query arg and VL-Msg-Field HTTP request header
This msy be useful when ingesting logs from different sources, which store the log message in different fields.
For example, `_msg_field=message,event.data,some_field` will get log message from the first non-empty field:
`message`, `event.data` and `some_field`.

(cherry picked from commit ed73f8350b)
2024-10-30 15:19:52 +01:00
Andrii Chubatiuk
da14c7b26e
app/vlinsert: adds journald ingestion support
This commit allows to ingest logs with journald format. 

https://www.freedesktop.org/software/systemd/man/latest/systemd-journal-remote.service.html

related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4618
2024-10-27 20:42:41 +01:00
Aliaksandr Valialkin
0881e5fd5c
app/vlselect: do not show empty fields in query results
Empty fields are treated as non-existing fields by VictoriaLogs data model.
So there is no sense in returning empty fields in query results, since they may mislead and confuse users.

(cherry picked from commit bac193e50b)
2024-10-15 11:49:32 +02:00
Aliaksandr Valialkin
0f1b3852dd
app/vlinsert: support unix timestamps in seconds and milliseconds in JSON stream data ingestion API 2024-09-28 21:57:19 +02:00
Aliaksandr Valialkin
b8fa213310
app/vlinsert: accept unix timestamp in seconds additionally to milliseconds at ElasticSearch bulk API
Timestamps in seconds are sometimes used for data ingestion via ElasticSearch bulk API
2024-09-28 21:21:19 +02:00
Aliaksandr Valialkin
4d27933041
app/vlinsert: support _time field without timezone information during data ingestion
Use local timezone of the host server in this case. The timezone can be overridden
with TZ environment variable if needed.

While at it, allow using whitespace instead of T as a delimiter between data and time
in the ingested _time field. For example, '2024-09-20 10:20:30' is now accepted
during data ingestion. This is valid ISO8601 format, which is used by some log shippers,
so it should be supported. This format is also known as SQL datetime format.

Also assume local time zone when time without timezone information is passed to querying APIs.
Previously such a time was parsed in UTC timezone. Add `Z` to the end of the time string
if the old behaviour is preferred.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6721
2024-09-26 12:50:14 +02:00
Aliaksandr Valialkin
7309058282
app/vlinsert/insertutils: add a link to docs why _msg field must be non-empty 2024-09-26 09:53:24 +02:00
Zhu Jiekun
3fa72b2c1b
feature: [victorialogs] drop logs without non-empty _msg field (#7056)
### Describe Your Changes

VictoriaLogs allows logs without `_msg` field or `_msg` field is empty.
This lead to incorrect search result. See:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6785

This pull request search for non-empty `_msg` field before log entry is
added to `LogRows`.

New counter `vl_rows_dropped_total{reason="msg_not_exist"}` is
introduced.

Example log output:
```
2024-09-23T02:33:19.719Z        warn    app/vlinsert/insertutils/common_params.go:189   dropping log line without _msg field; [{@timestamp 2024-09-18T13:42:16.600000000Z} {Attributes.array.attribute ["many","values"]} {Attributes.boolean.attribute true} {Attributes.double.attribute 637.704} {Attributes.int.attribute 10} {Attributes.map.attribute.some.map.key some value} {Attributes.string.attribute some string} {Body Example ddddddddddlog record} {Resource.service.name my.service} {Scope.my.scope.attribute some scope attribute} {Scope.name my.library} {Scope.version 1.0.0} {SeverityNumber 10} {SeverityText Information} {SpanId eee19b7ec3c1b174} {TraceFlags 0} {TraceId 5b8efff798038103d269b633813fc60c}]
```

### Checklist

The following checks are **mandatory**:

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
- [ ] Benchmark for potential performance loss.

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-09-26 09:35:58 +02:00
f41gh7
64361c2d7a
follow-up after 01430a155c
* properly check SeverityNumber at FormatSeverity function
 it could be negative, which could cause panic for victorialogs
2024-09-04 15:39:55 +02:00
Andrii Chubatiuk
711f2cc4f2
vlinsert: added opentelemetry logs support
Commit adds the following changes:

* Adds support of OpenTelemetry logs for Victoria Logs with protobuf encoded messages

*  json encoding is not supported for the following reasons:
   - It brings a lot of fragile code, which works inefficiently.
   - json encoding is impossible to use with language SDK.

* splits metrics and logs structures at lib/protoparser/opentelemetry/pb package.

* adds docs with examples for opentelemetry logs.

---
Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4839

Co-authored-by: AndrewChubatiuk <andrew.chubatiuk@gmail.com>
Co-authored-by: f41gh7 <nik@victoriametrics.com>
2024-09-03 20:24:01 +02:00
f41gh7
dcc525b388
follow-up after 1731c0eabf
* updates change log
* adds VL-Debug http header
* updates doc
* extracts only the first value of http headers for VL-Stream-Fields and VL-Ignore-Fields.
  It makes behaviour the same as Query string args. And allows to easily configure client applications.
  Since most of the client collectors don't support multi value headers.

Signed-off-by: f41gh7 <nik@victoriametrics.com>
2024-09-03 20:24:01 +02:00
Andrii Chubatiuk
d5fe4566e5
app/vlinsert: support getting _msg_field, _time_field, _stream_fields and _ignore_fields from headers
*  Many collectors don't support forwarding url query params to the remote system. It makes impossible to define stream fields for it. Workaround with proxy between VictoriaLogs and log shipper is too complicated solution.

* This commit adds the following changes:
 * Adds fallback to to headers params, if query param is empty for:
     _msg_field -> VL-Msg-Field
    _stream_fields -> VL-Stream-Fields
    _ignore_fields -> VL-Ignore-Fields
    _time_field -> VL-Time-Field
 * removes deprecations from victorialogs compose files, added more
output format examples for logstash, telegraf, fluent-bit

 related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5310
2024-09-03 20:24:00 +02:00
Zakhar Bessarab
a3a0bafe76
app/vlinsert/elasticsearch: add fake response for logstash requests (#6742)
### Describe Your Changes

This is needed in order to support standard Elasticsearch output in
Logstash pipelines.

See: https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6660

### Checklist

The following checks are **mandatory**:

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
(cherry picked from commit 58b6c54da2)
2024-08-06 16:30:11 +02:00
Aliaksandr Valialkin
a1decb5ca1
app/vlinsert/loki: use easyproto instead for parsing Loki protobuf messages 2024-07-10 03:05:55 +02:00
Aliaksandr Valialkin
73ca22bb7d
app/vlinsert/loki: remove unused functions from the generated protobuf code 2024-07-10 00:22:10 +02:00
Aliaksandr Valialkin
0912a652d5
app/vlinsert/insertutils: flush the ingested logs from in-memory buffer to storage every second
Previously the in-memory buffer could remain unflushed for long periods of time under low ingestion rate.
The ingested logs weren't visible for search during this time.
2024-07-02 01:39:45 +02:00
Aliaksandr Valialkin
ab28a1f93e
app/vlinsert/syslog: add an ability to use log ingestion time as the _time field 2024-07-02 01:39:45 +02:00
Aliaksandr Valialkin
c9fc8079c4
app/vlinsert/syslog: properly skip empty lines in Syslog protocol
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6548
2024-06-28 14:09:45 +02:00
Aliaksandr Valialkin
f24123a776
lib/logstorage: parse syslog structured data into separate fields in order to simplify further querying of this data 2024-06-25 14:54:25 +02:00
Aliaksandr Valialkin
1716c4e609
lib/logstorage: properly parse timezone offset at TryParseTimestampRFC3339Nano()
The TryParseTimestampRFC3339Nano() must properly parse RFC3339 timestamps with timezone offsets.

While at it, make tryParseTimestampISO8601 function private in order to prevent
from improper usage of this function from outside the lib/logstorage package.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6508
2024-06-25 14:54:24 +02:00
Aliaksandr Valialkin
ee114ca59c
app/vlinsert: properly parse timestamps with nanosecond precision at /insert/jsonline HTTP endpoint
This has been broken in 2b6a634ec0
2024-06-18 00:24:11 +02:00
Aliaksandr Valialkin
c10a646d19
app/vlinsert/syslog: allow accepting syslog messages with different configs at different ports 2024-06-17 23:16:58 +02:00
Aliaksandr Valialkin
f74e6b0674
app/vlinsert: properly parse length-delimited syslog messages sent over TCP according to RFC5425 2024-06-17 22:30:28 +02:00
Aliaksandr Valialkin
1750991119
lib/logstorage: work-in-progress 2024-06-17 12:13:25 +02:00
Aliaksandr Valialkin
3152df2bce
lib/logstorage: work-in-progress 2024-05-25 00:31:55 +02:00
Aliaksandr Valialkin
04d0dd2542
lib/logstorage: work-in-progress 2024-05-22 21:01:28 +02:00
Aliaksandr Valialkin
582e7d5439
lib/logstorage: work-in-progress 2024-05-20 04:09:15 +02:00
Aliaksandr Valialkin
00f59d6ddf
all: fix golangci-lint(revive) warnings after 0c0ed61ce7
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6001
2024-04-03 03:00:45 +03:00
Khanh Quoc Le
03e5ebaea9
Add _stream fields log (#5068) 2023-11-17 16:04:13 +01:00
Aliaksandr Valialkin
36a1fdca6c
all: consistently use %w instead of %s in when error is passed to fmt.Errorf()
This allows consistently using errors.Is() for verifying whether the given error wraps some other known error.
2023-10-26 09:44:40 +02:00
Aliaksandr Valialkin
120f3bc467
lib/logstorage: follow-up for 8a23d08c21
- Compare the actual free disk space to the value provided via -storage.minFreeDiskSpaceBytes
  directly inside the Storage.IsReadOnly(). This should work fast in most cases.
  This simplifies the logic at lib/storage.

- Do not take into account -storage.minFreeDiskSpaceBytes during background merges, since
  it results in uncontrolled growth of small parts when the free disk space approaches -storage.minFreeDiskSpaceBytes.
  The background merge logic uses another mechanism for determining whether there is enough
  disk space for the merge - it reserves the needed disk space before the merge
  and releases it after the merge. This prevents from out of disk space errors during background merge.

- Properly handle corner cases for flushing in-memory data to disk when the storage
  enters read-only mode. This is better than losing the in-memory data.

- Return back Storage.MustAddRows() instead of Storage.AddRows(),
  since the only case when AddRows() can return error is when the storage is in read-only mode.
  This case must be handled by the caller by calling Storage.IsReadOnly()
  before adding rows to the storage.
  This simplifies the code a bit, since the caller of Storage.MustAddRows() shouldn't handle
  errors returned by Storage.AddRows().

- Properly store parsed logs to Storage if parts of the request contain invalid log lines.
  Previously the parsed logs could be lost in this case.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4737
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4945
2023-10-02 20:38:00 +02:00
hagen1778
25a006099d
app/vlinsert/loki: make fmt
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-10-02 20:35:45 +02:00
Zakhar Bessarab
dfdada055c
lib/logstorage: switch to read-only mode when running out of disk space (#4945)
* lib/logstorage: switch to read-only mode when running out of disk space

Added support of `--storage.minFreeDiskSpaceBytes` command-line flag to allow graceful handling of running out of disk space at `--storageDataPath`.

See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4737
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/logstorage: fix error handling logic during merge

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/logstorage: fix log level

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
2023-10-02 17:09:57 +02:00
Aliaksandr Valialkin
cc8f2bee0d
app/vlinsert: follow-up for d570763c91
- Switch from summary to histogram for vl_http_request_duration_seconds metric.
  This allows calculating request duration quantiles across multiple hosts
  via histogram_quantile(0.99, sum(vl_http_request_duration_seconds_bucket) by (vmrange)).
- Take into account only successfully processed data ingestion requests
  when updating vl_http_request_duration_seconds histogram.
  Failed requests are ignored, since they may significantly skew measurements.
- Clarify the description of the change at docs/VictoriaLogs/CHANGELOG.md.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4934
2023-09-19 00:45:28 +02:00
Aliaksandr Valialkin
9e50ea6cce
app/vlinsert/insertutils: cosmetic changes after 8d3e574c31 2023-09-19 00:35:51 +02:00
crossoverJie
fb13887573
app/vlinsert: Add vl_http_request_duration_seconds metrics (#4934) 2023-09-19 00:10:24 +02:00
Zakhar Bessarab
b0c8f7f718
app/vlinsert: add flag to limit amount of fields per line (#4976)
Adding limit on ingestion allows to avoid issues like this one https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4762
Such issues are often caused by misconfigurtion on log persing/ingestion side and preventing such rows from being ingested allows to avoid performance implications created by storing such log rows.

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-09-18 12:00:02 +02:00
Zakhar Bessarab
82060727ec
app/vlinsert/loki: add handler for healthcheck endpoint (#4885)
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-08-30 16:37:54 +02:00
Zakhar Bessarab
bae701e612
app/vlinsert/elasticsearch: add a command-line flag to provide ES version (#4778)
* app/vlinsert/elasticsearch: add a command-line flag to provide ES version

Adds a flag which will allow to change version which will be reported by ES endpoint for compatibility checks performed by external logs shippers(such as filebeat).
See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4777

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* Document the -elasticsearch.version command-line flag

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4777

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-08-11 06:57:01 -07:00
Zakhar Bessarab
d800545055
app/vlinsert/loki: manually remove bloat dependecies for generate proto file (#4686)
Co-authored-by: f41gh7 <nik@victoriametrics.com>
2023-07-22 15:37:30 -07:00
Aliaksandr Valialkin
8c59813c17
app/vlinsert/loki: fix build for architectures where int is 32-bit 2023-07-20 21:55:28 -07:00
Aliaksandr Valialkin
30098ac8bd
app/vlinsert/loki: follow-up after 09df5b66fd
- Parse protobuf if Content-Type isn't set to `application/json` - this behavior is documented at https://grafana.com/docs/loki/latest/api/#push-log-entries-to-loki

- Properly handle gzip'ped JSON requests. The `gzip` header must be read from `Content-Encoding` instead of `Content-Type` header

- Properly flush all the parsed logs with the explicit call to vlstorage.MustAddRows() at the end of query handler

- Check JSON field types more strictly.

- Allow parsing Loki timestamp as floating-point number. Such a timestamp can be generated by some clients,
  which store timestamps in float64 instead of int64.

- Optimize parsing of Loki labels in Prometheus text exposition format.

- Simplify tests.

- Remove lib/slicesutil, since there are no more users for it.

- Update docs with missing info and fix various typos. For example, it should be enough to have `instance` and `job` labels
  as stream fields in most Loki setups.

- Allow empty of missing timestamps in the ingested logs.
  The current timestamp at VictoriaLogs side is then used for the ingested logs.
  This simplifies debugging and testing of the provided HTTP-based data ingestion APIs.

The remaining MAJOR issue, which needs to be addressed: victoria-logs binary size increased from 13MB to 22MB
after adding support for Loki data ingestion protocol at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4482 .
This is because of shitty protobuf dependencies. They must be replaced with another protobuf implementation
similar to the one used at lib/prompb or lib/prompbmarshal .
2023-07-20 21:52:11 -07:00
Zakhar Bessarab
7e24edd506
app/vlinsert/loki: fix compatibility with latest MetricsQL lib (#4675)
Loki uses default labels format without "or" operator. This format can't create a list of LabelFilters, so only first set of LabelFilters should be used.

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-07-20 17:06:09 -07:00