mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git
synced 2024-11-21 14:44:00 +00:00

docs/VictoriaLogs: change the structure of the docs in order to be more maintainable

The change is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4477

This commit is contained in:
parent e21b3bceab
commit fd6c2dd02e
10 changed files with 580 additions and 520 deletions

@@ -28,7 +28,7 @@ var (
 	logNewStreams = flag.Bool("logNewStreams", false, "Whether to log creation of new streams; this can be useful for debugging of high cardinality issues with log streams; "+
 		"see https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#stream-fields ; see also -logIngestedRows")
 	logIngestedRows = flag.Bool("logIngestedRows", false, "Whether to log all the ingested log entries; this can be useful for debugging of data ingestion; "+
-		"see https://docs.victoriametrics.com/VictoriaLogs/#data-ingestion ; see also -logNewStreams")
+		"see https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/ ; see also -logNewStreams")
 )

 // Init initializes vlstorage.

@@ -1,6 +1,7 @@
 # LogsQL

-LogsQL is a simple yet powerful query language for VictoriaLogs. It provides the following features:
+LogsQL is a simple yet powerful query language for [VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/).
+It provides the following features:

 - Full-text search across [log fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model).
   See [word filter](#word-filter), [phrase filter](#phrase-filter) and [prefix filter](#prefix-filter).

@@ -13,9 +14,9 @@ LogsQL is a simple yet powerful query language for VictoriaLogs. It provides the
 If you aren't familiar with VictoriaLogs, then start with [key concepts docs](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html).

 Then follow these docs:
-- [How to run VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/#how-to-run-victorialogs).
-- [how to ingest data into VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/#data-ingestion).
-- [How to query VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/#querying).
+- [How to run VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/QuickStart.html).
+- [how to ingest data into VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/).
+- [How to query VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/querying/).

 The simplest LogsQL query is just a [word](#word), which must be found in the [log message](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#message-field).
 For example, the following query finds all the logs with the `error` word:

@@ -148,7 +149,7 @@ _time:[now-5m,now] log.level:error !app:(buggy_app OR foobar)

 The `app` field uniquely identifies the application instance if a single instance runs per each unique `app`.
 In this case it is recommended associating the `app` field with [log stream fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#stream-fields)
-during [data ingestion](https://docs.victoriametrics.com/VictoriaLogs/#data-ingestion). This usually improves both compression rate
+during [data ingestion](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/). This usually improves both compression rate
 and query performance when querying the needed streams via [`_stream` filter](#stream-filter).
 If the `app` field is associated with the log stream, then the query above can be rewritten to a more performant one:

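If the `app` field were registered as a stream field, the excluding filter in the hunk above could instead be expressed as a stream filter. This is only a sketch: the `_stream:{...}` negation syntax is assumed from the LogsQL stream-filter docs and should be verified there:

```
_time:[now-5m,now] log.level:error _stream:{app!~"buggy_app|foobar"}
```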

@@ -1001,7 +1002,7 @@ See the [Roadmap](https://docs.victoriametrics.com/VictoriaLogs/Roadmap.html) fo
 ## Transformations

 It is possible to perform various transformations on the [selected log entries](#filters) at client side
-with `jq`, `awk`, `cut`, etc. Unix commands according to [these docs](https://docs.victoriametrics.com/VictoriaLogs/#querying-via-command-line).
+with `jq`, `awk`, `cut`, etc. Unix commands according to [these docs](https://docs.victoriametrics.com/VictoriaLogs/querying/#command-line).

 LogsQL will support the following transformations for the [selected](#filters) log entries:

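To make the client-side transformation idea concrete: the JSON-lines output of `/select/logsql/query` (two sample lines copied from the example response elsewhere on this page) can be reduced to bare log messages with standard Unix tools; here `sed` stands in for `jq`/`awk`:

```shell
# Two sample response lines from /select/logsql/query (taken from the docs'
# example response). In a real session you would pipe the output of
# `curl ... -d 'query=error'` instead of printf.
printf '%s\n' \
  '{"_msg":"error: disconnect from 19.54.37.22: Auth fail [preauth]","_stream":"{}","_time":"2023-01-01T13:32:13Z"}' \
  '{"_msg":"some other error","_stream":"{}","_time":"2023-01-01T13:32:15Z"}' |
  sed 's/.*"_msg":"\([^"]*\)".*/\1/'   # keep only the log message value
```

This prints one bare message per line and works as long as the `_msg` values contain no embedded double quotes; `jq -r '._msg'` is the more robust choice for arbitrary JSON.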

@@ -1023,7 +1024,7 @@ See the [Roadmap](https://docs.victoriametrics.com/VictoriaLogs/Roadmap.html) fo
 ## Post-filters

 It is possible to perform post-filtering on the [selected log entries](#filters) at client side with `grep` or similar Unix commands
-according to [these docs](https://docs.victoriametrics.com/VictoriaLogs/#querying-via-command-line).
+according to [these docs](https://docs.victoriametrics.com/VictoriaLogs/querying/#command-line).

 LogsQL will support post-filtering on the original [log fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model)
 and fields created by various [transformations](#transformations). The following post-filters will be supported:


@@ -1036,7 +1037,7 @@ See the [Roadmap](https://docs.victoriametrics.com/VictoriaLogs/Roadmap.html) fo
 ## Stats

 It is possible to perform stats calculations on the [selected log entries](#filters) at client side with `sort`, `uniq`, etc. Unix commands
-according to [these docs](https://docs.victoriametrics.com/VictoriaLogs/#querying-via-command-line).
+according to [these docs](https://docs.victoriametrics.com/VictoriaLogs/querying/#command-line).

 LogsQL will support calculating the following stats based on the [log fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model)
 and fields created by [transformations](#transformations):


@@ -1058,10 +1059,10 @@ See the [Roadmap](https://docs.victoriametrics.com/VictoriaLogs/Roadmap.html) fo
 ## Sorting

 By default VictoriaLogs doesn't sort the returned results because of performance and efficiency concerns
-described [here](https://docs.victoriametrics.com/VictoriaLogs/#querying-via-command-line).
+described [here](https://docs.victoriametrics.com/VictoriaLogs/querying/#command-line).

 It is possible to sort the [selected log entries](#filters) at client side with `sort` Unix command
-according to [these docs](https://docs.victoriametrics.com/VictoriaLogs/#querying-via-command-line).
+according to [these docs](https://docs.victoriametrics.com/VictoriaLogs/querying/#command-line).

 LogsQL will support results' sorting by the given set of [log fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model).

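A minimal client-side sorting sketch over the sample JSON lines from this page: with `"` as the field separator, the `_time` value is the 12th field, so `sort -t'"' -k12` orders entries by timestamp. This assumes the `_msg` values contain no embedded double quotes:

```shell
# Sort JSON-lines query output by the _time field (the 12th quote-delimited
# field). Assumes _msg contains no double quotes; `jq -s 'sort_by(._time)'`
# is the more robust alternative.
printf '%s\n' \
  '{"_msg":"some other error","_stream":"{}","_time":"2023-01-01T13:32:15Z"}' \
  '{"_msg":"error: disconnect from 19.54.37.22: Auth fail [preauth]","_stream":"{}","_time":"2023-01-01T13:32:13Z"}' |
  sort -t'"' -k12
```

The entry with the earlier `_time` (`13:32:13Z`) comes out first, even though plain `sort` without a key would have ordered the lines by their `_msg` values instead.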

@@ -1070,7 +1071,7 @@ See the [Roadmap](https://docs.victoriametrics.com/VictoriaLogs/Roadmap.html) fo
 ## Limiters

 It is possible to limit the returned results with `head`, `tail`, `less`, etc. Unix commands
-according to [these docs](https://docs.victoriametrics.com/VictoriaLogs/#querying-via-command-line).
+according to [these docs](https://docs.victoriametrics.com/VictoriaLogs/querying/#command-line).

 LogsQL will support the ability to limit the number of returned results alongside the ability to page the returned results.
 Additionally, LogsQL will provide the ability to select fields, which must be returned in the response.

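The reason `head` works so well as a limiter is pipe semantics: once `head` has read enough lines it closes the pipe, and the producer is stopped by `SIGPIPE`. This mirrors how VictoriaLogs cancels a query when the client closes the response stream, as described later in this page. A self-contained illustration:

```shell
# head exits after printing two lines and closes the pipe; seq then receives
# SIGPIPE and stops producing, so the million-line stream costs almost nothing.
seq 1 1000000 | head -n 2
```

Replace `seq` with the `curl ... -d 'query=...'` pipeline from the querying docs to get the same early-termination behavior against a live server.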
68 docs/VictoriaLogs/QuickStart.md Normal file

@@ -0,0 +1,68 @@
+# VictoriaLogs Quick Start
+
+It is recommended to read the [README](https://docs.victoriametrics.com/VictoriaLogs/)
+and [Key Concepts](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html)
+before you start working with VictoriaLogs.
+
+## How to install and run VictoriaLogs
+
+The following options exist:
+
+- [To run Docker image](#docker-image)
+- [To build VictoriaLogs from source code](#building-from-source-code)
+
+### Docker image
+
+You can run VictoriaLogs in a Docker container. It is the easiest way to start using VictoriaLogs.
+Here is the command to run VictoriaLogs in a Docker container:
+
+```bash
+docker run --rm -it -p 9428:9428 -v ./victoria-logs-data:/victoria-logs-data \
+  docker.io/victoriametrics/victoria-logs:heads-public-single-node-0-ga638f5e2b
+```
+
+### Building from source code
+
+Follow these steps to build VictoriaLogs from source code:
+
+- Check out the VictoriaLogs source code. It is located in the VictoriaMetrics repository:
+
+  ```bash
+  git clone https://github.com/VictoriaMetrics/VictoriaMetrics
+  cd VictoriaMetrics
+  ```
+
+- Build VictoriaLogs. The build command requires [Go 1.20](https://golang.org/doc/install).
+
+  ```bash
+  make victoria-logs
+  ```
+
+- Run the built binary:
+
+  ```bash
+  bin/victoria-logs
+  ```
+
+VictoriaLogs is now ready for [data ingestion](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/)
+and [querying](https://docs.victoriametrics.com/VictoriaLogs/querying/) at the TCP port `9428`!
+It has no external dependencies, so it may run in various environments without additional setup and configuration.
+VictoriaLogs automatically adapts to the available CPU and RAM resources. It also automatically creates
+the needed indexes during [data ingestion](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/).
+
+It is possible to change the TCP port via the `-httpListenAddr` command-line flag. For example, the following command
+starts VictoriaLogs, which accepts incoming requests at port `9200` (aka the ElasticSearch HTTP API port):
+
+```bash
+/path/to/victoria-logs -httpListenAddr=:9200
+```
+
+VictoriaLogs stores the ingested data in the `victoria-logs-data` directory by default. The directory can be changed
+via the `-storageDataPath` command-line flag. See [these docs](https://docs.victoriametrics.com/VictoriaLogs/#storage) for details.
+
+By default VictoriaLogs stores [log entries](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html) with timestamps
+in the time range `[now-7d, now]`, while dropping logs outside the given time range.
+E.g. it uses a retention of 7 days. Read [these docs](https://docs.victoriametrics.com/VictoriaLogs/#retention) on how to control the retention
+for the [ingested](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/) logs.
+
+It is recommended to set up monitoring of VictoriaLogs according to [these docs](https://docs.victoriametrics.com/VictoriaLogs/#monitoring).
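For convenience, the two command-line flags described above can be combined in a single invocation; this is only a sketch restating the documented defaults explicitly:

```bash
# Explicitly restating the defaults described above: listen on :9428 and
# store data in ./victoria-logs-data. Adjust both values for your environment.
/path/to/victoria-logs \
  -httpListenAddr=:9428 \
  -storageDataPath=./victoria-logs-data
```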

@@ -4,20 +4,17 @@ VictoriaLogs is log management and log analytics system from [VictoriaMetrics](h

 It provides the following key features:

-- VictoriaLogs can accept logs from popular log collectors, which support
-  [ElasticSearch data ingestion format](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html). See [these docs](#data-ingestion).
-  [Grafana Loki data ingestion format](https://grafana.com/docs/loki/latest/api/#push-log-entries-to-loki) will be supported in the near future -
-  see [the Roadmap](https://docs.victoriametrics.com/VictoriaLogs/Roadmap.html).
-- VictoriaLogs is much easier to set up and operate compared to ElasticSearch and Grafana Loki. See [these docs](#operation).
+- VictoriaLogs can accept logs from popular log collectors. See [these docs](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/).
+- VictoriaLogs is much easier to set up and operate compared to ElasticSearch and Grafana Loki. See [these docs](https://docs.victoriametrics.com/VictoriaLogs/QuickStart.md).
 - VictoriaLogs provides an easy yet powerful query language with full-text search capabilities across
   all the [log fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model) -
   see [LogsQL docs](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html).
 - VictoriaLogs can be seamlessly combined with good old Unix tools for log analysis such as `grep`, `less`, `sort`, `jq`, etc.
-  See [these docs](#querying-via-command-line) for details.
+  See [these docs](https://docs.victoriametrics.com/VictoriaLogs/querying/#command-line) for details.
 - VictoriaLogs capacity and performance scales linearly with the available resources (CPU, RAM, disk IO, disk space).
   It runs smoothly on both Raspberry PI and a server with hundreds of CPU cores and terabytes of RAM.
 - VictoriaLogs can handle much bigger data volumes than ElasticSearch and Grafana Loki when running on comparable hardware.
-- VictoriaLogs supports multitenancy - see [these docs](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#multitenancy).
+- VictoriaLogs supports multitenancy - see [these docs](#multitenancy).
 - VictoriaLogs supports out of order logs' ingestion aka backfilling.

 VictoriaLogs is at Preview stage now. It is ready for evaluation in production and for verifying the claims given above.


@@ -26,470 +23,7 @@ See the [Roadmap](https://docs.victoriametrics.com/VictoriaLogs/Roadmap.html) fo

 If you have questions about VictoriaLogs, then feel free to ask them at [VictoriaMetrics community Slack chat](https://slack.victoriametrics.com/).

-## Operation
-
-### How to run VictoriaLogs
-
-The following options exist now:
-
-- To run Docker image:
-
-  ```bash
-  docker run --rm -it -p 9428:9428 -v ./victoria-logs-data:/victoria-logs-data \
-    docker.io/victoriametrics/victoria-logs:heads-public-single-node-0-ga638f5e2b
-  ```
-
-- To build VictoriaLogs from source code:
-
-  Check out the VictoriaLogs source code. It is located in the VictoriaMetrics repository:
-
-  ```bash
-  git clone https://github.com/VictoriaMetrics/VictoriaMetrics
-  cd VictoriaMetrics
-  ```
-
-  Then build VictoriaLogs. The build command requires [Go 1.20](https://golang.org/doc/install).
-
-  ```bash
-  make victoria-logs
-  ```
-
-  Then run the built binary:
-
-  ```bash
-  bin/victoria-logs
-  ```
-
-VictoriaLogs is now ready to [receive logs](#data-ingestion) and [query logs](#querying) at the TCP port `9428`!
-It has no external dependencies, so it may run in various environments without additional setup and configuration.
-VictoriaLogs automatically adapts to the available CPU and RAM resources. It also automatically creates
-the needed indexes during [data ingestion](#data-ingestion).
-
-It is possible to change the TCP port via the `-httpListenAddr` command-line flag. For example, the following command
-starts VictoriaLogs, which accepts incoming requests at port `9200` (aka the ElasticSearch HTTP API port):
-
-```bash
-/path/to/victoria-logs -httpListenAddr=:9200
-```
-
-VictoriaLogs stores the ingested data in the `victoria-logs-data` directory by default. The directory can be changed
-via the `-storageDataPath` command-line flag. See [these docs](#storage) for details.
-
-By default VictoriaLogs stores log entries with timestamps in the time range `[now-7d, now]`, while dropping logs outside the given time range.
-E.g. it uses a retention of 7 days. Read [these docs](#retention) on how to control the retention for the [ingested](#data-ingestion) logs.
-
-It is recommended to set up monitoring of VictoriaLogs according to [these docs](#monitoring).
-
-### Data ingestion
-
-VictoriaLogs supports the following data ingestion approaches:
-
-- Via [Filebeat](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-overview.html). See [these docs](#filebeat-setup).
-- Via [Logstash](https://www.elastic.co/guide/en/logstash/current/introduction.html). See [these docs](#logstash-setup).
-
-The ingested logs can be queried according to [these docs](#querying).
-
-See also [data ingestion troubleshooting](#data-ingestion-troubleshooting) docs.
-
-#### Filebeat setup
-
-Specify the [`output.elasticsearch`](https://www.elastic.co/guide/en/beats/filebeat/current/elasticsearch-output.html) section in the `filebeat.yml`
-for sending the collected logs to VictoriaLogs:
-
-```yml
-output.elasticsearch:
-  hosts: ["http://localhost:9428/insert/elasticsearch/"]
-  parameters:
-    _msg_field: "message"
-    _time_field: "@timestamp"
-    _stream_fields: "host.hostname,log.file.path"
-```
-
-Substitute the `localhost:9428` address inside the `hosts` section with the real TCP address of VictoriaLogs.
-
-See [these docs](#data-ingestion-parameters) for details on the `parameters` section.
-
-It is recommended to verify whether the initial setup generates the needed [log fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model)
-and uses the correct [stream fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#stream-fields).
-This can be done by specifying the `debug` [parameter](#data-ingestion-parameters):
-
-```yml
-output.elasticsearch:
-  hosts: ["http://localhost:9428/insert/elasticsearch/"]
-  parameters:
-    _msg_field: "message"
-    _time_field: "@timestamp"
-    _stream_fields: "host.hostname,log.file.path"
-    debug: "1"
-```
-
-If some [log fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model) must be skipped
-during data ingestion, then they can be put into the `ignore_fields` [parameter](#data-ingestion-parameters).
-For example, the following config instructs VictoriaLogs to ignore the `log.offset` and `event.original` fields in the ingested logs:
-
-```yml
-output.elasticsearch:
-  hosts: ["http://localhost:9428/insert/elasticsearch/"]
-  parameters:
-    _msg_field: "message"
-    _time_field: "@timestamp"
-    _stream_fields: "host.name,log.file.path"
-    ignore_fields: "log.offset,event.original"
-```
-
-When Filebeat ingests logs into VictoriaLogs at a high rate, then it may be needed to tune the `worker` and `bulk_max_size` options.
-For example, the following config is optimized for a higher than usual ingestion rate:
-
-```yml
-output.elasticsearch:
-  hosts: ["http://localhost:9428/insert/elasticsearch/"]
-  parameters:
-    _msg_field: "message"
-    _time_field: "@timestamp"
-    _stream_fields: "host.name,log.file.path"
-  worker: 8
-  bulk_max_size: 1000
-```
-
-If Filebeat sends logs to VictoriaLogs in another datacenter, then it may be useful to enable data compression via the `compression_level` option.
-This usually allows saving network bandwidth and costs by up to 5 times:
-
-```yml
-output.elasticsearch:
-  hosts: ["http://localhost:9428/insert/elasticsearch/"]
-  parameters:
-    _msg_field: "message"
-    _time_field: "@timestamp"
-    _stream_fields: "host.name,log.file.path"
-  compression_level: 1
-```
-
-By default the ingested logs are stored in the `(AccountID=0, ProjectID=0)` [tenant](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#multitenancy).
-If you need to store logs in another tenant, then specify the needed tenant via `headers` at the `output.elasticsearch` section.
-For example, the following `filebeat.yml` config instructs Filebeat to store the data to the `(AccountID=12, ProjectID=34)` tenant:
-
-```yml
-output.elasticsearch:
-  hosts: ["http://localhost:9428/insert/elasticsearch/"]
-  headers:
-    AccountID: 12
-    ProjectID: 34
-  parameters:
-    _msg_field: "message"
-    _time_field: "@timestamp"
-    _stream_fields: "host.name,log.file.path"
-```
-
-The ingested log entries can be queried according to [these docs](#querying).
-
-See also [data ingestion troubleshooting](#data-ingestion-troubleshooting) docs.
-
-#### Logstash setup
-
-Specify the [`output.elasticsearch`](https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html) section in the `logstash.conf` file
-for sending the collected logs to VictoriaLogs:
-
-```conf
-output {
-  elasticsearch {
-    hosts => ["http://localhost:9428/insert/elasticsearch/"]
-    parameters => {
-      "_msg_field" => "message"
-      "_time_field" => "@timestamp"
-      "_stream_fields" => "host.name,process.name"
-    }
-  }
-}
-```
-
-Substitute the `localhost:9428` address inside `hosts` with the real TCP address of VictoriaLogs.
-
-See [these docs](#data-ingestion-parameters) for details on the `parameters` section.
-
-It is recommended to verify whether the initial setup generates the needed [log fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model)
-and uses the correct [stream fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#stream-fields).
-This can be done by specifying the `debug` [parameter](#data-ingestion-parameters):
-
-```conf
-output {
-  elasticsearch {
-    hosts => ["http://localhost:9428/insert/elasticsearch/"]
-    parameters => {
-      "_msg_field" => "message"
-      "_time_field" => "@timestamp"
-      "_stream_fields" => "host.name,process.name"
-      "debug" => "1"
-    }
-  }
-}
-```
-
-If some [log fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model) must be skipped
-during data ingestion, then they can be put into the `ignore_fields` [parameter](#data-ingestion-parameters).
-For example, the following config instructs VictoriaLogs to ignore the `log.offset` and `event.original` fields in the ingested logs:
-
-```conf
-output {
-  elasticsearch {
-    hosts => ["http://localhost:9428/insert/elasticsearch/"]
-    parameters => {
-      "_msg_field" => "message"
-      "_time_field" => "@timestamp"
-      "_stream_fields" => "host.hostname,process.name"
-      "ignore_fields" => "log.offset,event.original"
-    }
-  }
-}
-```
-
-If Logstash sends logs to VictoriaLogs in another datacenter, then it may be useful to enable data compression via the `http_compression => true` option.
-This usually allows saving network bandwidth and costs by up to 5 times:
-
-```conf
-output {
-  elasticsearch {
-    hosts => ["http://localhost:9428/insert/elasticsearch/"]
-    parameters => {
-      "_msg_field" => "message"
-      "_time_field" => "@timestamp"
-      "_stream_fields" => "host.hostname,process.name"
-    }
-    http_compression => true
-  }
-}
-```
-
-By default the ingested logs are stored in the `(AccountID=0, ProjectID=0)` [tenant](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#multitenancy).
-If you need to store logs in another tenant, then specify the needed tenant via `custom_headers` at the `output.elasticsearch` section.
-For example, the following `logstash.conf` config instructs Logstash to store the data to the `(AccountID=12, ProjectID=34)` tenant:
-
-```conf
-output {
-  elasticsearch {
-    hosts => ["http://localhost:9428/insert/elasticsearch/"]
-    custom_headers => {
-      "AccountID" => "12"
-      "ProjectID" => "34"
-    }
-    parameters => {
-      "_msg_field" => "message"
-      "_time_field" => "@timestamp"
-      "_stream_fields" => "host.hostname,process.name"
-    }
-  }
-}
-```
-
-The ingested log entries can be queried according to [these docs](#querying).
-
-See also [data ingestion troubleshooting](#data-ingestion-troubleshooting) docs.
-
-#### Data ingestion parameters
-
-VictoriaLogs accepts the following parameters at [data ingestion](#data-ingestion) HTTP APIs:
-
-- `_msg_field` - it must contain the name of the [log field](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model)
-  with the [log message](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#message-field) generated by the log shipper.
-  This is usually the `message` field for Filebeat and Logstash.
-  If the `_msg_field` parameter isn't set, then VictoriaLogs reads the log message from the `_msg` field.
-
-- `_time_field` - it must contain the name of the [log field](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model)
-  with the [log timestamp](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#time-field) generated by the log shipper.
-  This is usually the `@timestamp` field for Filebeat and Logstash.
-  If the `_time_field` parameter isn't set, then VictoriaLogs reads the timestamp from the `_time` field.
-  If this field doesn't exist, then the current timestamp is used.
-
-- `_stream_fields` - it should contain a comma-separated list of [log field](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model) names,
-  which uniquely identify every [log stream](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#stream-fields) collected by the log shipper.
-  If the `_stream_fields` parameter isn't set, then all the ingested logs are written to the default log stream - `{}`.
-
-- `ignore_fields` - this parameter may contain the list of [log field](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model) names,
-  which must be ignored during [data ingestion](#data-ingestion).
-
-- `debug` - if this parameter is set to `1`, then the [ingested](#data-ingestion) logs aren't stored in VictoriaLogs. Instead,
-  the ingested data is logged by VictoriaLogs, so it can be investigated later.
-
-#### Data ingestion troubleshooting
-
-VictoriaLogs provides the following command-line flags, which can help debugging data ingestion issues:
-
-- `-logNewStreams` - if this flag is passed to VictoriaLogs, then it logs all the newly
-  registered [log streams](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#stream-fields).
-  This may help debugging [high cardinality issues](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#high-cardinality).
-- `-logIngestedRows` - if this flag is passed to VictoriaLogs, then it logs all the ingested
-  [log entries](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model).
-  See also the `debug` [parameter](#data-ingestion-parameters).
-
-VictoriaLogs exposes various [metrics](#monitoring), which may help debugging data ingestion issues:
-
-- `vl_rows_ingested_total` - the number of ingested [log entries](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model)
-  since the last VictoriaLogs restart. If this number increases over time, then logs are successfully ingested into VictoriaLogs.
-  The ingested logs can be inspected in the following ways:
-  - By passing the `debug=1` parameter to every request to [data ingestion endpoints](#data-ingestion). The ingested rows aren't stored in VictoriaLogs
-    in this case. Instead, they are logged, so they can be investigated later. The `vl_rows_dropped_total` [metric](#monitoring) is incremented for each logged row.
-  - By passing the `-logIngestedRows` command-line flag to VictoriaLogs. In this case it logs all the ingested data, so it can be investigated later.
-- `vl_streams_created_total` - the number of created [log streams](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#stream-fields)
-  since the last VictoriaLogs restart. If this metric grows rapidly during extended periods of time, then this may lead
-  to [high cardinality issues](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#high-cardinality).
-  The newly created log streams can be inspected in logs by passing the `-logNewStreams` command-line flag to VictoriaLogs.
-
-### Querying
-
-VictoriaLogs can be queried at the `/select/logsql/query` endpoint. The [LogsQL](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html)
-query must be passed via the `query` argument. For example, the following query returns all the log entries with the `error` word:
-
-```bash
-curl http://localhost:9428/select/logsql/query -d 'query=error'
-```
-
-The `query` argument can be passed either in the request url itself (aka HTTP GET request) or via the request body
-with the `x-www-form-urlencoded` encoding (aka HTTP POST request). HTTP POST is useful for sending long queries
-when they do not fit the maximum url length of the used clients and proxies.
-
-See [LogsQL docs](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html) for details on what can be passed to the `query` arg.
-The `query` arg must be properly encoded with [percent encoding](https://en.wikipedia.org/wiki/URL_encoding) when passing it to `curl`
-or similar tools.
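One way to satisfy the percent-encoding requirement is `curl`'s `--data-urlencode` option, which encodes the value automatically; the multi-word query below is an illustrative assumption:

```bash
# --data-urlencode percent-encodes the LogsQL query value, so spaces and
# quotes in this (illustrative) query are transmitted safely via HTTP POST.
curl http://localhost:9428/select/logsql/query \
  --data-urlencode 'query=error AND "auth fail"'
```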
The `/select/logsql/query` endpoint returns [a stream of JSON lines](https://en.wikipedia.org/wiki/JSON_streaming#Line-delimited_JSON),
|
||||
where each line contains JSON-encoded log entry in the form `{field1="value1",...,fieldN="valueN"}`.
|
||||
Example response:
|
||||
|
||||
```
|
||||
{"_msg":"error: disconnect from 19.54.37.22: Auth fail [preauth]","_stream":"{}","_time":"2023-01-01T13:32:13Z"}
|
||||
{"_msg":"some other error","_stream":"{}","_time":"2023-01-01T13:32:15Z"}
|
||||
```
|
||||
|
||||
The matching lines are sent to the response stream as soon as they are found in VictoriaLogs storage.
|
||||
This means that the returned response may contain billions of lines for queries matching too many log entries.
|
||||
The response can be interrupted at any time by closing the connection to VictoriaLogs server.
|
||||
This allows post-processing the returned lines at the client side with the usual Unix commands such as `grep`, `jq`, `less`, `head`, etc.
|
||||
See [these docs](#querying-via-command-line) for more details.
|
||||
|
||||
The returned lines aren't sorted by default, since sorting disables the ability to send matching log entries to response stream as soon as they are found.
|
||||
Query results can be sorted either at VictoriaLogs side according [to these docs](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html#sorting)
|
||||
or at client side with the usual `sort` command according to [these docs](#querying-via-command-line).
|
||||
|
||||
By default the `(AccountID=0, ProjectID=0)` [tenant](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#multitenancy) is queried.
If you need to query another tenant, then specify it via HTTP request headers. For example, the following query searches
for log messages at the `(AccountID=12, ProjectID=34)` tenant:

```bash
curl http://localhost:9428/select/logsql/query -H 'AccountID: 12' -H 'ProjectID: 34' -d 'query=error'
```

The number of requests to `/select/logsql/query` can be [monitored](#monitoring) with the `vl_http_requests_total{path="/select/logsql/query"}` metric.

#### Querying via command-line

VictoriaLogs provides good integration with `curl` and other command-line tools because of the following features:

- VictoriaLogs sends the matching log entries to the response stream as soon as they are found.
  This allows forwarding the response stream to arbitrary [Unix pipes](https://en.wikipedia.org/wiki/Pipeline_(Unix)).
- VictoriaLogs automatically adjusts query execution speed to the speed of the client, which reads the response stream.
  For example, if the response stream is piped to the `less` command, then the query is suspended
  until the `less` command reads the next block from the response stream.
- VictoriaLogs automatically cancels query execution when the client closes the response stream.
  For example, if the query response is piped to the `head` command, then VictoriaLogs stops executing the query
  when the `head` command closes the response stream.

These features allow executing queries at the command-line interface, which potentially select billions of rows,
without the risk of high resource usage (CPU, RAM, disk IO) at the VictoriaLogs server.

For example, the following query can return a very big number of matching log entries (e.g. billions) if VictoriaLogs contains
many log messages with the `error` [word](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html#word):

```bash
curl http://localhost:9428/select/logsql/query -d 'query=error'
```

If the command returns a "never-ending" response, then just press `ctrl+C` at any time in order to cancel the query.
VictoriaLogs notices that the response stream is closed, so it cancels the query and instantly stops consuming CPU, RAM and disk IO for this query.

Then just use the `head` command for investigating the returned log messages and narrowing down the query:

```bash
curl http://localhost:9428/select/logsql/query -d 'query=error' | head -10
```

The `head -10` command reads only the first 10 log messages from the response and then closes the response stream.
This automatically cancels the query at the VictoriaLogs side, so it stops consuming CPU, RAM and disk IO resources.

Sometimes it may be more convenient to use the `less` command instead of `head` during the investigation of the returned response:

```bash
curl http://localhost:9428/select/logsql/query -d 'query=error' | less
```

The `less` command reads the response stream on demand, when the user scrolls down the output.
VictoriaLogs suspends query execution when `less` stops reading the response stream.
It doesn't consume CPU and disk IO resources during this time. It resumes query execution
when `less` continues reading the response stream.

Suppose that the initial investigation of the returned query results helped determining that the needed log messages contain
the `cannot open file` [phrase](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html#phrase-filter).
Then the query can be narrowed down to `error AND "cannot open file"`
(see [these docs](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html#logical-filter) about the `AND` operator).
Then run the updated command in order to continue the investigation:

```bash
curl http://localhost:9428/select/logsql/query -d 'query=error AND "cannot open file"' | head
```

Note that the `query` arg must be properly encoded with [percent encoding](https://en.wikipedia.org/wiki/URL_encoding) when passing it to `curl`
or similar tools.
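
The encoding can be done with a small helper, sketched below; note that `curl` can also do it transparently via its `--data-urlencode` option. The `urlencode` function here is an illustration, not part of VictoriaLogs:

```bash
# urlencode: percent-encode a string for use in a URL query arg.
# Illustrative helper; curl's --data-urlencode performs the same encoding.
urlencode() {
  local LC_ALL=C s=$1 out='' c i
  for (( i = 0; i < ${#s}; i++ )); do
    c=${s:i:1}
    case $c in
      [a-zA-Z0-9.~_-]) out+=$c ;;                # unreserved chars pass through
      *) printf -v c '%%%02X' "'$c"; out+=$c ;;  # everything else becomes %XX
    esac
  done
  printf '%s\n' "$out"
}

urlencode 'error AND "cannot open file"'
# error%20AND%20%22cannot%20open%20file%22

# The same query passed to curl without manual encoding:
# curl http://localhost:9428/select/logsql/query --data-urlencode 'query=error AND "cannot open file"'
```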

The `pipe the query to "head" or "less" -> investigate the results -> refine the query` iteration
can be repeated multiple times until the needed log messages are found.

The returned VictoriaLogs query response can be post-processed with any combination of Unix commands,
which are usually used for log analysis - `grep`, `jq`, `awk`, `sort`, `uniq`, `wc`, etc.

For example, the following command uses the `wc -l` Unix command for counting the number of log messages
with the `error` [word](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html#word)
received from [streams](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#stream-fields) with the `app="nginx"` field
during the last 5 minutes:

```bash
curl http://localhost:9428/select/logsql/query -d 'query=_stream:{app="nginx"} AND _time:[now-5m,now] AND error' | wc -l
```

See [these docs](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html#stream-filter) about the `_stream` filter,
[these docs](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html#time-filter) about the `_time` filter
and [these docs](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html#logical-filter) about the `AND` operator.

The following example shows how to sort query results by the [`_time` field](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#time-field):

```bash
curl http://localhost:9428/select/logsql/query -d 'query=error' | jq -r '._time + " " + ._msg' | sort | less
```

This command uses `jq` for extracting the [`_time`](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#time-field)
and [`_msg`](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#message-field) fields from the returned results,
and piping them to the `sort` command.

Note that the `sort` command needs to read the whole response stream before returning the sorted results. So the command above
can take a non-trivial amount of time if the `query` returns too many results. The solution is to narrow down the `query`
before sorting the results. See [these tips](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html#performance-tips)
on how to narrow down query results.

The following example calculates stats on the number of log messages received during the last 5 minutes,
grouped by the `log.level` [field](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model):

```bash
curl http://localhost:9428/select/logsql/query -d 'query=_time:[now-5m,now] log.level:*' | jq -r '."log.level"' | sort | uniq -c
```

The query selects all the log messages with a non-empty `log.level` field via the ["any value" filter](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html#any-value-filter),
then pipes them to the `jq` command, which extracts the `log.level` field value from the returned JSON stream. The extracted `log.level` values
are sorted with the `sort` command and, finally, they are passed to the `uniq -c` command for calculating the needed stats.

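
The final `sort | uniq -c` stage can be tried offline on a hand-made stream of `log.level` values (the sample values below are made up), which shows the shape of the produced stats:

```bash
# Simulate the jq output with a hand-made stream of log.level values,
# then count how many times each level occurs, most frequent first.
printf '%s\n' error info error warn error \
  | sort \
  | uniq -c \
  | sort -rn
# The most frequent level is printed first: "3 error".
```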

See also:

- [Key concepts](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html).
- [LogsQL docs](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html).

## Monitoring

VictoriaLogs exposes internal metrics in Prometheus exposition format at the `http://localhost:9428/metrics` page.
It is recommended to set up monitoring of these metrics via VictoriaMetrics or [vmagent](https://docs.victoriametrics.com/vmagent.html).
@@ -498,8 +32,7 @@

VictoriaLogs emits its own logs to stdout. It is recommended investigating these logs during troubleshooting.

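
The `/metrics` endpoint above can be scraped with any Prometheus-compatible agent. A minimal scrape config sketch (the job name is arbitrary; adjust the target address and interval to your setup):

```yaml
scrape_configs:
  - job_name: victorialogs          # arbitrary job name
    scrape_interval: 30s
    static_configs:
      - targets: ["localhost:9428"]  # scrapes /metrics by default
```
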
## Retention

By default VictoriaLogs stores log entries with timestamps in the time range `[now-7d, now]`, while dropping logs outside the given time range.
E.g. it uses the retention of 7 days. The retention can be configured with the `-retentionPeriod` command-line flag.

@@ -512,11 +45,11 @@
For example, the following command starts VictoriaLogs with the retention of 8 weeks:

```bash
/path/to/victoria-logs -retentionPeriod=8w
```

VictoriaLogs stores the [ingested](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/) logs in per-day partition directories.
It automatically drops partition directories outside the configured retention.

VictoriaLogs automatically drops logs at the [data ingestion](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/) stage
if they have timestamps outside the configured retention. A sample of dropped logs is logged with a `WARN` message in order to simplify troubleshooting.
The `vl_rows_dropped_total` [metric](#monitoring) is incremented each time an ingested log entry is dropped because of a timestamp outside the retention.
It is recommended setting up the following alerting rule at [vmalert](https://docs.victoriametrics.com/vmalert.html) in order to be notified
when logs with wrong timestamps are ingested into VictoriaLogs:

@@ -536,7 +69,7 @@
For example, the following command starts VictoriaLogs, which accepts logs with timestamps up to a year in the future:

```bash
/path/to/victoria-logs -futureRetention=1y
```

## Storage

VictoriaLogs stores all its data in a single directory - `victoria-logs-data`. The path to the directory can be changed via the `-storageDataPath` command-line flag.
For example, the following command starts VictoriaLogs, which stores the data at `/var/lib/victoria-logs`:

```bash
/path/to/victoria-logs -storageDataPath=/var/lib/victoria-logs
```

VictoriaLogs automatically creates the `-storageDataPath` directory on the first run if it is missing.

## Multitenancy

VictoriaLogs supports multitenancy. A tenant is identified by an `(AccountID, ProjectID)` pair, where `AccountID` and `ProjectID` are arbitrary 32-bit unsigned integers.
The `AccountID` and `ProjectID` fields can be set during [data ingestion](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/)
and [querying](https://docs.victoriametrics.com/VictoriaLogs/querying/) via `AccountID` and `ProjectID` request headers.

If the `AccountID` and/or `ProjectID` request headers aren't set, then the default `0` value is used.

VictoriaLogs has very low overhead for per-tenant management, so it is OK to have thousands of tenants in a single VictoriaLogs instance.

VictoriaLogs doesn't perform per-tenant authorization. Use [vmauth](https://docs.victoriametrics.com/vmauth.html) or similar tools for per-tenant authorization.

# VictoriaLogs roadmap

The [VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/) Preview is ready for evaluation in production.
It is recommended running it alongside the existing solutions such as ElasticSearch and Grafana Loki
and comparing their resource usage and usability.
It isn't recommended migrating from existing solutions to VictoriaLogs Preview yet.

The following functionality is available in VictoriaLogs Preview:

- [Data ingestion](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/).
- [Querying](https://docs.victoriametrics.com/VictoriaLogs/querying/).
- [Querying via command-line](https://docs.victoriametrics.com/VictoriaLogs/querying/#command-line).

See [these docs](https://docs.victoriametrics.com/VictoriaLogs/) for details.

The following functionality is planned in the future versions of VictoriaLogs:

- Support for [data ingestion](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/) from popular log collectors and formats:
  - Promtail (aka Grafana Loki)
  - Vector.dev
  - Fluentbit

docs/VictoriaLogs/data-ingestion/Filebeat.md (new file, 93 lines)

# Filebeat setup

Specify the [`output.elasticsearch`](https://www.elastic.co/guide/en/beats/filebeat/current/elasticsearch-output.html) section in the `filebeat.yml`
for sending the collected logs to [VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/):

```yml
output.elasticsearch:
  hosts: ["http://localhost:9428/insert/elasticsearch/"]
  parameters:
    _msg_field: "message"
    _time_field: "@timestamp"
    _stream_fields: "host.hostname,log.file.path"
```

Substitute the `localhost:9428` address inside the `hosts` section with the real TCP address of VictoriaLogs.

See [these docs](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/#http-parameters) for details on the `parameters` section.

It is recommended verifying whether the initial setup generates the needed [log fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model)
and uses the correct [stream fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#stream-fields).
This can be done by specifying the `debug` [parameter](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/#http-parameters)
and then inspecting VictoriaLogs logs:

```yml
output.elasticsearch:
  hosts: ["http://localhost:9428/insert/elasticsearch/"]
  parameters:
    _msg_field: "message"
    _time_field: "@timestamp"
    _stream_fields: "host.hostname,log.file.path"
    debug: "1"
```

If some [log fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model) must be skipped
during data ingestion, then they can be put into the `ignore_fields` [parameter](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/#http-parameters).
For example, the following config instructs VictoriaLogs to ignore the `log.offset` and `event.original` fields in the ingested logs:

```yml
output.elasticsearch:
  hosts: ["http://localhost:9428/insert/elasticsearch/"]
  parameters:
    _msg_field: "message"
    _time_field: "@timestamp"
    _stream_fields: "host.name,log.file.path"
    ignore_fields: "log.offset,event.original"
```

When Filebeat ingests logs into VictoriaLogs at a high rate, it may be needed to tune the `worker` and `bulk_max_size` options.
For example, the following config is optimized for higher than usual ingestion rate:

```yml
output.elasticsearch:
  hosts: ["http://localhost:9428/insert/elasticsearch/"]
  parameters:
    _msg_field: "message"
    _time_field: "@timestamp"
    _stream_fields: "host.name,log.file.path"
  worker: 8
  bulk_max_size: 1000
```

If Filebeat sends logs to VictoriaLogs in another datacenter, then it may be useful to enable data compression via the `compression_level` option.
This usually allows saving network bandwidth and costs by up to 5 times:

```yml
output.elasticsearch:
  hosts: ["http://localhost:9428/insert/elasticsearch/"]
  parameters:
    _msg_field: "message"
    _time_field: "@timestamp"
    _stream_fields: "host.name,log.file.path"
  compression_level: 1
```

By default the ingested logs are stored in the `(AccountID=0, ProjectID=0)` [tenant](https://docs.victoriametrics.com/VictoriaLogs/#multitenancy).
If you need to store logs in another tenant, then specify the needed tenant via `headers` at the `output.elasticsearch` section.
For example, the following `filebeat.yml` config instructs Filebeat to store the data to the `(AccountID=12, ProjectID=34)` tenant:

```yml
output.elasticsearch:
  hosts: ["http://localhost:9428/insert/elasticsearch/"]
  headers:
    AccountID: 12
    ProjectID: 34
  parameters:
    _msg_field: "message"
    _time_field: "@timestamp"
    _stream_fields: "host.name,log.file.path"
```

The ingested log entries can be queried according to [these docs](https://docs.victoriametrics.com/VictoriaLogs/querying/).

See also [data ingestion troubleshooting](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/#troubleshooting) docs.

docs/VictoriaLogs/data-ingestion/Logstash.md (new file, 100 lines)

# Logstash setup

Specify the [`output.elasticsearch`](https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html) section in the `logstash.conf` file
for sending the collected logs to [VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/):

```conf
output {
  elasticsearch {
    hosts => ["http://localhost:9428/insert/elasticsearch/"]
    parameters => {
      "_msg_field" => "message"
      "_time_field" => "@timestamp"
      "_stream_fields" => "host.name,process.name"
    }
  }
}
```

Substitute the `localhost:9428` address inside `hosts` with the real TCP address of VictoriaLogs.

See [these docs](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/#http-parameters) for details on the `parameters` section.

It is recommended verifying whether the initial setup generates the needed [log fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model)
and uses the correct [stream fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#stream-fields).
This can be done by specifying the `debug` [parameter](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/#http-parameters)
and then inspecting VictoriaLogs logs:

```conf
output {
  elasticsearch {
    hosts => ["http://localhost:9428/insert/elasticsearch/"]
    parameters => {
      "_msg_field" => "message"
      "_time_field" => "@timestamp"
      "_stream_fields" => "host.name,process.name"
      "debug" => "1"
    }
  }
}
```

If some [log fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model) must be skipped
during data ingestion, then they can be put into the `ignore_fields` [parameter](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/#http-parameters).
For example, the following config instructs VictoriaLogs to ignore the `log.offset` and `event.original` fields in the ingested logs:

```conf
output {
  elasticsearch {
    hosts => ["http://localhost:9428/insert/elasticsearch/"]
    parameters => {
      "_msg_field" => "message"
      "_time_field" => "@timestamp"
      "_stream_fields" => "host.hostname,process.name"
      "ignore_fields" => "log.offset,event.original"
    }
  }
}
```

If Logstash sends logs to VictoriaLogs in another datacenter, then it may be useful to enable data compression via the `http_compression => true` option.
This usually allows saving network bandwidth and costs by up to 5 times:

```conf
output {
  elasticsearch {
    hosts => ["http://localhost:9428/insert/elasticsearch/"]
    parameters => {
      "_msg_field" => "message"
      "_time_field" => "@timestamp"
      "_stream_fields" => "host.hostname,process.name"
    }
    http_compression => true
  }
}
```

By default the ingested logs are stored in the `(AccountID=0, ProjectID=0)` [tenant](https://docs.victoriametrics.com/VictoriaLogs/#multitenancy).
If you need to store logs in another tenant, then specify the needed tenant via `custom_headers` at the `output.elasticsearch` section.
For example, the following `logstash.conf` config instructs Logstash to store the data to the `(AccountID=12, ProjectID=34)` tenant:

```conf
output {
  elasticsearch {
    hosts => ["http://localhost:9428/insert/elasticsearch/"]
    custom_headers => {
      "AccountID" => "12"
      "ProjectID" => "34"
    }
    parameters => {
      "_msg_field" => "message"
      "_time_field" => "@timestamp"
      "_stream_fields" => "host.hostname,process.name"
    }
  }
}
```

The ingested log entries can be queried according to [these docs](https://docs.victoriametrics.com/VictoriaLogs/querying/).

See also [data ingestion troubleshooting](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/#troubleshooting) docs.

docs/VictoriaLogs/data-ingestion/README.md (new file, 106 lines)

# Data ingestion

[VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/) can accept logs from the following log collectors:

- Filebeat. See [how to setup Filebeat for sending logs to VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/Filebeat.html).
- Logstash. See [how to setup Logstash for sending logs to VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/Logstash.html).

The ingested logs can be queried according to [these docs](https://docs.victoriametrics.com/VictoriaLogs/querying/).

See also [data ingestion troubleshooting](#troubleshooting) docs.

## HTTP APIs

VictoriaLogs supports the following data ingestion HTTP APIs:

- Elasticsearch bulk API. See [these docs](#elasticsearch-bulk-api).
- JSON stream API aka [ndjson](http://ndjson.org/). See [these docs](#json-stream-api).

VictoriaLogs accepts optional [HTTP parameters](#http-parameters) at data ingestion HTTP APIs.

### Elasticsearch bulk API

VictoriaLogs accepts logs in [Elasticsearch bulk API](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html)
format at the `http://localhost:9428/insert/elasticsearch/_bulk` endpoint.

The following command pushes a single log line to Elasticsearch bulk API at VictoriaLogs:

```bash
echo '{"create":{}}
{"_msg":"cannot open file","_time":"2023-06-21T04:24:24Z","host.name":"host123"}
' | curl -X POST -H 'Content-Type: application/json' --data-binary @- http://localhost:9428/insert/elasticsearch/_bulk
```

The following command verifies that the data has been successfully pushed to VictoriaLogs by [querying](https://docs.victoriametrics.com/VictoriaLogs/querying/) it:

```bash
curl http://localhost:9428/select/logsql/query -d 'query=host.name:host123'
```

The command should return the following response:

```bash
{"_msg":"cannot open file","_stream":"{}","_time":"2023-06-21T04:24:24Z","host.name":"host123"}
```

### JSON stream API

TODO: document JSON stream API

### HTTP parameters

VictoriaLogs accepts the following parameters at [data ingestion HTTP APIs](#http-apis):

- `_msg_field` - it must contain the name of the [log field](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model)
  with the [log message](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#message-field) generated by the log shipper.
  This is usually the `message` field for Filebeat and Logstash.
  If the `_msg_field` parameter isn't set, then VictoriaLogs reads the log message from the `_msg` field.

- `_time_field` - it must contain the name of the [log field](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model)
  with the [log timestamp](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#time-field) generated by the log shipper.
  This is usually the `@timestamp` field for Filebeat and Logstash.
  If the `_time_field` parameter isn't set, then VictoriaLogs reads the timestamp from the `_time` field.
  If this field doesn't exist, then the current timestamp is used.

- `_stream_fields` - it should contain a comma-separated list of [log field](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model) names,
  which uniquely identify every [log stream](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#stream-fields) collected by the log shipper.
  If the `_stream_fields` parameter isn't set, then all the ingested logs are written to the default log stream - `{}`.

- `ignore_fields` - this parameter may contain the list of [log field](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model) names,
  which must be ignored during data ingestion.

- `debug` - if this parameter is set to `1`, then the ingested logs aren't stored in VictoriaLogs. Instead,
  the ingested data is logged by VictoriaLogs, so it can be investigated later.

See also [HTTP headers](#http-headers).
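
Taken together, these parameters are passed as query args on the ingestion URL. A sketch that assembles such a URL (the field names mirror the Filebeat examples above and are assumptions; adjust them to your log shipper):

```bash
# Assemble an ingestion URL carrying the HTTP parameters described above.
base='http://localhost:9428/insert/elasticsearch/_bulk'
params='_msg_field=message&_time_field=@timestamp&_stream_fields=host.name,log.file.path&ignore_fields=log.offset&debug=1'
url="$base?$params"
echo "$url"

# The URL can then be used for ingestion, e.g.:
# echo '{"create":{}}
# {"message":"cannot open file","@timestamp":"2023-06-21T04:24:24Z","host.name":"host123"}
# ' | curl -X POST -H 'Content-Type: application/json' --data-binary @- "$url"
```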

### HTTP headers

VictoriaLogs accepts optional `AccountID` and `ProjectID` headers at [data ingestion HTTP APIs](#http-apis).
These headers may contain the needed tenant to ingest data to. See [multitenancy docs](https://docs.victoriametrics.com/VictoriaLogs/#multitenancy) for details.

## Troubleshooting

VictoriaLogs provides the following command-line flags, which can help debugging data ingestion issues:

- `-logNewStreams` - if this flag is passed to VictoriaLogs, then it logs all the newly
  registered [log streams](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#stream-fields).
  This may help debugging [high cardinality issues](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#high-cardinality).
- `-logIngestedRows` - if this flag is passed to VictoriaLogs, then it logs all the ingested
  [log entries](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model).
  See also the `debug` [parameter](#http-parameters).

VictoriaLogs exposes various [metrics](https://docs.victoriametrics.com/VictoriaLogs/#monitoring), which may help debugging data ingestion issues:

- `vl_rows_ingested_total` - the number of ingested [log entries](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model)
  since the last VictoriaLogs restart. If this number increases over time, then logs are successfully ingested into VictoriaLogs.
  The ingested logs can be inspected in the following ways:
  - By passing the `debug=1` parameter to every request to [data ingestion APIs](#http-apis). The ingested rows aren't stored in VictoriaLogs
    in this case. Instead, they are logged, so they can be investigated later.
    The `vl_rows_dropped_total` [metric](https://docs.victoriametrics.com/VictoriaLogs/#monitoring) is incremented for each logged row.
  - By passing the `-logIngestedRows` command-line flag to VictoriaLogs. In this case it logs all the ingested data, so it can be investigated later.
- `vl_streams_created_total` - the number of created [log streams](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#stream-fields)
  since the last VictoriaLogs restart. If this metric grows rapidly during extended periods of time, then this may lead
  to [high cardinality issues](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#high-cardinality).
  The newly created log streams can be inspected in logs by passing the `-logNewStreams` command-line flag to VictoriaLogs.

@@ -2,7 +2,8 @@

## Data model

[VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/) works with structured logs.
Every log entry may contain arbitrary number of `key=value` pairs (aka fields).
A single log entry can be expressed as a single-level [JSON](https://www.json.org/json-en.html) object with string keys and values.
For example:

@@ -18,7 +19,7 @@

VictoriaLogs automatically transforms multi-level JSON (aka nested JSON) into single-level JSON
during [data ingestion](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/) according to the following rules:

- Nested dictionaries are flattened by concatenating dictionary keys with the `.` char. For example, the following multi-level JSON
  is transformed into the following single-level JSON:

@@ -61,7 +62,7 @@

Both label name and label value may contain arbitrary chars. Such chars must be encoded
during [data ingestion](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/)
according to [JSON string encoding](https://www.rfc-editor.org/rfc/rfc7159.html#section-7).
Unicode chars must be encoded with [UTF-8](https://en.wikipedia.org/wiki/UTF-8) encoding:

@@ -72,7 +73,7 @@

VictoriaLogs automatically indexes all the fields in all the [ingested](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/) logs.
This enables [full-text search](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html) across all the fields.

VictoriaLogs supports the following field types:

@@ -95,9 +96,9 @@ log entry, which can be ingested into VictoriaLogs:

If the actual log message has a field name other than `_msg`, then it is possible to specify the real log message field
via the `_msg_field` query arg during [data ingestion](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/).
For example, if the log message is located in the `event.original` field, then specify the `_msg_field=event.original` query arg
during [data ingestion](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/).

### Time field
|
||||
|
||||
|
@@ -112,9 +113,9 @@ For example:
```

If the actual timestamp has other than `_time` field name, then it is possible to specify the real timestamp
-field via `_time_field` query arg during [data ingestion](https://docs.victoriametrics.com/VictoriaLogs/#data-ingestion).
+field via `_time_field` query arg during [data ingestion](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/).
For example, if timestamp is located in the `event.created` field, then specify `_time_field=event.created` query arg
-during [data ingestion](https://docs.victoriametrics.com/VictoriaLogs/#data-ingestion).
+during [data ingestion](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/).

If `_time` field is missing, then the data ingestion time is used as log entry timestamp.
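The two hunks above rename the message and timestamp fields via `_msg_field` and `_time_field` query args. A client-side sketch of building such an ingestion URL is below; the `/insert/jsonline` path and query-arg names are assumed to match the data-ingestion docs linked above:

```python
from urllib.parse import urlencode

# Build an ingestion URL that tells VictoriaLogs where the real log message
# and timestamp live. Endpoint path and arg names are assumptions based on
# the data-ingestion docs; adjust them to your setup.
base = "http://localhost:9428/insert/jsonline"
args = {
    "_msg_field": "event.original",   # real log message field
    "_time_field": "event.created",   # real timestamp field
}
url = base + "?" + urlencode(args)
print(url)
```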
@@ -142,7 +143,7 @@ so it stores all the received log entries in a single default stream - `{}`.
This may lead to not-so-optimal resource usage and query performance.

Therefore it is recommended specifying stream-level fields via `_stream_fields` query arg
-during [data ingestion](https://docs.victoriametrics.com/VictoriaLogs/#data-ingestion).
+during [data ingestion](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/).
For example, if logs from Kubernetes containers have the following fields:

```json
@@ -156,7 +157,7 @@ For example, if logs from Kubernetes containers have the following fields:
```

then specify `_stream_fields=kubernetes.namespace,kubernetes.node.name,kubernetes.pod.name,kubernetes.container.name`
-query arg during [data ingestion](https://docs.victoriametrics.com/VictoriaLogs/#data-ingestion) in order to properly store
+query arg during [data ingestion](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/) in order to properly store
per-container logs into distinct streams.

#### How to determine which fields must be associated with log streams?
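To illustrate what "distinct streams" means here, the sketch below groups log entries by the tuple of stream-level field values, mirroring conceptually what `_stream_fields` does server-side. The field names and sample entries are illustrative only:

```python
from collections import defaultdict

# Partition log entries into streams keyed by the tuple of stream-level
# field values (a conceptual illustration, not VictoriaLogs internals).
STREAM_FIELDS = ("kubernetes.namespace", "kubernetes.pod.name")

def group_by_stream(entries):
    streams = defaultdict(list)
    for entry in entries:
        key = tuple(entry.get(f, "") for f in STREAM_FIELDS)
        streams[key].append(entry)
    return streams

entries = [
    {"kubernetes.namespace": "default", "kubernetes.pod.name": "nginx-1", "_msg": "ok"},
    {"kubernetes.namespace": "default", "kubernetes.pod.name": "nginx-2", "_msg": "err"},
    {"kubernetes.namespace": "default", "kubernetes.pod.name": "nginx-1", "_msg": "ok2"},
]
streams = group_by_stream(entries)
print(len(streams))  # 2 distinct streams
```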
@@ -185,8 +186,8 @@ VictoriaLogs works perfectly with such fields unless they are associated with [l
Never associate high-cardinality fields with [log streams](#stream-fields), since this may result
in the following issues:

-- Performance degradation during [data ingestion](https://docs.victoriametrics.com/VictoriaLogs/#data-ingestion)
-  and [querying](https://docs.victoriametrics.com/VictoriaLogs/#querying)
+- Performance degradation during [data ingestion](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/)
+  and [querying](https://docs.victoriametrics.com/VictoriaLogs/querying/)
- Increased memory usage
- Increased CPU usage
- Increased disk space usage
@@ -206,14 +207,3 @@ E.g. the `trace_id:XXXX-YYYY-ZZZZ` query usually works faster than the `_msg:"tr

See [LogsQL docs](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html) for more details.

-## Multitenancy
-
-VictoriaLogs supports multitenancy. A tenant is identified by `(AccountID, ProjectID)` pair, where `AccountID` and `ProjectID` are arbitrary 32-bit unsigned integers.
-The `AccountID` and `ProjectID` fields can be set during [data ingestion](https://docs.victoriametrics.com/VictoriaLogs/#data-ingestion)
-and [querying](https://docs.victoriametrics.com/VictoriaLogs/#querying) via `AccountID` and `ProjectID` request headers.
-
-If `AccountID` and/or `ProjectID` request headers aren't set, then the default `0` value is used.
-
-VictoriaLogs has very low overhead for per-tenant management, so it is OK to have thousands of tenants in a single VictoriaLogs instance.
-
-VictoriaLogs doesn't perform per-tenant authorization. Use [vmauth](https://docs.victoriametrics.com/vmauth.html) or similar tools for per-tenant authorization.
156 docs/VictoriaLogs/querying/README.md Normal file
@@ -0,0 +1,156 @@
# Querying

[VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/) can be queried at the `/select/logsql/query` endpoint.
The [LogsQL](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html) query must be passed via `query` argument.
For example, the following query returns all the log entries with the `error` word:

```bash
curl http://localhost:9428/select/logsql/query -d 'query=error'
```

The `query` argument can be passed either in the request url itself (aka HTTP GET request) or via request body
with the `x-www-form-urlencoded` encoding (aka HTTP POST request). The HTTP POST is useful for sending long queries
when they do not fit the maximum url length of the used clients and proxies.

See [LogsQL docs](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html) for details on what can be passed to the `query` arg.
The `query` arg must be properly encoded with [percent encoding](https://en.wikipedia.org/wiki/URL_encoding) when passing it to `curl`
or similar tools.
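As a quick sketch of what percent encoding produces, Python's standard library can be used (curl's `--data-urlencode` option performs the equivalent encoding for you):

```python
from urllib.parse import quote_plus

# Percent-encode a LogsQL query before embedding it in a URL,
# e.g. for an HTTP GET request to /select/logsql/query.
query = 'error AND "cannot open file"'
encoded = quote_plus(query)
print(encoded)  # error+AND+%22cannot+open+file%22
```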
The `/select/logsql/query` endpoint returns [a stream of JSON lines](http://ndjson.org/),
where each line contains JSON-encoded log entry in the form `{field1="value1",...,fieldN="valueN"}`.
Example response:

```
{"_msg":"error: disconnect from 19.54.37.22: Auth fail [preauth]","_stream":"{}","_time":"2023-01-01T13:32:13Z"}
{"_msg":"some other error","_stream":"{}","_time":"2023-01-01T13:32:15Z"}
```
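Because each response line is standalone JSON, the stream can be consumed incrementally. A sketch of client-side parsing, using the sample lines from the response above:

```python
import json

# Parse a stream of JSON lines (NDJSON), one log entry per line,
# as returned by /select/logsql/query. Sample lines from the docs above.
response_lines = [
    '{"_msg":"error: disconnect from 19.54.37.22: Auth fail [preauth]","_stream":"{}","_time":"2023-01-01T13:32:13Z"}',
    '{"_msg":"some other error","_stream":"{}","_time":"2023-01-01T13:32:15Z"}',
]
entries = [json.loads(line) for line in response_lines]
for entry in entries:
    print(entry["_time"], entry["_msg"])
```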
The matching lines are sent to the response stream as soon as they are found in VictoriaLogs storage.
This means that the returned response may contain billions of lines for queries matching too many log entries.
The response can be interrupted at any time by closing the connection to VictoriaLogs server.
This allows post-processing the returned lines at the client side with the usual Unix commands such as `grep`, `jq`, `less`, `head`, etc.
See [these docs](#command-line) for more details.

The returned lines aren't sorted by default, since sorting disables the ability to send matching log entries to response stream as soon as they are found.
Query results can be sorted either at VictoriaLogs side according [to these docs](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html#sorting)
or at client side with the usual `sort` command according to [these docs](#command-line).

By default the `(AccountID=0, ProjectID=0)` [tenant](https://docs.victoriametrics.com/VictoriaLogs/#multitenancy) is queried.
If you need to query another tenant, then specify the needed tenant via http request headers. For example, the following query searches
for log messages at `(AccountID=12, ProjectID=34)` tenant:

```bash
curl http://localhost:9428/select/logsql/query -H 'AccountID: 12' -H 'ProjectID: 34' -d 'query=error'
```

The number of requests to `/select/logsql/query` can be [monitored](https://docs.victoriametrics.com/VictoriaLogs/#monitoring)
with `vl_http_requests_total{path="/select/logsql/query"}` metric.
## Command-line

VictoriaLogs integrates well with `curl` and other command-line tools during querying because of the following features:

- VictoriaLogs sends the matching log entries to the response stream as soon as they are found.
  This allows forwarding the response stream to arbitrary [Unix pipes](https://en.wikipedia.org/wiki/Pipeline_(Unix)).
- VictoriaLogs automatically adjusts query execution speed to the speed of the client, which reads the response stream.
  For example, if the response stream is piped to `less` command, then the query is suspended
  until the `less` command reads the next block from the response stream.
- VictoriaLogs automatically cancels query execution when the client closes the response stream.
  For example, if the query response is piped to `head` command, then VictoriaLogs stops executing the query
  when the `head` command closes the response stream.

These features allow executing queries at command-line interface, which potentially select billions of rows,
without the risk of high resource usage (CPU, RAM, disk IO) at VictoriaLogs server.

For example, the following query can return a very big number of matching log entries (e.g. billions) if VictoriaLogs contains
many log messages with the `error` [word](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html#word):

```bash
curl http://localhost:9428/select/logsql/query -d 'query=error'
```

If the command returns "never-ending" response, then just press `ctrl+C` at any time in order to cancel the query.
VictoriaLogs notices that the response stream is closed, so it cancels the query and instantly stops consuming CPU, RAM and disk IO for this query.

Then just use `head` command for investigating the returned log messages and narrowing down the query:

```bash
curl http://localhost:9428/select/logsql/query -d 'query=error' | head -10
```

The `head -10` command reads only the first 10 log messages from the response and then closes the response stream.
This automatically cancels the query at VictoriaLogs side, so it stops consuming CPU, RAM and disk IO resources.
Sometimes it may be more convenient to use `less` command instead of `head` during the investigation of the returned response:

```bash
curl http://localhost:9428/select/logsql/query -d 'query=error' | less
```

The `less` command reads the response stream on demand, when the user scrolls down the output.
VictoriaLogs suspends query execution when `less` stops reading the response stream.
It doesn't consume CPU and disk IO resources during this time. It resumes query execution
when the `less` continues reading the response stream.

Suppose that the initial investigation of the returned query results helped determining that the needed log messages contain
`cannot open file` [phrase](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html#phrase-filter).
Then the query can be narrowed down to `error AND "cannot open file"`
(see [these docs](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html#logical-filter) about `AND` operator).
Then run the updated command in order to continue the investigation:

```bash
curl http://localhost:9428/select/logsql/query -d 'query=error AND "cannot open file"' | head
```

Note that the `query` arg must be properly encoded with [percent encoding](https://en.wikipedia.org/wiki/URL_encoding) when passing it to `curl`
or similar tools.

The `pipe the query to "head" or "less" -> investigate the results -> refine the query` iteration
can be repeated multiple times until the needed log messages are found.
The returned VictoriaLogs query response can be post-processed with any combination of Unix commands,
which are usually used for log analysis - `grep`, `jq`, `awk`, `sort`, `uniq`, `wc`, etc.

For example, the following command uses `wc -l` Unix command for counting the number of log messages
with the `error` [word](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html#word)
received from [streams](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#stream-fields) with `app="nginx"` field
during the last 5 minutes:

```bash
curl http://localhost:9428/select/logsql/query -d 'query=_stream:{app="nginx"} AND _time:[now-5m,now] AND error' | wc -l
```

See [these docs](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html#stream-filter) about `_stream` filter,
[these docs](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html#time-filter) about `_time` filter
and [these docs](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html#logical-filter) about `AND` operator.

The following example shows how to sort query results by the [`_time` field](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#time-field):

```bash
curl http://localhost:9428/select/logsql/query -d 'query=error' | jq -r '._time + " " + ._msg' | sort | less
```

This command uses `jq` for extracting [`_time`](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#time-field)
and [`_msg`](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#message-field) fields from the returned results,
and piping them to `sort` command.

Note that the `sort` command needs to read all the response stream before returning the sorted results. So the command above
can take non-trivial amounts of time if the `query` returns too many results. The solution is to narrow down the `query`
before sorting the results. See [these tips](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html#performance-tips)
on how to narrow down query results.

The following example calculates stats on the number of log messages received during the last 5 minutes
grouped by `log.level` [field](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model):

```bash
curl http://localhost:9428/select/logsql/query -d 'query=_time:[now-5m,now] log.level:*' | jq -r '."log.level"' | sort | uniq -c
```

The query selects all the log messages with non-empty `log.level` field via ["any value" filter](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html#any-value-filter),
then pipes them to `jq` command, which extracts the `log.level` field value from the returned JSON stream, then the extracted `log.level` values
are sorted with `sort` command and, finally, they are passed to `uniq -c` command for calculating the needed stats.
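The `sort | uniq -c` style aggregation above can also be done programmatically. A sketch over sample entries (the field values are made up for illustration):

```python
from collections import Counter

# Count log entries per `log.level` value, mirroring the
# `jq -r '."log.level"' | sort | uniq -c` pipeline.
entries = [
    {"log.level": "error"},
    {"log.level": "info"},
    {"log.level": "error"},
]
stats = Counter(entry["log.level"] for entry in entries)
print(stats)  # Counter({'error': 2, 'info': 1})
```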
See also:

- [Key concepts](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html).
- [LogsQL docs](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html).