mirror of
https://github.com/VictoriaMetrics/VictoriaMetrics.git
synced 2024-12-31 15:06:26 +00:00
wip
parent
7916bb2789
commit
6bd0bdc241
2 changed files with 107 additions and 47 deletions
|
@ -37,6 +37,8 @@ For example, the following query finds all the logs with `error` word:
|
|||
error
|
||||
```
|
||||
|
||||
See [how to send queries to VictoriaLogs](https://docs.victoriametrics.com/victorialogs/querying/).
|
||||
|
||||
If the queried [word](#word) clashes with LogsQL keywords, then just wrap it into quotes.
|
||||
For example, the following query finds all the log messages with `and` [word](#word):
|
||||
|
||||
|
@ -80,11 +82,32 @@ Typical LogsQL query consists of multiple [filters](#filters) joined with `AND`
|
|||
So LogsQL allows omitting `AND` words. For example, the following query is equivalent to the query above:
|
||||
|
||||
```logsql
|
||||
error _time:5m
|
||||
_time:5m error
|
||||
```
|
||||
|
||||
The query returns all the [log fields](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model) by default.
|
||||
See [how to query specific fields](#querying-specific-fields).
|
||||
The query returns logs in arbitrary order because sorting of big amounts of logs may require non-trivial amounts of CPU and RAM.
|
||||
The number of logs with the `error` word over the last 5 minutes usually isn't too big (e.g. fewer than a few million), so it is OK to sort them with [`sort` pipe](#sort-pipe).
|
||||
The following query sorts the selected logs by [`_time`](https://docs.victoriametrics.com/victorialogs/keyconcepts/#time-field) field:
|
||||
|
||||
```logsql
|
||||
_time:5m error | sort by (_time)
|
||||
```
|
||||
|
||||
It is unlikely you are going to investigate more than a few hundred logs returned by the query above. So you can limit the number of returned logs
|
||||
with [`limit` pipe](#limit-pipe). The following query returns the last 10 logs with the `error` word over the last 5 minutes:
|
||||
|
||||
```logsql
|
||||
_time:5m error | sort by (_time) desc | limit 10
|
||||
```
|
||||
|
||||
By default VictoriaLogs returns all the [log fields](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model).
|
||||
If you need only the given set of fields, then add [`fields` pipe](#fields-pipe) to the end of the query. For example, the following query returns only
|
||||
[`_time`](https://docs.victoriametrics.com/victorialogs/keyconcepts/#time-field), [`_stream`](https://docs.victoriametrics.com/victorialogs/keyconcepts/#stream-fields)
|
||||
and [`_msg`](https://docs.victoriametrics.com/victorialogs/keyconcepts/#message-field) fields:
|
||||
|
||||
```logsql
|
||||
error _time:5m | fields _time, _stream, _msg
|
||||
```
|
||||
|
||||
Suppose the query above selects too many rows because some buggy app pushes invalid error logs to VictoriaLogs. Suppose the app adds `buggy_app` [word](#word) to every log line.
|
||||
Then the following query removes all the logs from the buggy app, allowing us to pay attention to the real errors:
|
||||
|
@ -93,8 +116,10 @@ Then the following query removes all the logs from the buggy app, allowing us pa
|
|||
_time:5m error NOT buggy_app
|
||||
```
|
||||
|
||||
This query uses `NOT` [operator](#logical-filter) for removing log lines from the buggy app. The `NOT` operator is used frequently, so it can be substituted with `!` char.
|
||||
So the following query is equivalent to the previous one:
|
||||
This query uses `NOT` [operator](#logical-filter) for removing log lines from the buggy app. The `NOT` operator is used frequently, so it can be substituted with `!` char
|
||||
(the `!` char is used instead of `-` char as a shorthand for `NOT` operator because it nicely combines with [`=`](https://docs.victoriametrics.com/victorialogs/logsql/#exact-filter)
|
||||
and [`~`](https://docs.victoriametrics.com/victorialogs/logsql/#regexp-filter) filters like `!=` and `!~`).
|
||||
The following query is equivalent to the previous one:
|
||||
|
||||
```logsql
|
||||
_time:5m error !buggy_app
|
||||
|
@ -113,17 +138,15 @@ This query can be rewritten to more clear query with the `OR` [operator](#logica
|
|||
_time:5m error !(buggy_app OR foobar)
|
||||
```
|
||||
|
||||
Note that the parentheses are required here, since otherwise the query won't return the expected results.
|
||||
The query `error !buggy_app OR foobar` is interpreted as `(error AND NOT buggy_app) OR foobar`. This query may return error logs
|
||||
from the buggy app if they contain `foobar` [word](#word). This query also continues returning all the error logs from the second buggy app.
|
||||
This is because of different priorities for `NOT`, `AND` and `OR` operators.
|
||||
Read [these docs](#logical-filter) for more details. There is no need in remembering all these priority rules -
|
||||
just wrap the needed query parts into explicit parentheses if you aren't sure in priority rules.
|
||||
The parentheses are **required** here, since otherwise the query won't return the expected results.
|
||||
The query `error !buggy_app OR foobar` is interpreted as `(error AND NOT buggy_app) OR foobar` according to [priorities for AND, OR and NOT operators](#logical-filter).
|
||||
This query returns logs with the `foobar` [word](#word), even if they do not contain the `error` word or contain the `buggy_app` word.
|
||||
So it is recommended to wrap the needed query parts into explicit parentheses if you are unsure about the priority rules.
|
||||
As an additional bonus, explicit parentheses make queries easier to read and maintain.
|
||||
|
||||
Queries above assume that the `error` [word](#word) is stored in the [log message](https://docs.victoriametrics.com/victorialogs/keyconcepts/#message-field).
|
||||
This word can be stored in other [field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model) such as `log.level`.
|
||||
How to select error logs in this case? Just add the `log.level:` prefix in front of the `error` word:
|
||||
If this word is stored in another [field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model) such as `log.level`, then add the `log.level:` prefix
|
||||
in front of the `error` word:
|
||||
|
||||
```logsql
|
||||
_time:5m log.level:error !(buggy_app OR foobar)
|
||||
|
@ -158,8 +181,16 @@ If the `app` field is associated with the log stream, then the query above can b
|
|||
_time:5m log.level:error _stream:{app!~"buggy_app|foobar"}
|
||||
```
|
||||
|
||||
This query completely skips scanning for logs from `buggy_app` and `foobar` apps, thus significantly reducing disk read IO and CPU time
|
||||
needed for performing the query.
|
||||
This query skips scanning for [log messages](https://docs.victoriametrics.com/victorialogs/keyconcepts/#message-field) from `buggy_app` and `foobar` apps.
|
||||
It inspects only `log.level` and [`_stream`](https://docs.victoriametrics.com/victorialogs/keyconcepts/#stream-fields) labels.
|
||||
This significantly reduces disk read IO and CPU time needed for performing the query.
|
||||
|
||||
LogsQL also provides [functions for statistics calculation](#stats-pipe) over the selected logs. For example, the following query returns the number of logs
|
||||
with the `error` word for the last 5 minutes:
|
||||
|
||||
```logsql
|
||||
_time:5m error | stats count() logs_with_error
|
||||
```
|
||||
|
||||
Finally, it is recommended to read [performance tips](#performance-tips).
|
||||
|
||||
|
@ -177,13 +208,16 @@ These words are taken into account by full-text search filters such as
|
|||
|
||||
#### Query syntax
|
||||
|
||||
LogsQL query must contain [filters](#filters) for selecting the matching logs. At least a single filter is required.
|
||||
LogsQL query must contain at least a single [filter](#filters) for selecting the matching logs.
|
||||
For example, the following query selects all the logs for the last 5 minutes by using [`_time` filter](#time-filter):
|
||||
|
||||
```logsql
|
||||
_time:5m
|
||||
```
|
||||
|
||||
Tip: try [`*` filter](https://docs.victoriametrics.com/victorialogs/logsql/#any-value-filter), which selects all the logs stored in VictoriaLogs.
|
||||
Do not worry - this doesn't crash VictoriaLogs, even if it contains trillions of logs. In the worst case it will return
|
||||
|
||||
In addition to filters, a LogsQL query may contain an arbitrary mix of optional actions for processing the selected logs. These actions are delimited by `|` and are known as [`pipes`](#pipes).
|
||||
For example, the following query uses [`stats` pipe](#stats-pipe) for returning the number of [log messages](https://docs.victoriametrics.com/victorialogs/keyconcepts/#message-field)
|
||||
with the `error` [word](#word) for the last 5 minutes:
|
||||
|
@ -2492,3 +2526,5 @@ Internally duration values are converted into nanoseconds.
|
|||
This rule doesn't apply to [time filter](#time-filter) and [stream filter](#stream-filter), which can be put at any place of the query.
|
||||
- Move more specific filters, which match a lower number of log entries, to the beginning of the query.
|
||||
This rule doesn't apply to [time filter](#time-filter) and [stream filter](#stream-filter), which can be put at any place of the query.
|
||||
- If the selected logs are passed to [pipes](#pipes) for further transformations and statistics calculations, then it is recommended
|
||||
to reduce the number of selected logs by using more specific [filters](#filters), which return a lower number of logs to process by [pipes](#pipes).
|
||||
|
|
|
@ -43,8 +43,8 @@ For example, the following query returns all the log entries with the `error` wo
|
|||
curl http://localhost:9428/select/logsql/query -d 'query=error'
|
||||
```
|
||||
|
||||
The response by default contains all the [log fields](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model).
|
||||
See [how to query specific fields](https://docs.victoriametrics.com/victorialogs/logsql/#querying-specific-fields).
|
||||
The response by default contains all the [fields](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model) for the selected logs.
|
||||
Use [`fields` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#fields-pipe) for selecting only the needed fields.
|
||||
|
||||
The `query` argument can be passed either in the request url itself (aka HTTP GET request) or via request body
|
||||
with the `x-www-form-urlencoded` encoding (aka HTTP POST request). The HTTP POST is useful for sending long queries
|
||||
|
@ -56,7 +56,8 @@ or similar tools.
|
|||
|
||||
By default the `/select/logsql/query` returns all the log entries matching the given `query`. The response size can be limited in the following ways:
|
||||
|
||||
- By closing the response stream at any time. In this case VictoriaLogs stops query execution and frees all the resources occupied by the request.
|
||||
- By closing the response stream at any time. VictoriaLogs stops query execution and frees all the resources occupied by the request as soon as it detects a closed client connection.
|
||||
So it is safe to run [`*` query](https://docs.victoriametrics.com/victorialogs/logsql/#any-value-filter), which selects all the logs, even if trillions of logs are stored in VictoriaLogs.
|
||||
- By specifying the maximum number of log entries, which can be returned in the response via `limit` query arg. For example, the following request returns
|
||||
up to 10 matching log entries:
|
||||
```sh
|
||||
|
@ -68,7 +69,7 @@ By default the `/select/logsql/query` returns all the log entries matching the g
|
|||
```
|
||||
- By adding [`_time` filter](https://docs.victoriametrics.com/victorialogs/logsql/#time-filter). The time range for the query can be specified via optional
|
||||
`start` and `end` query args formatted according to [these docs](https://docs.victoriametrics.com/single-server-victoriametrics/#timestamp-formats).
|
||||
- By adding other [filters](https://docs.victoriametrics.com/victorialogs/logsql/#filters) to the query.
|
||||
- By adding more specific [filters](https://docs.victoriametrics.com/victorialogs/logsql/#filters) to the query, which select a lower number of logs.
|
||||
|
||||
The `/select/logsql/query` endpoint returns [a stream of JSON lines](https://jsonlines.org/),
|
||||
where each line contains a JSON-encoded log entry in the form `{"field1":"value1",...,"fieldN":"valueN"}`.
|
||||
|
@ -79,18 +80,18 @@ Example response:
|
|||
{"_msg":"some other error","_stream":"{}","_time":"2023-01-01T13:32:15Z"}
|
||||
```
|
||||
|
||||
The matching lines are sent to the response stream as soon as they are found in VictoriaLogs storage.
|
||||
Log lines are sent to the response stream as soon as they are found in VictoriaLogs storage.
|
||||
This means that the returned response may contain billions of lines for queries matching too many log entries.
|
||||
The response can be interrupted at any time by closing the connection to VictoriaLogs server.
|
||||
This allows post-processing the returned lines at the client side with the usual Unix commands such as `grep`, `jq`, `less`, `head`, etc.
|
||||
See [these docs](#command-line) for more details.
|
||||
This allows post-processing the returned lines at the client side with the usual Unix commands such as `grep`, `jq`, `less`, `head`, etc.,
|
||||
without worrying about resource usage at VictoriaLogs side. See [these docs](#command-line) for more details.
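
As a rough illustration of client-side post-processing, the sketch below filters JSON lines with `grep`. It does not contact VictoriaLogs: `sample_stream` is a hypothetical stand-in for the response stream, its first log entry is made up for this example, and `filter_phrase` is an illustrative helper rather than anything provided by VictoriaLogs.

```sh
# Hypothetical stand-in for the response stream.
# In practice the lines would come from curl querying /select/logsql/query.
sample_stream() {
  printf '%s\n' \
    '{"_msg":"cannot open file a.txt","_stream":"{}","_time":"2023-01-01T13:32:13Z"}' \
    '{"_msg":"some other error","_stream":"{}","_time":"2023-01-01T13:32:15Z"}'
}

# Keep only the lines containing the given fixed string (no regexp matching).
filter_phrase() {
  grep -F -- "$1"
}

sample_stream | filter_phrase 'cannot open file'
```

With a real server, `sample_stream` would be replaced by the `curl` command, and the rest of the pipeline would stay the same.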
|
||||
|
||||
The returned lines aren't sorted, since sorting disables the ability to send matching log entries to response stream as soon as they are found.
|
||||
Query results can be sorted either at VictoriaLogs side according [to these docs](https://docs.victoriametrics.com/victorialogs/logsql/#sort-pipe)
|
||||
The returned lines aren't sorted by default, since sorting disables the ability to send matching log entries to response stream as soon as they are found.
|
||||
Query results can be sorted either at VictoriaLogs side via [`sort` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#sort-pipe)
|
||||
or at client side with the usual `sort` command according to [these docs](#command-line).
|
||||
|
||||
By default the `(AccountID=0, ProjectID=0)` [tenant](https://docs.victoriametrics.com/victorialogs/#multitenancy) is queried.
|
||||
If you need querying other tenant, then specify the needed tenant via http request headers. For example, the following query searches
|
||||
If you need to query another tenant, then specify it via `AccountID` and `ProjectID` http request headers. For example, the following query searches
|
||||
for log messages at `(AccountID=12, ProjectID=34)` tenant:
|
||||
|
||||
```sh
|
||||
|
@ -100,9 +101,15 @@ curl http://localhost:9428/select/logsql/query -H 'AccountID: 12' -H 'ProjectID:
|
|||
The number of requests to `/select/logsql/query` can be [monitored](https://docs.victoriametrics.com/victorialogs/#monitoring)
|
||||
with `vl_http_requests_total{path="/select/logsql/query"}` metric.
|
||||
|
||||
See also:
|
||||
|
||||
- [Querying hits stats](#querying-hits-stats)
|
||||
- [Querying streams](#querying-streams)
|
||||
- [HTTP API](#http-api)
|
||||
- [Querying stream field names](#querying-stream-field-names)
|
||||
- [Querying stream field values](#querying-stream-field-values)
|
||||
- [Querying field names](#querying-field-names)
|
||||
- [Querying field values](#querying-field-values)
|
||||
|
||||
|
||||
### Querying hits stats
|
||||
|
||||
|
@ -454,32 +461,25 @@ There are three modes of displaying query results:
|
|||
- `Table` - displays query results as a table.
|
||||
- `JSON` - displays raw JSON response from [HTTP API](#http-api).
|
||||
|
||||
This is the first version that has minimal functionality. It comes with the following limitations:
|
||||
|
||||
- The number of query results is always limited to 1000 lines. Iteratively add
|
||||
more specific [filters](https://docs.victoriametrics.com/victorialogs/logsql/#filters) to the query
|
||||
in order to get full response with less than 1000 lines.
|
||||
- Queries are always executed against [tenant](https://docs.victoriametrics.com/victorialogs/#multitenancy) `0`.
|
||||
|
||||
These limitations will be removed in future versions.
|
||||
|
||||
To get around the current limitations, you can use an alternative - the [command line interface](#command-line).
|
||||
This is the first version that has minimal functionality and may contain bugs.
|
||||
It is recommended to try the [command line interface](#command-line), which has no known bugs :)
|
||||
|
||||
## Command-line
|
||||
|
||||
VictoriaLogs integrates well with `curl` and other command-line tools during querying because of the following features:
|
||||
|
||||
- VictoriaLogs sends the matching log entries to the response stream as soon as they are found.
|
||||
This allows forwarding the response stream to arbitrary [Unix pipes](https://en.wikipedia.org/wiki/Pipeline_(Unix)).
|
||||
- VictoriaLogs automatically adjusts query execution speed to the speed of the client, which reads the response stream.
|
||||
- Matching log entries are sent to the response stream as soon as they are found.
|
||||
This allows forwarding the response stream to arbitrary [Unix pipes](https://en.wikipedia.org/wiki/Pipeline_(Unix))
|
||||
without waiting until the response finishes.
|
||||
- Query execution speed is automatically adjusted to the speed of the client, which reads the response stream.
|
||||
For example, if the response stream is piped to `less` command, then the query is suspended
|
||||
until the `less` command reads the next block from the response stream.
|
||||
- VictoriaLogs automatically cancels query execution when the client closes the response stream.
|
||||
- Query is automatically canceled when the client closes the response stream.
|
||||
For example, if the query response is piped to `head` command, then VictoriaLogs stops executing the query
|
||||
when the `head` command closes the response stream.
|
||||
|
||||
These features allow executing queries at command-line interface, which potentially select billions of rows,
|
||||
without the risk of high resource usage (CPU, RAM, disk IO) at VictoriaLogs server.
|
||||
without the risk of high resource usage (CPU, RAM, disk IO) at VictoriaLogs.
|
||||
|
||||
For example, the following query can return a very large number of matching log entries (e.g. billions) if VictoriaLogs contains
|
||||
many log messages with the `error` [word](https://docs.victoriametrics.com/victorialogs/logsql/#word):
|
||||
|
@ -488,8 +488,8 @@ many log messages with the `error` [word](https://docs.victoriametrics.com/victo
|
|||
curl http://localhost:9428/select/logsql/query -d 'query=error'
|
||||
```
|
||||
|
||||
If the command returns "never-ending" response, then just press `ctrl+C` at any time in order to cancel the query.
|
||||
VictoriaLogs notices that the response stream is closed, so it cancels the query and instantly stops consuming CPU, RAM and disk IO for this query.
|
||||
If the command above returns a "never-ending" response, then just press `ctrl+C` at any time in order to cancel the query.
|
||||
VictoriaLogs notices that the response stream is closed, so it cancels the query and stops consuming CPU, RAM and disk IO for this query.
|
||||
|
||||
Then just use `head` command for investigating the returned log messages and narrowing down the query:
|
||||
|
||||
|
@ -500,6 +500,12 @@ curl http://localhost:9428/select/logsql/query -d 'query=error' | head -10
|
|||
The `head -10` command reads only the first 10 log messages from the response and then closes the response stream.
|
||||
This automatically cancels the query at VictoriaLogs side, so it stops consuming CPU, RAM and disk IO resources.
|
||||
|
||||
Alternatively, you can limit the number of returned logs at VictoriaLogs side via [`limit` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#limit-pipe):
|
||||
|
||||
```sh
|
||||
curl http://localhost:9428/select/logsql/query -d 'query=error | limit 10'
|
||||
```
|
||||
|
||||
Sometimes it may be more convenient to use `less` command instead of `head` during the investigation of the returned response:
|
||||
|
||||
```sh
|
||||
|
@ -509,7 +515,7 @@ curl http://localhost:9428/select/logsql/query -d 'query=error' | less
|
|||
The `less` command reads the response stream on demand, when the user scrolls down the output.
|
||||
VictoriaLogs suspends query execution when `less` stops reading the response stream.
|
||||
It doesn't consume CPU and disk IO resources during this time. It resumes query execution
|
||||
when the `less` continues reading the response stream.
|
||||
after `less` continues reading the response stream.
|
||||
|
||||
Suppose that the initial investigation of the returned query results helped determine that the needed log messages contain
|
||||
`cannot open file` [phrase](https://docs.victoriametrics.com/victorialogs/logsql/#phrase-filter).
|
||||
|
@ -543,7 +549,13 @@ See [these docs](https://docs.victoriametrics.com/victorialogs/logsql/#stream-fi
|
|||
[these docs](https://docs.victoriametrics.com/victorialogs/logsql/#time-filter) about `_time` filter
|
||||
and [these docs](https://docs.victoriametrics.com/victorialogs/logsql/#logical-filter) about `AND` operator.
|
||||
|
||||
The following example shows how to sort query results by the [`_time` field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#time-field):
|
||||
Alternatively, you can count the number of matching logs at VictoriaLogs side with [`stats` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#stats-pipe):
|
||||
|
||||
```sh
|
||||
curl http://localhost:9428/select/logsql/query -d 'query=_stream:{app="nginx"} AND _time:5m AND error | stats count() logs_with_error'
|
||||
```
|
||||
|
||||
The following example shows how to sort query results by the [`_time` field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#time-field) with traditional Unix tools:
|
||||
|
||||
```sh
|
||||
curl http://localhost:9428/select/logsql/query -d 'query=error' | jq -r '._time + " " + ._msg' | sort | less
|
||||
|
@ -558,8 +570,14 @@ can take non-trivial amounts of time if the `query` returns too many results. Th
|
|||
before sorting the results. See [these tips](https://docs.victoriametrics.com/victorialogs/logsql/#performance-tips)
|
||||
on how to narrow down query results.
|
||||
|
||||
Alternatively, sorting of matching logs can be performed at VictoriaLogs side via [`sort` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#sort-pipe):
|
||||
|
||||
```sh
|
||||
curl http://localhost:9428/select/logsql/query -d 'query=error | sort by (_time)' | less
|
||||
```
|
||||
|
||||
The following example calculates stats on the number of log messages received during the last 5 minutes
|
||||
grouped by `log.level` [field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model):
|
||||
grouped by `log.level` [field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model) with traditional Unix tools:
|
||||
|
||||
```sh
|
||||
curl http://localhost:9428/select/logsql/query -d 'query=_time:5m log.level:*' | jq -r '."log.level"' | sort | uniq -c
|
||||
|
@ -569,6 +587,12 @@ The query selects all the log messages with non-empty `log.level` field via ["an
|
|||
then pipes them to `jq` command, which extracts the `log.level` field value from the returned JSON stream, then the extracted `log.level` values
|
||||
are sorted with `sort` command and, finally, they are passed to `uniq -c` command for calculating the needed stats.
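
The extract, sort and count steps described above can be tried locally without a running VictoriaLogs instance. The following is a minimal sketch under stated assumptions: `sample_response` prints made-up JSON lines, `count_by_level` is a hypothetical helper name, and `sed` stands in for `jq` only so that the sketch has no extra dependencies. The `sed` expression assumes that field values contain no escaped quotes, so `jq` remains the better choice for real data.

```sh
# Hypothetical sample of JSON lines with a log.level field (made-up data).
sample_response() {
  printf '%s\n' \
    '{"_msg":"disk full","log.level":"error"}' \
    '{"_msg":"started","log.level":"info"}' \
    '{"_msg":"timeout","log.level":"error"}'
}

# Extract the log.level value from every line, sort the values,
# count identical adjacent values and print them as "<level> <count>".
count_by_level() {
  sed -n 's/.*"log\.level":"\([^"]*\)".*/\1/p' | sort | uniq -c | awk '{print $2, $1}'
}

sample_response | count_by_level
```

The `sort` step is needed because `uniq -c` counts only adjacent identical lines.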
|
||||
|
||||
Alternatively, all the stats calculations above can be performed at VictoriaLogs side via [`stats by(...)`](https://docs.victoriametrics.com/victorialogs/logsql/#stats-by-fields):
|
||||
|
||||
```sh
|
||||
curl http://localhost:9428/select/logsql/query -d 'query=_time:5m log.level:* | stats by (log.level) count() matching_logs'
|
||||
```
|
||||
|
||||
See also:
|
||||
|
||||
- [Key concepts](https://docs.victoriametrics.com/victorialogs/keyconcepts/).
|
||||
|
|