LogsQL examples
How to select recently ingested logs?
Run the following query:
_time:5m
It returns logs over the last 5 minutes by using the `_time` filter.
The logs are returned in arbitrary order for performance reasons. Add the `sort` pipe to the query if you need to sort the returned logs by some field (usually the `_time` field):
_time:5m | sort by (_time)
If the number of returned logs is too big, it can be limited with the `limit` pipe. For example, the following query returns the 10 most recent logs ingested during the last 5 minutes:
_time:5m | sort by (_time desc) | limit 10
How to select logs with the given word in log message?
Just put the needed word in the query. For example, the following query returns all the logs with the `error` word in the log message:
error
If the number of returned logs is too big, then add the `_time` filter to limit the time range for the selected logs. For example, the following query returns logs with the `error` word over the last hour:
error _time:1h
If the number of returned logs is still too big, then consider adding more specific filters to the query. For example, the following query selects logs with the `error` word, which do not contain the `kubernetes` word, over the last hour:
error !kubernetes _time:1h
The logs are returned in arbitrary order for performance reasons. Add the `sort` pipe to sort logs by the needed fields. For example, the following query sorts the selected logs by the `_time` field:
error _time:1h | sort by (_time)
See also:
- How to select logs with all the given words in log message?
- How to select logs with some of the given words in log message?
- How to skip logs with the given word in log message?
- Filtering by phrase
- Filtering by prefix
- Filtering by regular expression
- Filtering by substring
How to skip logs with the given word in log message?
Use the `NOT` logical filter. For example, the following query returns all the logs without the `INFO` word in the log message:
!INFO
If the number of returned logs is too big, then add the `_time` filter to limit the time range for the selected logs. For example, the following query returns matching logs over the last hour:
!INFO _time:1h
If the number of returned logs is still too big, then consider adding more specific filters to the query. For example, the following query selects logs without the `INFO` word, which contain the `error` word, over the last hour:
!INFO error _time:1h
The logs are returned in arbitrary order for performance reasons. Add the `sort` pipe to sort logs by the needed fields. For example, the following query sorts the selected logs by the `_time` field:
!INFO _time:1h | sort by (_time)
See also:
- How to select logs with all the given words in log message?
- How to select logs with some of the given words in log message?
- Filtering by phrase
- Filtering by prefix
- Filtering by regular expression
- Filtering by substring
How to select logs with all the given words in log message?
Just enumerate the needed words in the query, delimiting them with whitespace. For example, the following query selects logs containing both `error` and `kubernetes` words in the log message:
error kubernetes
This query uses the `AND` logical filter.
If the number of returned logs is too big, then add the `_time` filter to limit the time range for the selected logs. For example, the following query returns matching logs over the last hour:
error kubernetes _time:1h
If the number of returned logs is still too big, then consider adding more specific filters to the query. For example, the following query selects logs with `error` and `kubernetes` words from log streams containing the `container="my-app"` field, over the last hour:
error kubernetes _stream:{container="my-app"} _time:1h
The logs are returned in arbitrary order for performance reasons. Add the `sort` pipe to sort logs by the needed fields. For example, the following query sorts the selected logs by the `_time` field:
error kubernetes _time:1h | sort by (_time)
See also:
- How to select logs with some of the given words in log message?
- How to skip logs with the given word in log message?
- Filtering by phrase
- Filtering by prefix
- Filtering by regular expression
- Filtering by substring
How to select logs with some of the given words in log message?
Put the needed words into `(...)`, delimiting them with `or`. For example, the following query selects logs with `error`, `ERROR` or `Error` words in the log message:
(error or ERROR or Error)
This query uses the `OR` logical filter.
If the number of returned logs is too big, then add the `_time` filter to limit the time range for the selected logs. For example, the following query returns matching logs over the last hour:
(error or ERROR or Error) _time:1h
If the number of returned logs is still too big, then consider adding more specific filters to the query. For example, the following query selects logs with `error`, `ERROR` or `Error` words, which do not contain the `kubernetes` word, over the last hour:
(error or ERROR or Error) !kubernetes _time:1h
The logs are returned in arbitrary order for performance reasons. Add the `sort` pipe to sort logs by the needed fields. For example, the following query sorts the selected logs by the `_time` field:
(error or ERROR or Error) _time:1h | sort by (_time)
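The word filters shown so far combine mechanically, so a query string can be assembled in code. The following Python sketch only illustrates that composition: `build_word_query` and its parameters are hypothetical names, not part of any VictoriaLogs client, and the helper assumes every word is a plain alphanumeric token that needs no quoting.

```python
def build_word_query(all_of=(), any_of=(), none_of=(), time_range=None):
    """Compose a LogsQL filter string from plain words (hypothetical helper)."""
    for word in (*all_of, *any_of, *none_of):
        # Assumption: only simple tokens are supported; anything else would need quoting.
        if not word.isalnum():
            raise ValueError(f"unsupported word: {word!r}")
    parts = list(all_of)                               # whitespace-delimited words are ANDed
    if any_of:
        parts.append("(" + " or ".join(any_of) + ")")  # OR group in parentheses
    parts.extend("!" + word for word in none_of)       # NOT filter
    if time_range:
        parts.append("_time:" + time_range)            # time range filter
    return " ".join(parts)


print(build_word_query(any_of=["error", "ERROR", "Error"],
                       none_of=["kubernetes"], time_range="1h"))
```

The printed string is the same as the `(error or ERROR or Error) !kubernetes _time:1h` query from this section.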
See also:
- How to select logs with all the given words in log message?
- How to skip logs with the given word in log message?
- Filtering by phrase
- Filtering by prefix
- Filtering by regular expression
- Filtering by substring
How to select logs from the given application instance?
Make sure the application is properly configured with stream-level log fields.
Then just use the `_stream` filter for selecting logs for the given application instance. For example, if the application contains `job="app-42"` and `instance="host-123:5678"` stream fields, then the following query selects all the logs from this application:
_stream:{job="app-42",instance="host-123:5678"}
If the number of returned logs is too big, it is recommended to add the `_time` filter to the query in order to reduce the number of matching logs. For example, the following query returns logs for the given application for the last day:
_stream:{job="app-42",instance="host-123:5678"} _time:1d
If the number of returned logs is still too big, then consider adding more specific filters to the query. For example, the following query selects logs from the given log stream, which contain the `error` word in the log message, over the last day:
_stream:{job="app-42",instance="host-123:5678"} error _time:1d
The logs are returned in arbitrary order for performance reasons. Use the `sort` pipe to sort the returned logs by the needed fields. For example, the following query sorts the selected logs by `_time`:
_stream:{job="app-42",instance="host-123:5678"} _time:1d | sort by (_time)
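When stream fields come from configuration, the `_stream` filter can be generated instead of typed by hand. This is a minimal Python sketch under stated assumptions: `stream_filter` is a hypothetical helper, and it refuses values containing double quotes or backslashes because the escaping rules are not covered here.

```python
def stream_filter(labels):
    """Build a `_stream:{...}` filter from a dict of stream fields (hypothetical helper)."""
    pairs = []
    for name, value in labels.items():
        # Assumption: values contain no double quotes or backslashes, so no escaping is needed.
        if '"' in value or "\\" in value:
            raise ValueError(f"unsupported value for {name}: {value!r}")
        pairs.append(f'{name}="{value}"')
    return "_stream:{" + ",".join(pairs) + "}"


query = stream_filter({"job": "app-42", "instance": "host-123:5678"}) + " _time:1d | sort by (_time)"
print(query)
```

The resulting string matches the last query of this section.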
See also:
- How to determine applications with the most logs?
- How to skip logs with the given word in log message?
How to count the number of matching logs?
Use the `count()` stats function. For example, the following query returns the number of results returned by `your_query_here`:
your_query_here | count()
How to determine applications with the most logs?
Run the following query:
_time:5m | stats by (_stream) count() as logs | sort by (logs desc) | limit 10
This query returns top 10 application instances (aka log streams) with the most logs over the last 5 minutes.
This query uses the following LogsQL features:
- `_time` filter for selecting logs on the given time range (5 minutes in the query above).
- `stats` pipe for calculating the number of logs per each `_stream`. The `count` stats function is used for calculating the needed stats.
- `sort` pipe for sorting the stats by the `logs` field in descending order.
- `limit` pipe for limiting the number of returned results to 10.
This query can be simplified into the following one, which uses the `top` pipe:
_time:5m | top 10 by (_stream)
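To make the count, sort and limit steps concrete, the same computation can be reproduced on an in-memory list. The Python sketch below is only an illustration of the grouping logic; the `streams` sample and the `top_streams` helper are hypothetical and do not come from VictoriaLogs.

```python
from collections import Counter

# Hypothetical sample: the _stream value of each log entry selected by `_time:5m`.
streams = [
    '{job="app-42"}', '{job="app-7"}', '{job="app-42"}',
    '{job="app-1"}', '{job="app-42"}', '{job="app-7"}',
]


def top_streams(stream_values, n=10):
    """Count logs per stream and return the n streams with the most logs."""
    return Counter(stream_values).most_common(n)


for stream, logs in top_streams(streams):
    print(stream, logs)
```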
See also:
- How to filter out data after stats calculation?
- How to calculate the number of logs per the given interval?
- How to select logs from the given application instance?
How to parse JSON inside log message?
From the performance and resource usage point of view, it is better to avoid storing JSON inside the log message. It is recommended to store individual JSON fields as log fields instead, according to the VictoriaLogs data model.
If you have to store JSON inside the log message or inside any other log field, then the stored JSON can be parsed at query time via the `unpack_json` pipe.
For example, the following query unpacks JSON from the `_msg` field across all the logs for the last 5 minutes:
_time:5m | unpack_json
If you need to parse a JSON array, then take a look at the `unroll` pipe.
How to extract some data from text log message?
Use the `extract` or `extract_regexp` pipe. For example, the following query extracts `username` and `user_id` fields from a text log message:
_time:5m | extract "username=<username>, user_id=<user_id>,"
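To see what such a pattern pulls out of a message, it helps to try a close equivalent outside the database. The Python sketch below approximates the pattern with a regular expression. It assumes each placeholder captures the shortest text up to the next literal part, which may differ from the pipe's exact behavior; the sample message and `extract_fields` are hypothetical.

```python
import re

# Regex approximation of the pattern "username=<username>, user_id=<user_id>,".
# Assumption: each placeholder captures the shortest text up to the next literal part.
PATTERN = re.compile(r"username=(?P<username>.*?), user_id=(?P<user_id>.*?),")


def extract_fields(msg):
    """Return the extracted fields, or an empty dict when the pattern does not match."""
    match = PATTERN.search(msg)
    return match.groupdict() if match else {}


# Hypothetical log message used only for illustration.
print(extract_fields("login ok: username=alice, user_id=42, ip=10.0.0.1"))
```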
How to filter out data after stats calculation?
Use the `filter` pipe. For example, the following query returns only log streams with more than 1000 logs over the last 5 minutes:
_time:5m | stats by (_stream) count() rows | filter rows:>1000
How to calculate the number of logs per the given interval?
Use `stats` by time bucket. For example, the following query returns the per-hour number of logs with the `error` word for the last day:
_time:1d error | stats by (_time:1h) count() rows | sort by (_time)
This query uses the `sort` pipe in order to sort per-hour stats by `_time`.
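Time bucketing amounts to truncating every timestamp to the start of its hour and counting the entries per truncated value. The Python sketch below shows that idea on a hypothetical list of timestamps; `per_hour_counts` is an illustrative helper, not a VictoriaLogs API.

```python
from collections import Counter
from datetime import datetime


def per_hour_counts(timestamps):
    """Count timestamps per hourly bucket, returned in ascending bucket order."""
    buckets = Counter(ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps)
    return sorted(buckets.items())


# Hypothetical timestamps of logs that contain the `error` word.
sample = [
    datetime(2024, 6, 1, 10, 5),
    datetime(2024, 6, 1, 10, 59),
    datetime(2024, 6, 1, 11, 0),
]
for bucket, rows in per_hour_counts(sample):
    print(bucket.isoformat(), rows)
```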
How to calculate the number of logs per IPv4 subnetwork?
Use `stats` by IPv4 bucket. For example, the following query returns the top 10 `/24` subnetworks with the biggest number of logs for the last 5 minutes:
_time:5m | stats by (ip:/24) count() rows | sort by (rows desc) limit 10
This query uses the `sort` pipe in order to sort per-subnetwork stats by descending number of rows and to limit the result to the top 10 rows.
The query assumes the original logs have the `ip` field with the IPv4 address.
If the IPv4 address is located inside the log message or any other text field, then it can be extracted with the `extract` or `extract_regexp` pipes. For example, the following query extracts the IPv4 address from the `_msg` field and then returns the top 10 `/16` subnetworks with the biggest number of logs for the last 5 minutes:
_time:5m | extract_regexp "(?P<ip>([0-9]+[.]){3}[0-9]+)" | stats by (ip:/16) count() rows | sort by (rows desc) limit 10
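The extract-then-bucket steps can be reproduced offline to check what the regular expression matches and how addresses fall into subnetworks. This Python sketch reuses the regular expression from the query and the standard `ipaddress` module. The sample messages and `top_subnets` are hypothetical, and the way subnetworks are printed here is not necessarily how VictoriaLogs formats them.

```python
import ipaddress
import re
from collections import Counter

# The same regular expression as in the query above.
IP_RE = re.compile(r"(?P<ip>([0-9]+[.]){3}[0-9]+)")


def top_subnets(messages, prefix_len=16, n=10):
    """Extract the first IPv4 address from each message and count messages per subnet."""
    counts = Counter()
    for msg in messages:
        match = IP_RE.search(msg)
        if not match:
            continue  # no IPv4-looking text in this message
        try:
            # strict=False masks the host bits, e.g. 10.1.2.3/16 -> 10.1.0.0/16
            net = ipaddress.ip_network(f"{match['ip']}/{prefix_len}", strict=False)
        except ValueError:
            continue  # matched text such as 999.1.1.1 is not a valid address
        counts[str(net)] += 1
    return counts.most_common(n)


# Hypothetical log messages used only for illustration.
logs = ["GET / from 10.1.2.3", "GET /a from 10.1.9.9", "GET /b from 192.168.0.7"]
print(top_subnets(logs))
```

Note that the regular expression also matches strings that are not valid addresses, which is why the sketch skips values rejected by `ipaddress`.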
How to calculate the number of logs per every value of the given field?
Use `stats` by field. For example, the following query calculates the number of logs per `level` field for logs over the last 5 minutes:
_time:5m | stats by (level) count() rows
An alternative is to use the `field_values` pipe:
_time:5m | field_values level
How to get unique values for the given field?
Use the `uniq` pipe. For example, the following query returns unique values for the `ip` field over logs for the last 5 minutes:
_time:5m | uniq by (ip)
How to get unique sets of values for the given fields?
Use the `uniq` pipe. For example, the following query returns unique sets for (`host`, `path`) fields over logs for the last 5 minutes:
_time:5m | uniq by (host, path)
How to return last N logs for the given query?
Use the `sort` pipe with limit. For example, the following query returns the last 10 logs with the `error` word in the `_msg` field over the logs for the last 5 minutes:
_time:5m error | sort by (_time desc) limit 10
It sorts the matching logs by the `_time` field in descending order and then selects the first 10 logs with the highest values for the `_time` field.
If the query is sent to the `/select/logsql/query` HTTP API, then the `limit=N` query arg can be passed to it in order to return up to `N` latest log entries. For example, the following command returns up to 10 latest log entries with the `error` word:
curl http://localhost:9428/select/logsql/query -d 'query=error' -d 'limit=10'
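The same request can be prepared from a program. The following Python sketch builds, without sending, a form-encoded POST request that mirrors the curl command. The address and the `query` and `limit` args come from that command; `build_query_request` is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_query_request(query, limit=None, base_url="http://localhost:9428"):
    """Build (but do not send) a POST request equivalent to the curl command above."""
    params = {"query": query}
    if limit is not None:
        params["limit"] = str(limit)
    body = urlencode(params).encode("utf-8")  # form-encoded body, like curl -d
    return Request(base_url + "/select/logsql/query", data=body, method="POST")


req = build_query_request("error", limit=10)
print(req.get_method(), req.full_url, req.data.decode())
# To send it, pass `req` to urllib.request.urlopen() against a running VictoriaLogs instance.
```

Building the request separately from sending it keeps the query encoding easy to inspect before any network call is made.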
How to calculate the share of error logs to the total number of logs?
Use the following query:
_time:5m | stats count() logs, count() if (error) errors | math errors / logs
This query uses the following LogsQL features:
- `_time` filter for selecting logs on the given time range (last 5 minutes in the query above).
- `stats` pipe with additional filtering for calculating the total number of logs and the number of logs with the `error` word on the selected time range.
- `math` pipe for calculating the share of logs with the `error` word compared to the total number of logs.
How to select logs for working hours and weekdays?
Use `day_range` and `week_range` filters. For example, the following query selects logs from Monday to Friday in working hours (from 08:00 to 18:00, with 18:00 itself excluded) over the last 4 weeks:
_time:4w _time:week_range[Mon, Fri] _time:day_range[08:00, 18:00)
It uses the implicit `AND` logical filter for joining multiple filters on the `_time` field.
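The combined condition can be restated as a small predicate, which is handy for checking boundary cases such as 18:00 sharp. The Python sketch below is only an illustration: `in_working_hours` is a hypothetical helper, and it assumes the timestamp is already expressed in the time zone the filters operate in.

```python
from datetime import datetime


def in_working_hours(ts):
    """True for Monday..Friday timestamps with 08:00 <= time < 18:00."""
    is_weekday = ts.weekday() < 5      # Monday is 0, Friday is 4
    is_day_time = 8 <= ts.hour < 18    # 18:00 itself is excluded, as in [08:00, 18:00)
    return is_weekday and is_day_time


print(in_working_hours(datetime(2024, 6, 3, 9, 30)))  # a Monday morning
print(in_working_hours(datetime(2024, 6, 8, 9, 30)))  # a Saturday morning
```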
How to find logs with the given phrase containing whitespace?
Use the phrase filter. For example, the following LogsQL query returns logs with the `cannot open file` phrase over the last 5 minutes:
_time:5m "cannot open file"