The change is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4477
LogsQL
LogsQL is a simple yet powerful query language for VictoriaLogs. It provides the following features:
- Full-text search across log fields. See word filter, phrase filter and prefix filter.
- Ability to combine filters into arbitrary complex logical filters.
- Ability to extract structured fields from unstructured logs at query time. See these docs.
- Ability to calculate various stats over the selected log entries. See these docs.
LogsQL tutorial
If you aren't familiar with VictoriaLogs, then start with key concepts docs.
Then follow these docs:
The simplest LogsQL query is just a word, which must be found in the log message.
For example, the following query finds all the logs with the error word:
error
This query matches logs with any timestamp, e.g. it may return logs from the previous year alongside recently ingested logs.
If the queried word clashes with LogsQL keywords, then just wrap it in quotes.
For example, the following query finds all the log messages containing the "and" word:
"and"
It is OK to wrap any word into quotes. For example:
"error"
Moreover, it is possible to wrap phrases containing multiple words in quotes. For example, the following query
finds log messages with the "error: cannot find file" phrase:
"error: cannot find file"
Usually logs from the previous year aren't as interesting as the recently ingested logs.
So it is recommended to add a time filter to the query.
For example, the following query returns logs with the error word,
which were ingested into VictoriaLogs during the last 5 minutes:
error AND _time:[now-5m,now]
This query consists of two filters joined with the AND operator:
- The filter on the error word.
- The filter on the _time field.
The AND operator means that the log entry must match both filters in order to be selected.
A typical LogsQL query consists of multiple filters joined with the AND operator. It may be tiresome to type and then read all these AND words.
So LogsQL allows omitting AND words. For example, the following query is equivalent to the query above:
error _time:[now-5m,now]
The query returns the following log fields by default: _msg, _stream and _time.
Logs may contain an arbitrary number of other fields. If you need some of these fields in the query results,
then just refer to them in the query with the field_name:* filter.
For example, the following query returns the host.hostname field in addition to the _msg, _stream and _time fields:
error _time:[now-5m,now] host.hostname:*
Suppose the query above selects too many rows because some buggy app pushes invalid error logs to VictoriaLogs. Suppose the app adds the buggy_app word to every log line.
Then the following query removes all the logs from the buggy app, allowing us to pay attention to the real errors:
_time:[now-5m,now] error NOT buggy_app
This query uses the NOT operator for removing log lines from the buggy app. The NOT operator is used frequently, so it can be substituted with the ! char.
So the following query is equivalent to the previous one:
_time:[now-5m,now] error !buggy_app
Suppose another buggy app starts pushing invalid error logs to VictoriaLogs and adds the foobar word to every emitted log line.
No problem: just add !foobar to the query in order to remove these buggy logs:
_time:[now-5m,now] error !buggy_app !foobar
This query can be rewritten into a clearer query with the OR operator inside parentheses:
_time:[now-5m,now] error !(buggy_app OR foobar)
Note that the parentheses are required here, since otherwise the query won't return the expected results.
The query error !buggy_app OR foobar is interpreted as (error AND NOT buggy_app) OR foobar. This query may return error logs
from the buggy app if they contain the foobar word. This query also continues returning all the error logs from the second buggy app.
This is because of different priorities for the NOT, AND and OR operators.
Read these docs for more details. There is no need to remember all these priority rules:
just wrap the needed query parts in explicit parentheses if you aren't sure about the priority rules.
As an additional bonus, explicit parentheses make queries easier to read and maintain.
The queries above assume that the error word is stored in the log message.
This word can be stored in another field such as log.level.
How to select error logs in this case? Just add the log.level: prefix in front of the error word:
_time:[now-5m,now] log.level:error !(buggy_app OR foobar)
The field name can be wrapped into quotes if it contains special chars or keywords, which may clash with LogsQL syntax. Any word also can be wrapped into quotes. So the following query is equivalent to the previous one:
"_time":[now-5m,now] "log.level":"error" !("buggy_app" OR "foobar")
What if the application identifier, such as buggy_app and foobar, is stored in the app field? Correct: just add the app: prefix in front of buggy_app and foobar:
_time:[now-5m,now] log.level:error !(app:buggy_app OR app:foobar)
The query can be simplified by moving the app: prefix outside the parentheses:
_time:[now-5m,now] log.level:error !app:(buggy_app OR foobar)
The app field uniquely identifies the application instance if a single instance runs per each unique app.
In this case it is recommended to associate the app field with log stream fields
during data ingestion. This usually improves both the compression rate
and query performance when querying the needed streams via the _stream filter.
If the app field is associated with the log stream, then the query above can be rewritten into a more performant one:
_time:[now-5m,now] log.level:error _stream:{app!~"buggy_app|foobar"}
This query completely skips scanning for logs from the buggy_app and foobar apps, thus significantly reducing disk read IO and CPU time
needed for performing the query.
Finally, it is recommended to read the performance tips.
Now you are familiar with LogsQL basics. Read query syntax if you want to continue learning LogsQL.
Key concepts
Word
LogsQL splits all the log fields into words
delimited by non-word chars such as whitespace, parens, punctuation chars, etc. For example, the foo: (bar,"тест")! string
is split into foo, bar and тест words. Words can contain arbitrary utf-8 chars.
These words are taken into account by full-text search filters such as
word filter, phrase filter and prefix filter.
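The splitting rule above can be approximated in a few lines of Python. This is only an illustrative sketch, not the VictoriaLogs tokenizer: it assumes that a word char is a Unicode letter, a digit or an underscore, which is what `\w` means for Python `str` patterns. The `split_into_words` name is hypothetical.

```python
import re


def split_into_words(text):
    """Return the words of `text`, treating every non-word char as a delimiter.

    Assumption: word chars are Unicode letters, digits and underscores.
    """
    return re.findall(r"\w+", text)


# The example string from the text above.
print(split_into_words('foo: (bar,"тест")!'))  # ['foo', 'bar', 'тест']
```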
Query syntax
A LogsQL query consists of the following parts delimited by |:
- Filters, which select log entries for further processing. This part is required in LogsQL. Other parts are optional.
- Optional stream context, which allows selecting surrounding log lines for the matching log lines.
- Optional transformations for the selected log fields. For example, additional fields can be extracted or constructed from existing fields.
- Optional post-filters for post-filtering of the selected results. For example, post-filtering can filter results based on the fields constructed by transformations.
- Optional stats transformations, which can calculate various stats across selected results.
- Optional sorting, which can sort the results by the specified fields.
- Optional limiters, which can apply various limits on the selected results.
Filters
LogsQL supports various filters for searching for log messages (see below). They can be combined into arbitrary complex queries via logical filters.
Filters are applied to the _msg field by default.
If the filter must be applied to another log field,
then its name followed by a colon must be put in front of the filter. For example, if the error word filter must be applied
to the log.level field, then use the log.level:error query.
Field names and filter args can be put into quotes if they contain special chars, which may clash with LogsQL syntax. LogsQL supports quoting via double quotes ", single quotes ' and backticks:
"some 'field':123":i('some("value")') AND `other"value'`
If in doubt, it is recommended to quote field names and filter args.
The list of LogsQL filters:
- Time filter - matches logs with _time field in the given time range
- Stream filter - matches logs, which belong to the given streams
- Word filter - matches logs with the given word
- Phrase filter - matches logs with the given phrase
- Prefix filter - matches logs with the given word prefix or phrase prefix
- Empty value filter - matches logs without the given log field
- Any value filter - matches logs with the given non-empty log field
- Exact filter - matches logs with the exact value
- Exact prefix filter - matches logs starting with the given prefix
- Multi-exact filter - matches logs with one of the specified exact values
- Case-insensitive filter - matches logs with the given case-insensitive word, phrase or prefix
- Sequence filter - matches logs with the given sequence of words or phrases
- Regexp filter - matches logs for the given regexp
- Range filter - matches logs with numeric field values in the given range
- IPv4 range filter - matches logs with ip address field values in the given range
- String range filter - matches logs with field values in the given string range
- Length range filter - matches logs with field values of the given length range
- Logical filter - allows combining other filters
Time filter
VictoriaLogs scans all the logs per each query if it doesn't contain the filter on the _time field.
It uses various optimizations in order to speed up full scan queries without the _time filter,
but such queries can be slow if the storage contains a large number of logs over a long time range. The easiest way to optimize queries
is to narrow down the search with the filter on the _time field.
For example, the following query returns log messages
ingested into VictoriaLogs during the last hour, which contain the error word:
_time:(now-1h, now) AND error
The following formats are supported for the _time filter:
- Fixed time:
  - _time:YYYY-MM-DD - matches all the log messages for the particular day. For example, _time:2023-04-25 matches all the log messages for April 25, 2023 by UTC.
  - _time:YYYY-MM - matches all the log messages for the particular month. For example, _time:2023-02 matches all the log messages for February, 2023 by UTC.
  - _time:YYYY - matches all the log messages for the particular year. For example, _time:2023 matches all the log messages for 2023 by UTC.
  - _time:YYYY-MM-DDTHH - matches all the log messages for the particular hour. For example, _time:2023-04-25T22 matches all the log messages from 22:00 to 23:00 on April 25, 2023 by UTC.
  - _time:YYYY-MM-DDTHH:MM - matches all the log messages for the particular minute. For example, _time:2023-04-25T22:45 matches all the log messages from 22:45 to 22:46 on April 25, 2023 by UTC.
  - _time:YYYY-MM-DDTHH:MM:SS - matches all the log messages for the particular second. For example, _time:2023-04-25T22:45:59 matches all the log messages from 22:45:59 to 22:46:00 on April 25, 2023 by UTC.
- Time range:
  - _time:[min_time, max_time] - matches log messages on the time range [min_time, max_time], including both min_time and max_time. The min_time and max_time can contain any format specified here. For example, _time:[2023-04-01, 2023-04-30] matches log messages for the whole April, 2023 by UTC, i.e. it is equivalent to _time:2023-04.
  - _time:[min_time, max_time) - matches log messages on the time range [min_time, max_time), not including max_time. The min_time and max_time can contain any format specified here. For example, _time:[2023-02-01, 2023-03-01) matches log messages for the whole February, 2023 by UTC, i.e. it is equivalent to _time:2023-02.
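The fixed-time formats can be read as shorthand for a half-open UTC interval that starts at the given moment and covers one unit of the last specified component. The following Python sketch illustrates this reading under that assumption. It is not how VictoriaLogs parses the filter, and it ignores time zone offsets and now-relative values. The `fixed_time_range` name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# One (strptime format, step) pair per fixed-time form.
# Years and months have variable length, so they are handled by name.
_FIXED_TIME_FORMATS = [
    ("%Y", "year"),
    ("%Y-%m", "month"),
    ("%Y-%m-%d", timedelta(days=1)),
    ("%Y-%m-%dT%H", timedelta(hours=1)),
    ("%Y-%m-%dT%H:%M", timedelta(minutes=1)),
    ("%Y-%m-%dT%H:%M:%S", timedelta(seconds=1)),
]


def fixed_time_range(value):
    """Return the (start, end) UTC pair covered by a fixed time; end is exclusive."""
    for fmt, step in _FIXED_TIME_FORMATS:
        try:
            start = datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue  # the value has a different granularity, try the next format
        if isinstance(step, timedelta):
            end = start + step
        elif step == "year":
            end = start.replace(year=start.year + 1)
        else:  # "month": roll over to January of the next year after December
            carry, month_index = divmod(start.month, 12)
            end = start.replace(year=start.year + carry, month=month_index + 1)
        return start, end
    raise ValueError(f"unsupported fixed time: {value!r}")


start, end = fixed_time_range("2023-04-25T22:45:59")
print(start.isoformat(), end.isoformat())
```

Under this reading, an inclusive upper bound such as 2023-04-30 extends to the end of that day, which is why _time:[2023-04-01, 2023-04-30] covers the same interval as _time:2023-04.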
It is possible to specify a time zone offset for all the absolute time formats by appending a +hh:mm or -hh:mm suffix.
For example, _time:2023-04-25+05:30 matches all the log messages on April 25, 2023 by India time zone,
while _time:2023-02-07:00 matches all the log messages from February, 2023 by California time zone.
Performance tips:
- It is recommended to specify the smallest possible time range during the search, since it reduces the number of log entries, which need to be scanned during the query. For example, _time:[now-1h, now] is usually faster than _time:[now-5h, now].
- While LogsQL supports an arbitrary number of _time:... filters at any level of logical filters, it is recommended to specify a single _time filter at the top level of the query.
See also:
Stream filter
VictoriaLogs provides an optimized way to select log entries, which belong to particular log streams.
This can be done via the _stream:{...} filter. The {...} may contain an arbitrary
Prometheus-compatible label selector
over fields associated with log streams.
For example, the following query selects log entries
with the app field equal to nginx:
_stream:{app="nginx"}
This query is equivalent to the following exact() query, but the upper query usually works much faster:
app:exact("nginx")
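For readers unfamiliar with Prometheus label selectors, the four matcher operators can be sketched as follows. This is a simplified model based on Prometheus semantics (regexp matchers are fully anchored, and a missing label behaves like an empty value), not the VictoriaLogs stream index. Python's `re` syntax also differs from RE2 in places. The `label_matches` helper and its arguments are hypothetical.

```python
import re


def label_matches(labels, name, op, value):
    """Check one label matcher (=, !=, =~ or !~) against a dict of stream labels."""
    actual = labels.get(name, "")  # a missing label is treated as an empty value
    if op == "=":
        return actual == value
    if op == "!=":
        return actual != value
    if op == "=~":
        return re.fullmatch(value, actual) is not None  # anchored on both sides
    if op == "!~":
        return re.fullmatch(value, actual) is None
    raise ValueError(f"unknown matcher operator: {op!r}")


# Model of _stream:{app="nginx"}
print(label_matches({"app": "nginx"}, "app", "=", "nginx"))
# Model of _stream:{app!~"buggy_app|foobar"} for a stream with app="buggy_app"
print(label_matches({"app": "buggy_app"}, "app", "!~", "buggy_app|foobar"))
```

Because regexp matchers are anchored, a stream with app="buggy_app2" still passes the app!~"buggy_app|foobar" selector in this model.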
Performance tips:
- It is recommended to use the most specific _stream:{...} filter matching the smallest number of log streams, which need to be scanned by the rest of the filters in the query.
- While LogsQL supports an arbitrary number of _stream:{...} filters at any level of logical filters, it is recommended to specify a single _stream:... filter at the top level of the query.
See also:
Word filter
The simplest LogsQL query consists of a single word to search in log messages. For example, the following query matches
log messages with the error word inside them:
error
This query matches the following log messages:
- error
- an error happened
- error: cannot open file
This query doesn't match the following log messages:
- ERROR, since the filter is case-sensitive by default. Use i(error) for this case. See these docs for details.
- multiple errors occurred, since the errors word doesn't match the error word. Use error* for this case. See these docs for details.
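The matching and non-matching examples above follow from comparing the query word against the whole words of the message. A minimal Python model of this behavior, assuming that words are runs of letters, digits and underscores (the `matches_word` name is hypothetical, and this is not the VictoriaLogs implementation):

```python
import re


def matches_word(message, word):
    """Return True if `word` equals one of the words in `message` (case-sensitive)."""
    return word in re.findall(r"\w+", message)


for message in ["error", "an error happened", "ERROR", "multiple errors occurred"]:
    print(repr(message), matches_word(message, "error"))
```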
By default the given word is searched in the _msg field.
Specify the field name in front of the word and put a colon after it
if it must be searched in the given field. For example, the following query returns log entries containing the error word in the log.level field:
log.level:error
Both the field name and the word in the query can contain arbitrary utf-8-encoded chars. For example:
поле:значение
Both the field name and the word in the query can be put inside quotes if they contain special chars, which may clash with the query syntax.
For example, the following query searches for the ip 1.2.3.45 in the field ip:remote:
"ip:remote":"1.2.3.45"
See also:
Phrase filter
If you need to search for log messages with the specific phrase inside them, then just wrap the phrase in quotes.
The phrase can contain any chars, including whitespace, punctuation, parens, etc. They are taken into account during the search.
For example, the following query matches log messages with the cannot open file phrase inside them:
"cannot open file"
This query matches the following log messages:
- ERROR: cannot open file /foo/bar/baz
- cannot open file: permission denied
This query doesn't match the following log messages:
- cannot  open file, since the number of whitespace chars between words doesn't match the number of whitespace chars in the search phrase. Use the seq("cannot", "open", "file") query instead. See these docs for details.
- open file: cannot do this, since the message doesn't contain the full phrase requested in the query. If you need matching a message with all the words listed in the query, then use the cannot AND open AND file query. See these docs for details.
- cannot open files, since the message ends with the files word instead of the file word. Use the "cannot open file"* query for this case. See these docs for details.
- Cannot open file: failure, since the Cannot word starts with a capital letter. Use i("cannot open file") for this case. See these docs for details.
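One way to reason about the cases above is to treat the phrase as a literal substring that must begin and end on a word boundary. The sketch below models that with Python lookarounds. It assumes the phrase itself starts and ends with word chars and is only an approximation of the documented behavior. The `matches_phrase` name is hypothetical.

```python
import re


def matches_phrase(message, phrase):
    """Return True if `phrase` occurs verbatim in `message` on word boundaries."""
    # No word char may directly precede or follow the literal phrase.
    pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
    return re.search(pattern, message) is not None


phrase = "cannot open file"
print(matches_phrase("ERROR: cannot open file /foo/bar/baz", phrase))  # True
print(matches_phrase("cannot open files", phrase))  # False
```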
By default the given phrase is searched in the _msg field.
Specify the field name in front of the phrase and put a colon after it
if it must be searched in the given field. For example, the following query returns log entries containing the cannot open file phrase in the event.original field:
event.original:"cannot open file"
Both the field name and the phrase can contain arbitrary utf-8-encoded chars. For example:
сообщение:"невозможно открыть файл"
The field name can be put inside quotes if it contains special chars, which may clash with the query syntax.
For example, the following query searches for the cannot open file phrase in the field some:message:
"some:message":"cannot open file"
See also:
Prefix filter
If you need to search for log messages with words / phrases containing some prefix, then just add the * char to the end of the word / phrase in the query.
For example, the following query returns log messages, which contain words with the err prefix:
err*
This query matches the following log messages:
- err: foobar
- cannot open file: error occurred
This query doesn't match the following log messages:
- Error: foobar, since the Error word starts with a capital letter. Use i(err*) for this case. See these docs for details.
- fooerror, since the fooerror word doesn't start with err. Use re("err") for this case. See these docs for details.
Prefix filter can be applied to phrases. For example, the following query matches
log messages containing phrases with the unexpected fail prefix:
"unexpected fail"*
This query matches the following log messages:
- unexpected fail: IO error
- error:unexpected failure
This query doesn't match the following log messages:
- unexpectedly failed, since unexpectedly doesn't match the unexpected word. Use unexpected* AND fail* for this case. See these docs for details.
- failed to open file: unexpected EOF, since the failed word occurs before the unexpected word. Use unexpected AND fail* for this case. See these docs for details.
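Both the word-prefix and the phrase-prefix cases above can be modeled as "the text must start on a word boundary, but may continue with further word chars". The Python sketch below illustrates that reading. It is an approximation that assumes the prefix starts with a word char, and the `matches_prefix` name is hypothetical.

```python
import re


def matches_prefix(message, prefix):
    """Return True if some word (or phrase) in `message` starts with `prefix`."""
    # Only the left edge is pinned to a word boundary; the right edge stays open.
    pattern = r"(?<!\w)" + re.escape(prefix)
    return re.search(pattern, message) is not None


print(matches_prefix("cannot open file: error occurred", "err"))  # True
print(matches_prefix("fooerror", "err"))  # False
print(matches_prefix("error:unexpected failure", "unexpected fail"))  # True
```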
By default the prefix filter is applied to the _msg field.
Specify the needed field name in front of the prefix filter
in order to apply it to the given field. For example, the following query matches the log.level field containing any word with the err prefix:
log.level:err*
If the field name contains special chars, which may clash with the query syntax, then it may be put into quotes in the query.
For example, the following query matches the log:level field containing any word with the err prefix:
"log:level":err*
Performance tips:
- Prefer using word filters and phrase filters combined via logical filter instead of prefix filter.
- Prefer moving word filters and phrase filters in front of prefix filter when using logical filter.
- See other performance tips.
See also:
Empty value filter
Sometimes it is needed to find log entries without the given log field.
This can be performed with the log_field:"" syntax. For example, the following query matches log entries without the host.hostname field:
host.hostname:""
See also:
Any value filter
Sometimes it is needed to find log entries containing any non-empty value for the given log field.
This can be performed with the log_field:* syntax. For example, the following query matches log entries with a non-empty host.hostname field:
host.hostname:*
See also:
Exact filter
The word filter and phrase filter return log messages,
which contain the given word or phrase inside them. The message may contain additional text other than the requested word or phrase. If you need to search for log messages
or log fields with the exact value, then use the exact(...) filter.
For example, the following query returns log messages with the exact value fatal error: cannot find /foo/bar:
exact("fatal error: cannot find /foo/bar")
The query doesn't match the following log messages:
- fatal error: cannot find /foo/bar/baz or some-text fatal error: cannot find /foo/bar, since they contain additional text other than the text specified in the exact() filter. Use the "fatal error: cannot find /foo/bar" query in this case. See these docs for details.
- FATAL ERROR: cannot find /foo/bar, since the exact() filter is case-sensitive. Use i("fatal error: cannot find /foo/bar") in this case. See these docs for details.
By default the exact() filter is applied to the _msg field.
Specify the field name in front of the exact() filter and put a colon after it
if it must be searched in the given field. For example, the following query returns log entries with the exact error value in the log.level field:
log.level:exact("error")
Both the field name and the phrase can contain arbitrary utf-8-encoded chars. For example:
log.уровень:exact("ошибка")
The field name can be put inside quotes if it contains special chars, which may clash with the query syntax.
For example, the following query matches the error value in the field log:level:
"log:level":exact("error")
See also:
Exact prefix filter
Sometimes it is needed to find log messages starting with some prefix. This can be done with the exact_prefix(...) filter.
For example, the following query matches log messages, which start with the Processing request prefix:
exact_prefix("Processing request")
This filter matches the following log messages:
- Processing request foobar
- Processing requests from ...
It doesn't match the following log messages:
- processing request foobar, since the log message starts with lowercase p. Use the exact_prefix("processing request") OR exact_prefix("Processing request") query in this case. See these docs for details.
- start: Processing request, since the log message doesn't start with Processing request. Use the "Processing request" query in this case. See these docs for details.
By default the exact_prefix() filter is applied to the _msg field.
Specify the field name in front of the exact_prefix() filter and put a colon after it
if it must be searched in the given field. For example, the following query returns log entries with the log.level field, which starts with the err prefix:
log.level:exact_prefix("err")
Both the field name and the phrase can contain arbitrary utf-8-encoded chars. For example:
log.уровень:exact_prefix("ошиб")
The field name can be put inside quotes if it contains special chars, which may clash with the query syntax.
For example, the following query matches log:level values starting with the err prefix:
"log:level":exact_prefix("err")
See also:
Multi-exact filter
Sometimes it is needed to locate log messages with a field containing one of the given values. This can be done with multiple exact filters
combined into a single logical filter. For example, the following query matches log messages with the log.level field
containing either error or fatal exact values:
log.level:(exact("error") OR exact("fatal"))
While this solution works OK, LogsQL provides a simpler and faster solution for this case: the in() filter.
log.level:in("error", "fatal")
It works very fast for long lists passed to in().
Future VictoriaLogs versions will allow passing arbitrary queries into the in() filter.
For example, the following query selects all the logs for the last hour for users, who visited pages with the admin word in the path
during the last day:
_time:[now-1h,now] AND user_id:in(_time:[now-1d,now] AND path:admin | fields user_id)
See the Roadmap for details.
See also:
Case-insensitive filter
Case-insensitive filter can be applied to any word, phrase or prefix by wrapping the corresponding word filter,
phrase filter or prefix filter into i(). For example, the following query returns
log messages with the error word in any case:
i(error)
The query matches the following log messages:
- unknown error happened
- ERROR: cannot read file
- Error: unknown arg
- An ErRoR occured
The query doesn't match the following log messages:
- FooError, since the FooError word has the superfluous prefix Foo. Use re("(?i)error") for this case. See these docs for details.
- too many Errors, since the Errors word has the superfluous suffix s. Use i(error*) for this case.
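The examples above are consistent with comparing whole words after case folding. A minimal Python model of i(error) under that assumption is shown below. The `matches_word_ignore_case` name is hypothetical, and real case folding in VictoriaLogs may differ for some Unicode chars.

```python
import re


def matches_word_ignore_case(message, word):
    """Return True if `message` contains `word` as a whole word in any letter case."""
    wanted = word.casefold()
    return any(token.casefold() == wanted for token in re.findall(r"\w+", message))


for message in ["An ErRoR occured", "FooError", "too many Errors"]:
    print(repr(message), matches_word_ignore_case(message, "error"))
```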
By default the i() filter is applied to the _msg field.
Specify the needed field name in front of the filter
in order to apply it to the given field. For example, the following query matches the log.level field containing the error word in any case:
log.level:i(error)
If the field name contains special chars, which may clash with the query syntax, then it may be put into quotes in the query.
For example, the following query matches the log:level field containing the error word in any case:
"log:level":i("error")
Performance tips:
- Prefer using case-sensitive filter over case-insensitive filter.
- Prefer moving word filter, phrase filter and prefix filter in front of case-insensitive filter when using logical filter.
- See other performance tips.
See also:
Sequence filter
Sometimes it is needed to find log messages
with words or phrases in a particular order. For example, if log messages with the error word followed by the open file phrase
must be found, then the following LogsQL query can be used:
seq("error", "open file")
This query matches the some error: cannot open file /foo/bar message, since the open file phrase goes after the error word.
The query doesn't match the cannot open file: error message, since the open file phrase is located in front of the error word.
If you need matching log messages with both the error word and the open file phrase, then use the error AND "open file" query. See these docs
for details.
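The ordering requirement can be illustrated by searching for each part only after the end of the previous match. The Python sketch below does this with the same word-boundary approximation as a phrase search. It assumes each part starts and ends with word chars, and the `matches_sequence` name is hypothetical.

```python
import re


def matches_sequence(message, *parts):
    """Return True if all `parts` occur in `message` in the given order."""
    position = 0
    for part in parts:
        pattern = re.compile(r"(?<!\w)" + re.escape(part) + r"(?!\w)")
        found = pattern.search(message, position)
        if found is None:
            return False
        position = found.end()  # the next part must appear after this one
    return True


print(matches_sequence("some error: cannot open file /foo/bar", "error", "open file"))  # True
print(matches_sequence("cannot open file: error", "error", "open file"))  # False
```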
By default the seq() filter is applied to the _msg field.
Specify the needed field name in front of the filter
in order to apply it to the given field. For example, the following query matches the event.original field containing the (error, "open file") sequence:
event.original:seq(error, "open file")
If the field name contains special chars, which may clash with the query syntax, then it may be put into quotes in the query.
For example, the following query matches the event:original field containing the (error, "open file") sequence:
"event:original":seq(error, "open file")
See also:
Regexp filter
LogsQL supports regular expression filter with re2 syntax via the re(...) expression.
For example, the following query returns all the log messages containing error or warn substrings:
re("error|warn")
The query matches the following log messages:
- error: cannot read data
- A warning has been raised
By default the re() filter is applied to the _msg field.
Specify the needed field name in front of the filter
in order to apply it to the given field. For example, the following query matches the event.original field containing either error or warn substrings:
event.original:re("error|warn")
If the field name contains special chars, which may clash with the query syntax, then it may be put into quotes in the query.
For example, the following query matches the event:original field containing either error or warn substrings:
"event:original":re("error|warn")
Performance tips:
- Prefer combining simple word filters with a logical filter instead of using a regexp filter. For example, the re("error|warning") query can be substituted with the error OR warning query, which usually works much faster. See also multi-exact filter.
- Prefer moving the regexp filter to the end of the logical filter, so that lighter-weight filters are executed first.
- Prefer using exact_prefix("some prefix") instead of re("^some prefix"), since exact_prefix() works much faster than the re() filter.
- See other performance tips.
See also:
Range filter
If you need to filter log messages by some field containing only numeric values, then the range() filter can be used.
For example, if the request.duration field contains the request duration in seconds, then the following LogsQL query can be used
for searching for log entries with request durations exceeding 4.2 seconds:
request.duration:range(4.2, Inf)
The lower and the upper bounds of the range are excluded by default. If they must be included, then substitute the corresponding parentheses with square brackets. For example:
- range[1, 10) includes 1 in the matching range
- range(1, 10] includes 10 in the matching range
- range[1, 10] includes 1 and 10 in the matching range
Note that the range() filter doesn't match log fields
with non-numeric values alongside numeric values. For example, range(1, 10) doesn't match the request took 4.2 seconds
log message, since the 4.2 number is surrounded by other text.
Extract the numeric value from the message with the parse(_msg, "the request took <request_duration> seconds") transformation
and then apply the range() post-filter to the extracted request_duration field.
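The bracket rules and the "purely numeric" requirement can be summarized in a small Python model. This is a sketch under the assumption that a field value matches only when the whole value parses as a number. Python's `float()` accepts some spellings that VictoriaLogs may not, so treat it as an approximation. The `in_numeric_range` name and its keyword arguments are hypothetical.

```python
import math


def in_numeric_range(field_value, low, high, include_low=False, include_high=False):
    """Model range(low, high): bounds are excluded unless explicitly included."""
    try:
        number = float(field_value)
    except ValueError:
        return False  # values with any surrounding text never match
    above = number >= low if include_low else number > low
    below = number <= high if include_high else number < high
    return above and below


# Model of request.duration:range(4.2, Inf)
print(in_numeric_range("5.1", 4.2, math.inf))  # True
print(in_numeric_range("4.2", 4.2, math.inf))  # False
print(in_numeric_range("the request took 4.2 seconds", 1, 10))  # False
```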
Performance tips:
- It is better to query pure numeric field instead of extracting numeric field from text field via transformations at query time.
- See other performance tips.
See also:
IPv4 range filter
If you need to filter log messages by some field containing only IPv4 addresses such as 1.2.3.4,
then the ipv4_range() filter can be used. For example, the following query matches log entries with the user.ip address in the range [127.0.0.0 - 127.255.255.255]:
user.ip:ipv4_range(127.0.0.0, 127.255.255.255)
The ipv4_range() also accepts IPv4 subnetworks in CIDR notation.
For example, the following query is equivalent to the query above:
user.ip:ipv4_range("127.0.0.0/8")
If you need matching a single IPv4 address, then just put it inside ipv4_range(). For example, the following query matches the 1.2.3.4 IP
in the user.ip field:
user.ip:ipv4_range("1.2.3.4")
Note that the ipv4_range() doesn't match a string with an IPv4 address if this string contains other text. For example, ipv4_range("127.0.0.0/24")
doesn't match the request from 127.0.0.1: done log message,
since the 127.0.0.1 ip is surrounded by other text. Extract the IP from the message with the parse(_msg, "request from <ip>: done") transformation
and then apply the ipv4_range() post-filter to the extracted ip field.
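The three accepted argument forms (an explicit start and end, a CIDR subnetwork, a single address) all reduce to an inclusive pair of addresses. The following Python sketch shows this with the standard `ipaddress` module. It is an illustration of the described behavior rather than the VictoriaLogs implementation, and the `in_ipv4_range` name is hypothetical.

```python
import ipaddress


def in_ipv4_range(field_value, start, end=None):
    """Model ipv4_range(): match only when the whole field value is an IPv4 address."""
    try:
        address = ipaddress.IPv4Address(field_value)
    except ValueError:
        return False  # the value contains other text or isn't an IPv4 address
    if end is None:
        # A CIDR block or a single address (treated as a /32 network).
        network = ipaddress.IPv4Network(start, strict=False)
        first, last = network[0], network[-1]
    else:
        first, last = ipaddress.IPv4Address(start), ipaddress.IPv4Address(end)
    return first <= address <= last


print(in_ipv4_range("127.1.2.3", "127.0.0.0", "127.255.255.255"))  # True
print(in_ipv4_range("127.1.2.3", "127.0.0.0/8"))  # True
print(in_ipv4_range("request from 127.0.0.1: done", "127.0.0.0/24"))  # False
```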
Hints:
- If you need searching for log messages containing the given X.Y.Z.Q IPv4 address, then the "X.Y.Z.Q" query can be used. See these docs for details.
- If you need searching for log messages containing at least a single IPv4 address out of the given list, then the "ip1" OR "ip2" ... OR "ipN" query can be used. See these docs for details.
- If you need finding log entries with the ip field in multiple ranges, then use the ip:(ipv4_range(range1) OR ipv4_range(range2) ... OR ipv4_range(rangeN)) query. See these docs for details.
Performance tips:
- It is better querying pure IPv4 field instead of extracting IPv4 from text field via transformations at query time.
- See other performance tips.
See also:
String range filter
If you need to filter log messages by some field with string values in some range, then the string_range() filter can be used.
For example, the following LogsQL query matches log entries with the user.name field starting with the A or B chars:
user.name:string_range(A, C)
The string_range() includes the lower bound, while excluding the upper bound. This simplifies querying distinct sets of logs.
For example, user.name:string_range(C, E) would match user.name fields, which start with the C or D chars.
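The half-open bounds are what make adjacent ranges non-overlapping. A minimal Python model, assuming plain lexicographic comparison of the field values (the `in_string_range` name is hypothetical):

```python
def in_string_range(field_value, low, high):
    """Model string_range(low, high): the lower bound is included, the upper is not."""
    return low <= field_value < high


names = ["Alice", "Bob", "Carol", "Dave", "Eve"]
print([name for name in names if in_string_range(name, "A", "C")])  # ['Alice', 'Bob']
print([name for name in names if in_string_range(name, "C", "E")])  # ['Carol', 'Dave']
```

Every name lands in at most one of the two ranges, since the value C itself belongs only to the second range.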
See also:
Length range filter
If you need to filter log messages by their length, then the len_range() filter can be used.
For example, the following LogsQL query matches log messages
with lengths in the range [5, 10] chars:
len_range(5, 10)
This query matches the following log messages, since their length is in the requested range:
- foobar
- foo bar
This query doesn't match the following log messages:
- foo, since it is too short
- foo bar baz abc, since it is too long
By default the len_range() is applied to the _msg field.
Put the field name in front of the len_range() in order to apply
the filter to the needed field. For example, the following query matches log entries with the foo field length in the range [10, 20] chars:
foo:len_range(10, 20)
See also:
Logical filter
Simpler LogsQL filters can be combined into more complex filters with the following logical operations:
- q1 AND q2 - matches common log entries returned by both q1 and q2. An arbitrary number of filters can be combined with the AND operation. For example, error AND file AND app matches log messages, which simultaneously contain the error, file and app words. The AND operation is frequently used in LogsQL queries, so it is allowed to skip the AND word. For example, error file app is equivalent to error AND file AND app.
- q1 OR q2 - merges log entries returned by both q1 and q2. An arbitrary number of filters can be combined with the OR operation. For example, error OR warning OR info matches log messages, which contain at least one of the error, warning or info words.
- NOT q - returns all the log entries except those which match q. For example, NOT info returns all the log messages, which do not contain the info word. The NOT operation is frequently used in LogsQL queries, so it is allowed to substitute NOT with ! in queries. For example, !info is equivalent to NOT info.
The NOT operation has the highest priority, AND has the middle priority and OR has the lowest priority.
The priority order can be changed with parentheses. For example, NOT info OR debug is interpreted as (NOT info) OR debug, so it matches log messages that do not contain the info word, and it also matches messages with the debug word (which may contain the info word).
This is not what most users expect. In this case the query can be rewritten to NOT (info OR debug), which correctly returns log messages containing neither the info nor the debug word.
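The difference between the two queries can be checked with a small Python model. Python's `not` also binds tighter than `or`, so the unparenthesized expression below behaves like (NOT info) OR debug. The `has_word` helper and the sample messages are hypothetical, and splitting on whitespace is only an approximation of how VictoriaLogs splits messages into words.

```python
def has_word(msg: str, word: str) -> bool:
    """Simplified word filter: True if the word is one of the whitespace-separated tokens."""
    return word in msg.split()


messages = ["info start", "debug trace", "info debug both", "error failed"]

# Models NOT info OR debug, which is parsed as (NOT info) OR debug.
loose = [m for m in messages if not has_word(m, "info") or has_word(m, "debug")]

# Models NOT (info OR debug).
strict = [m for m in messages if not (has_word(m, "info") or has_word(m, "debug"))]

print(loose)   # ['debug trace', 'info debug both', 'error failed']
print(strict)  # ['error failed']
```

The first form still returns a message containing info, because the debug branch of the OR matches it.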
LogsQL supports arbitrarily complex logical queries with an arbitrary mix of AND, OR and NOT operations and parentheses.
By default logical filters apply to the _msg field unless the inner filters explicitly specify the needed log field via the field_name:filter syntax.
For example, (error OR warn) AND host.hostname:host123 is interpreted as (_msg:error OR _msg:warn) AND host.hostname:host123.
It is possible to specify a single log field for multiple filters with the following syntax:
field_name:(q1 OR q2 OR ... qN)
For example, log.level:error OR log.level:warning OR log.level:info can be substituted with the shorter query: log.level:(error OR warning OR info).
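The default-field rule and the field_name:(...) shorthand can be illustrated with a client-side Python model over a few hypothetical log entries. The `has_word` helper is an assumption made for this sketch. It treats each entry as a flat dict of string fields and is not how VictoriaLogs evaluates filters internally.

```python
def has_word(entry: dict, field: str, word: str) -> bool:
    """Simplified word filter on a single field of a log entry."""
    return word in entry.get(field, "").split()


# Hypothetical log entries used only for illustration.
entries = [
    {"_msg": "error in handler", "host.hostname": "host123", "log.level": "error"},
    {"_msg": "warn slow query", "host.hostname": "host456", "log.level": "warning"},
    {"_msg": "request served", "host.hostname": "host123", "log.level": "info"},
    {"_msg": "cache miss", "host.hostname": "host123", "log.level": "debug"},
]

# Models (error OR warn) AND host.hostname:host123,
# where the bare words apply to the _msg field.
by_msg_and_host = [
    e for e in entries
    if (has_word(e, "_msg", "error") or has_word(e, "_msg", "warn"))
    and has_word(e, "host.hostname", "host123")
]

# Models log.level:error OR log.level:warning OR log.level:info
long_form = [
    e for e in entries
    if has_word(e, "log.level", "error")
    or has_word(e, "log.level", "warning")
    or has_word(e, "log.level", "info")
]

# Models log.level:(error OR warning OR info)
short_form = [
    e for e in entries
    if any(has_word(e, "log.level", w) for w in ("error", "warning", "info"))
]

print(len(by_msg_and_host))     # 1
print(long_form == short_form)  # True
```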
Performance tips:
- VictoriaLogs executes logical operations from left to right, so it is recommended to move the most specific and the fastest filters (such as word filter and phrase filter) to the left, while moving less specific and the slowest filters (such as regexp filter and case-insensitive filter) to the right. For example, if you need to find log messages with the error word, which match the /foo/(bar|baz) regexp, it is better from a performance point of view to use the query error re("/foo/(bar|baz)") instead of re("/foo/(bar|baz)") error.
  The most specific filter is the one that matches the lowest number of log entries compared to other filters.
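The effect of filter ordering can be sketched in Python with short-circuit evaluation: when the cheap word check runs first, the regexp is evaluated only for messages that already passed it. This is a simplified model under the assumption that a later filter is skipped for entries rejected by an earlier one. The helpers, the call counter and the sample messages are hypothetical.

```python
import re

pattern = re.compile(r"/foo/(bar|baz)")
calls = {"regexp": 0}  # counts how many times the regexp filter runs


def word_filter(msg: str, word: str) -> bool:
    """Cheap filter: whitespace-separated word lookup."""
    return word in msg.split()


def regexp_filter(msg: str) -> bool:
    """Slower filter: regexp search, instrumented with a call counter."""
    calls["regexp"] += 1
    return pattern.search(msg) is not None


messages = [
    "error opening /foo/bar",
    "info opening /foo/baz",
    "debug opening /foo/qux",
    "error opening /tmp/x",
]

# Models error re("/foo/(bar|baz)"): the word filter goes first.
calls["regexp"] = 0
word_first = [m for m in messages if word_filter(m, "error") and regexp_filter(m)]
calls_word_first = calls["regexp"]

# Models re("/foo/(bar|baz)") error: the regexp filter goes first.
calls["regexp"] = 0
regexp_first = [m for m in messages if regexp_filter(m) and word_filter(m, "error")]
calls_regexp_first = calls["regexp"]

print(word_first == regexp_first)  # True
print(calls_word_first)            # 2
print(calls_regexp_first)          # 4
```

Both orderings select the same messages, but the word-first ordering runs the regexp half as often on this sample.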
Stream context
LogsQL will support the ability to select the given number of surrounding log lines for the selected log lines on a per-stream basis.
See the Roadmap for details.
Transformations
It is possible to perform various transformations on the selected log entries at client side with jq, awk, cut and other Unix commands according to these docs.
LogsQL will support the following transformations for the selected log entries:
- Extracting the specified fields from text log fields according to the provided pattern.
- Extracting the specified fields from JSON strings stored inside log fields.
- Extracting the specified fields from logfmt strings stored inside log fields.
- Creating a new field from existing log fields according to the provided format.
- Creating a new field according to math calculations over existing log fields.
- Copying of the existing log fields.
- Parsing duration strings into floating-point seconds for further stats calculations.
- Creating a boolean field with the result of arbitrary post-filters applied to the current fields. Boolean fields may be useful for conditional stats calculation.
- Creating an integer field with the length of the given field value. This can be useful for stats calculations.
See the Roadmap for details.
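Until these transformations are available in LogsQL, they can be approximated at client side. The Python sketch below shows two of them, extracting fields from a JSON string stored inside a log field and adding an integer field with the length of a field value. It assumes each returned log entry has already been parsed into a dict of string fields. The helper names and the sample entry are hypothetical.

```python
import json


def extract_json_fields(entry: dict, field: str) -> dict:
    """Return a copy of the entry with top-level keys of the JSON string in `field` added."""
    out = dict(entry)
    try:
        parsed = json.loads(entry.get(field, ""))
    except json.JSONDecodeError:
        return out  # the field does not hold valid JSON, so nothing is extracted
    if isinstance(parsed, dict):
        for key, value in parsed.items():
            out.setdefault(key, str(value))  # keep existing fields untouched
    return out


def add_length_field(entry: dict, field: str, new_field: str) -> dict:
    """Return a copy of the entry with `new_field` set to the length of `field`."""
    out = dict(entry)
    out[new_field] = len(entry.get(field, ""))
    return out


# Hypothetical log entry used only for illustration.
entry = {"_msg": '{"user":"alice","status":"200"}', "host": "host123"}

transformed = add_length_field(extract_json_fields(entry, "_msg"), "_msg", "msg_len")
print(transformed["user"], transformed["status"], transformed["msg_len"])
```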
Post-filters
It is possible to perform post-filtering on the selected log entries at client side with grep or similar Unix commands according to these docs.
LogsQL will support post-filtering on the original log fields and fields created by various transformations. The following post-filters will be supported:
- Full-text filtering.
- Logical filtering.
See the Roadmap for details.
Stats
It is possible to perform stats calculations on the selected log entries at client side with sort, uniq and other Unix commands according to these docs.
LogsQL will support calculating the following stats based on the log fields and fields created by transformations:
- The number of selected logs.
- The number of non-empty values for the given field.
- The number of unique values for the given field.
- The min, max, avg, and sum for the given field.
- The median and percentile for the given field.
It will be possible to specify an optional condition filter when calculating the stats.
For example, sumIf(response_size, is_admin:true) calculates the total response size for admins only.
It will be possible to group stats by the specified fields and by the specified time buckets.
See the Roadmap for details.
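While stats are not yet available in LogsQL, a conditional sum similar to the sumIf example can be computed at client side. The Python sketch below assumes the query response is a stream of JSON lines with string field values, one log entry per line. The `sum_if` helper, the field names and the sample data are hypothetical.

```python
import json

# Hypothetical query response: one JSON object per line.
raw_response = "\n".join([
    '{"_msg": "GET /admin", "response_size": "120", "is_admin": "true"}',
    '{"_msg": "GET /index", "response_size": "80", "is_admin": "false"}',
    '{"_msg": "GET /users", "response_size": "300", "is_admin": "true"}',
    '{"_msg": "healthcheck"}',
])

entries = [json.loads(line) for line in raw_response.splitlines() if line.strip()]


def sum_if(entries, field, condition):
    """Sum the numeric values of `field` over entries that have it and satisfy `condition`."""
    return sum(float(e[field]) for e in entries if field in e and condition(e))


# Client-side counterpart of sumIf(response_size, is_admin:true).
admin_total = sum_if(entries, "response_size", lambda e: e.get("is_admin") == "true")

# The number of selected logs and the number of non-empty values for a field.
total_logs = len(entries)
non_empty_sizes = sum(1 for e in entries if e.get("response_size"))

print(admin_total)      # 420.0
print(total_logs)       # 4
print(non_empty_sizes)  # 3
```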
Sorting
By default VictoriaLogs doesn't sort the returned results because of performance and efficiency concerns described here.
It is possible to sort the selected log entries at client side with the sort Unix command according to these docs.
LogsQL will support sorting results by the given set of log fields.
See the Roadmap for details.
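Client-side sorting can also be done in a few lines of Python. The sketch below sorts already parsed log entries by their _time field, assuming the timestamps are RFC3339 strings in UTC with a trailing Z. The sample entries and the `parse_time` helper are hypothetical.

```python
from datetime import datetime

# Hypothetical log entries returned in arbitrary order.
entries = [
    {"_time": "2023-06-20T15:32:10Z", "_msg": "third"},
    {"_time": "2023-06-20T15:31:05Z", "_msg": "first"},
    {"_time": "2023-06-20T15:31:50Z", "_msg": "second"},
]


def parse_time(entry: dict) -> datetime:
    """Parse the _time field; the trailing Z is rewritten for older Python versions."""
    return datetime.fromisoformat(entry["_time"].replace("Z", "+00:00"))


sorted_entries = sorted(entries, key=parse_time)
order = [e["_msg"] for e in sorted_entries]

print(order)  # ['first', 'second', 'third']
```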
Limiters
It is possible to limit the returned results with head, tail, less and other Unix commands according to these docs.
LogsQL will support the ability to limit the number of returned results alongside the ability to page the returned results. Additionally, LogsQL will provide the ability to select fields, which must be returned in the response.
See the Roadmap for details.
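Limiting, paging and field selection can be approximated at client side as well. The Python sketch below takes one page of results and keeps only the requested fields. The `page` and `select_fields` helpers and the sample entries are hypothetical and only illustrate the idea.

```python
from itertools import islice

# Hypothetical log entries used only for illustration.
entries = [{"_msg": f"line {i}", "host": "host123", "level": "info"} for i in range(10)]


def page(entries, offset: int, limit: int) -> list:
    """Return at most `limit` entries starting from position `offset`."""
    return list(islice(entries, offset, offset + limit))


def select_fields(entry: dict, fields) -> dict:
    """Keep only the listed fields that are present in the entry."""
    return {f: entry[f] for f in fields if f in entry}


# Second page of three entries, with only the _msg field kept.
second_page = [select_fields(e, ["_msg"]) for e in page(entries, 3, 3)]

print(second_page)  # [{'_msg': 'line 3'}, {'_msg': 'line 4'}, {'_msg': 'line 5'}]
```

Since `islice` works on any iterable, the same helper can be applied to a streamed response without loading all the entries into memory first.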
Performance tips
- It is highly recommended to specify a time filter in order to narrow down the search to a specific time range.
- It is highly recommended to specify a stream filter in order to narrow down the search to specific log streams.
- Move faster filters such as word filter and phrase filter to the beginning of the query. This rule doesn't apply to time filter and stream filter, which can be put at any place of the query.
- Move more specific filters, which match a lower number of log entries, to the beginning of the query. This rule doesn't apply to time filter and stream filter, which can be put at any place of the query.