lib/logstorage: allow using - instead of ! as a shorthand for NOT operator in LogsQL

Aliaksandr Valialkin 2024-09-27 13:12:34 +02:00
parent 76c1b0b8ea
commit 09b309a82e
6 changed files with 67 additions and 55 deletions

View file

@ -17,13 +17,14 @@ according to [these docs](https://docs.victoriametrics.com/victorialogs/quicksta
* FEATURE: [web UI](https://docs.victoriametrics.com/victorialogs/querying/#web-ui): keep selected columns in table view on page reloads. Before, selected columns were reset on each update. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7016). * FEATURE: [web UI](https://docs.victoriametrics.com/victorialogs/querying/#web-ui): keep selected columns in table view on page reloads. Before, selected columns were reset on each update. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7016).
* FEATURE: allow skipping `_stream:` prefix in [stream filters](https://docs.victoriametrics.com/victorialogs/logsql/#stream-filter). This simplifies writing queries with stream filters. Now `{foo="bar"}` is the recommended format for stream filters over the `_stream:{foo="bar"}` format. * FEATURE: allow skipping `_stream:` prefix in [stream filters](https://docs.victoriametrics.com/victorialogs/logsql/#stream-filter). This simplifies writing queries with stream filters. Now `{foo="bar"}` is the recommended format for stream filters over the `_stream:{foo="bar"}` format.
* FEATURE: allow using `-` instead of `!` as `NOT` operator shorthand in [logical filters](https://docs.victoriametrics.com/victorialogs/logsql/#logical-filter). For example, `-info -warn` query is equivalent to `!info !warn`. This simplifies transition from other query languages with full-text search support, which usually use `-` as `NOT` operator.
## [v0.30.1](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v0.30.1-victorialogs) ## [v0.30.1](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v0.30.1-victorialogs)
Released at 2024-09-27 Released at 2024-09-27
* BUGFIX: consistently return matching log streams sorted by time from [`stream_context` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#stream_context-pipe). Previously log streams could be returned in arbitrary order with every request. This could complicate using `stream_context` pipe. * BUGFIX: consistently return matching log streams sorted by time from [`stream_context` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#stream_context-pipe). Previously log streams could be returned in arbitrary order with every request. This could complicate using `stream_context` pipe.
* BUGFIX: add missing `_msg="---"` delimiter between stream contexts belonging to different [log streams](https://docs.victoriametrics.com/victorialogs/keyconcepts/#stream-fields). This should simplify investigating `stream_context` output for multiple matching log streams. * BUGFIX: [`stream_context` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#stream_context-pipe): add missing `_msg="---"` delimiter between stream contexts belonging to different [log streams](https://docs.victoriametrics.com/victorialogs/keyconcepts/#stream-fields). This should simplify investigating `stream_context` output for multiple matching log streams.
## [v0.30.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v0.30.0-victorialogs) ## [v0.30.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v0.30.0-victorialogs)

View file

@ -115,30 +115,30 @@ Then the following query removes all the logs from the buggy app, allowing us pa
_time:5m error NOT buggy_app _time:5m error NOT buggy_app
``` ```
This query uses `NOT` [operator](#logical-filter) for removing log lines from the buggy app. The `NOT` operator is used frequently, so it can be substituted with `!` char This query uses `NOT` [operator](#logical-filter) for removing log lines from the buggy app. The `NOT` operator is used frequently, so it can be substituted with `-` or `!` char
(the `!` char is used instead of `-` char as a shorthand for `NOT` operator because it nicely combines with [`=`](https://docs.victoriametrics.com/victorialogs/logsql/#exact-filter) (the `!` must be used instead of `-` in front of [`=`](https://docs.victoriametrics.com/victorialogs/logsql/#exact-filter)
and [`~`](https://docs.victoriametrics.com/victorialogs/logsql/#regexp-filter) filters like `!=` and `!~`). and [`~`](https://docs.victoriametrics.com/victorialogs/logsql/#regexp-filter) filters like `!=` and `!~`).
The following query is equivalent to the previous one: The following query is equivalent to the previous one:
```logsql ```logsql
_time:5m error !buggy_app _time:5m error -buggy_app
``` ```
Suppose another buggy app starts pushing invalid error logs to VictoriaLogs - it adds `foobar` [word](#word) to every emitted log line. Suppose another buggy app starts pushing invalid error logs to VictoriaLogs - it adds `foobar` [word](#word) to every emitted log line.
No problems - just add `!foobar` to the query in order to remove these buggy logs: No problems - just add `-foobar` to the query in order to remove these buggy logs:
```logsql ```logsql
_time:5m error !buggy_app !foobar _time:5m error -buggy_app -foobar
``` ```
This query can be rewritten into a clearer query with the `OR` [operator](#logical-filter) inside parentheses: This query can be rewritten into a clearer query with the `OR` [operator](#logical-filter) inside parentheses:
```logsql ```logsql
_time:5m error !(buggy_app OR foobar) _time:5m error -(buggy_app OR foobar)
``` ```
The parentheses are **required** here, since otherwise the query won't return the expected results. The parentheses are **required** here, since otherwise the query won't return the expected results.
The query `error !buggy_app OR foobar` is interpreted as `(error AND NOT buggy_app) OR foobar` according to [priorities for AND, OR and NOT operator](#logical-filters). The query `error -buggy_app OR foobar` is interpreted as `(error AND NOT buggy_app) OR foobar` according to [priorities for AND, OR and NOT operator](#logical-filters).
This query returns logs with `foobar` [word](#word), even if they do not contain `error` word or contain `buggy_app` word. This query returns logs with `foobar` [word](#word), even if they do not contain `error` word or contain `buggy_app` word.
So it is recommended to wrap the needed query parts into explicit parentheses if you are unsure about priority rules. So it is recommended to wrap the needed query parts into explicit parentheses if you are unsure about priority rules.
As an additional bonus, explicit parentheses make queries easier to read and maintain. As an additional bonus, explicit parentheses make queries easier to read and maintain.
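
The precedence pitfall described in this hunk can be reproduced with ordinary boolean operators. The sketch below is illustrative only: `has`, `withoutParens` and `withParens` are hypothetical helpers, and splitting on whitespace is a simplification of how LogsQL matches words. Go's `!`, `&&` and `||` follow the same NOT > AND > OR ordering, so they stand in for the LogsQL operators here.

```go
package main

import (
	"fmt"
	"strings"
)

// has reports whether msg contains w as a whitespace-separated token.
// This is a simplified stand-in for LogsQL word matching.
func has(msg, w string) bool {
	for _, f := range strings.Fields(msg) {
		if f == w {
			return true
		}
	}
	return false
}

// withoutParens mirrors `error -buggy_app OR foobar`,
// which groups as (error AND NOT buggy_app) OR foobar.
func withoutParens(msg string) bool {
	return has(msg, "error") && !has(msg, "buggy_app") || has(msg, "foobar")
}

// withParens mirrors `error -(buggy_app OR foobar)`.
func withParens(msg string) bool {
	return has(msg, "error") && !(has(msg, "buggy_app") || has(msg, "foobar"))
}

func main() {
	msg := "foobar buggy_app started"
	// The message has no `error` word, yet the unparenthesized form matches it.
	fmt.Println(withoutParens(msg), withParens(msg))
}
```

The message in `main` lacks `error` and contains `buggy_app`, yet the unparenthesized form still matches it because of the trailing `OR foobar`.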
@ -148,26 +148,26 @@ If this word is stored in other [field](https://docs.victoriametrics.com/victori
in front of the `error` word: in front of the `error` word:
```logsql ```logsql
_time:5m log.level:error !(buggy_app OR foobar) _time:5m log.level:error -(buggy_app OR foobar)
``` ```
The field name can be wrapped into quotes if it contains special chars or keywords, which may clash with LogsQL syntax. The field name can be wrapped into quotes if it contains special chars or keywords, which may clash with LogsQL syntax.
Any [word](#word) also can be wrapped into quotes. So the following query is equivalent to the previous one: Any [word](#word) also can be wrapped into quotes. So the following query is equivalent to the previous one:
```logsql ```logsql
"_time":"5m" "log.level":"error" !("buggy_app" OR "foobar") "_time":"5m" "log.level":"error" -("buggy_app" OR "foobar")
``` ```
What if the application identifier - such as `buggy_app` and `foobar` - is stored in the `app` field? Correct - just add `app:` prefix in front of `buggy_app` and `foobar`: What if the application identifier - such as `buggy_app` and `foobar` - is stored in the `app` field? Correct - just add `app:` prefix in front of `buggy_app` and `foobar`:
```logsql ```logsql
_time:5m log.level:error !(app:buggy_app OR app:foobar) _time:5m log.level:error -(app:buggy_app OR app:foobar)
``` ```
The query can be simplified by moving the `app:` prefix outside the parentheses: The query can be simplified by moving the `app:` prefix outside the parentheses:
```logsql ```logsql
_time:5m log.level:error !app:(buggy_app OR foobar) _time:5m log.level:error -app:(buggy_app OR foobar)
``` ```
The `app` field uniquely identifies the application instance if a single instance runs per each unique `app`. The `app` field uniquely identifies the application instance if a single instance runs per each unique `app`.
@ -1239,8 +1239,11 @@ Simpler LogsQL [filters](#filters) can be combined into more complex filters wit
- `NOT q` - returns all the log entries except of those which match `q`. For example, `NOT info` returns all the - `NOT q` - returns all the log entries except of those which match `q`. For example, `NOT info` returns all the
[log messages](https://docs.victoriametrics.com/victorialogs/keyconcepts/#message-field), [log messages](https://docs.victoriametrics.com/victorialogs/keyconcepts/#message-field),
which do not contain `info` [word](#word). The `NOT` operation is frequently used in LogsQL queries, so it is allowed substituting `NOT` with `!` in queries. which do not contain `info` [word](#word). The `NOT` operation is frequently used in LogsQL queries, so it is allowed substituting `NOT` with `-` and `!` in queries.
For example, `!info` is equivalent to `NOT info`. For example, `-info` and `!info` are equivalent to `NOT info`.
The `!` must be used instead of `-` in front of [`=`](https://docs.victoriametrics.com/victorialogs/logsql/#exact-filter)
and [`~`](https://docs.victoriametrics.com/victorialogs/logsql/#regexp-filter) filters like `!=` and `!~`.
The `NOT` operation has the highest priority, `AND` has the middle priority and `OR` has the lowest priority. The `NOT` operation has the highest priority, `AND` has the middle priority and `OR` has the lowest priority.
The priority order can be changed with parentheses. For example, `NOT info OR debug` is interpreted as `(NOT info) OR debug`, The priority order can be changed with parentheses. For example, `NOT info OR debug` is interpreted as `(NOT info) OR debug`,

View file

@ -58,7 +58,7 @@ to the query. For example, the following query selects logs with `error` [word](
which do not contain `kubernetes` [word](https://docs.victoriametrics.com/victorialogs/logsql/#word), over the last hour: which do not contain `kubernetes` [word](https://docs.victoriametrics.com/victorialogs/logsql/#word), over the last hour:
```logsql ```logsql
error !kubernetes _time:1h error -kubernetes _time:1h
``` ```
The logs are returned in arbitrary order because of performance reasons. Add [`sort` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#sort-pipe) The logs are returned in arbitrary order because of performance reasons. Add [`sort` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#sort-pipe)
@ -86,14 +86,14 @@ Use [`NOT` logical filter](https://docs.victoriametrics.com/victorialogs/logsql/
without the `INFO` [word](https://docs.victoriametrics.com/victorialogs/logsql/#word) in the [log message](https://docs.victoriametrics.com/victorialogs/keyconcepts/#message-field): without the `INFO` [word](https://docs.victoriametrics.com/victorialogs/logsql/#word) in the [log message](https://docs.victoriametrics.com/victorialogs/keyconcepts/#message-field):
```logsql ```logsql
!INFO -INFO
``` ```
If the number of returned logs is too big, then add [`_time` filter](https://docs.victoriametrics.com/victorialogs/logsql/#time-filter) If the number of returned logs is too big, then add [`_time` filter](https://docs.victoriametrics.com/victorialogs/logsql/#time-filter)
for limiting the time range for the selected logs. For example, the following query returns matching logs over the last hour: for limiting the time range for the selected logs. For example, the following query returns matching logs over the last hour:
```logsql ```logsql
!INFO _time:1h -INFO _time:1h
``` ```
If the number of returned logs is still too big, then consider adding more specific [filters](https://docs.victoriametrics.com/victorialogs/logsql/#filters) If the number of returned logs is still too big, then consider adding more specific [filters](https://docs.victoriametrics.com/victorialogs/logsql/#filters)
@ -101,7 +101,7 @@ to the query. For example, the following query selects logs without `INFO` [word
which contain `error` [word](https://docs.victoriametrics.com/victorialogs/logsql/#word), over the last hour: which contain `error` [word](https://docs.victoriametrics.com/victorialogs/logsql/#word), over the last hour:
```logsql ```logsql
!INFO error _time:1h -INFO error _time:1h
``` ```
The logs are returned in arbitrary order because of performance reasons. Add [`sort` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#sort-pipe) The logs are returned in arbitrary order because of performance reasons. Add [`sort` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#sort-pipe)
@ -109,7 +109,7 @@ for sorting logs by the needed [fields](https://docs.victoriametrics.com/victori
sorts the selected logs by [`_time` field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#time-field): sorts the selected logs by [`_time` field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#time-field):
```logsql ```logsql
!INFO _time:1h | sort by (_time) -INFO _time:1h | sort by (_time)
``` ```
See also: See also:
@ -191,7 +191,7 @@ to the query. For example, the following query selects logs without `error`, `ER
which do not contain `kubernetes` [word](https://docs.victoriametrics.com/victorialogs/logsql/#word), over the last hour: which do not contain `kubernetes` [word](https://docs.victoriametrics.com/victorialogs/logsql/#word), over the last hour:
```logsql ```logsql
(error or ERROR or Error) !kubernetes _time:1h (error or ERROR or Error) -kubernetes _time:1h
``` ```
The logs are returned in arbitrary order because of performance reasons. Add [`sort` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#sort-pipe) The logs are returned in arbitrary order because of performance reasons. Add [`sort` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#sort-pipe)

View file

@ -11,9 +11,10 @@ func (fn *filterNot) String() string {
s := fn.f.String() s := fn.f.String()
switch fn.f.(type) { switch fn.f.(type) {
case *filterAnd, *filterOr: case *filterAnd, *filterOr:
s = "(" + s + ")" return "!(" + s + ")"
default:
return "!" + s
} }
return "!" + s
} }
func (fn *filterNot) updateNeededFields(neededFields fieldsSet) { func (fn *filterNot) updateNeededFields(neededFields fieldsSet) {
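
The serialization rule in this hunk can be tried in isolation. The following is a minimal sketch with hypothetical stand-in types (`word`, `andFilter`, `orFilter`, `notFilter`), not the real `lib/logstorage` types. It only shows that a negated `AND`/`OR` group is wrapped in parentheses and that the canonical output uses `!`, whichever of `NOT`, `!` or `-` appeared in the input.

```go
package main

import (
	"fmt"
	"strings"
)

// filter is a hypothetical stand-in for the package's filter interface.
type filter interface{ String() string }

// word is a single-word filter.
type word string

func (w word) String() string { return string(w) }

// join renders the given filters separated by sep.
func join(fs []filter, sep string) string {
	parts := make([]string, len(fs))
	for i, f := range fs {
		parts[i] = f.String()
	}
	return strings.Join(parts, sep)
}

// andFilter and orFilter are simplified composite filters.
type andFilter []filter

func (a andFilter) String() string { return join(a, " ") }

type orFilter []filter

func (o orFilter) String() string { return join(o, " or ") }

// notFilter negates the wrapped filter.
type notFilter struct{ f filter }

// String follows the same shape as the patched method:
// composite filters get parentheses, everything else is prefixed directly.
func (n notFilter) String() string {
	s := n.f.String()
	switch n.f.(type) {
	case andFilter, orFilter:
		return "!(" + s + ")"
	default:
		return "!" + s
	}
}

func main() {
	fmt.Println(notFilter{word("buggy_app")})
	fmt.Println(notFilter{orFilter{word("buggy_app"), word("foobar")}})
}
```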

View file

@ -856,7 +856,7 @@ func parseGenericFilter(lex *lexer, fieldName string) (filter, error) {
return parseFilterTilda(lex, fieldName) return parseFilterTilda(lex, fieldName)
case lex.isKeyword("!~"): case lex.isKeyword("!~"):
return parseFilterNotTilda(lex, fieldName) return parseFilterNotTilda(lex, fieldName)
case lex.isKeyword("not", "!"): case lex.isKeyword("not", "!", "-"):
return parseFilterNot(lex, fieldName) return parseFilterNot(lex, fieldName)
case lex.isKeyword("exact"): case lex.isKeyword("exact"):
return parseFilterExact(lex, fieldName) return parseFilterExact(lex, fieldName)
@ -2149,7 +2149,7 @@ func needQuoteToken(s string) bool {
return true return true
} }
for _, r := range s { for _, r := range s {
if !isTokenRune(r) && r != '.' && r != '-' { if !isTokenRune(r) && r != '.' {
return true return true
} }
} }
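
The effect of dropping `r != '-'` from the quoting check can be sketched as follows. This is a simplified approximation: `isTokenRune` is assumed here to accept letters, digits and `_` (the real definition is not part of this diff), and the checks that precede the loop in `needQuoteToken` are omitted. `needQuote` is a hypothetical name for the reduced function.

```go
package main

import (
	"fmt"
	"unicode"
)

// isTokenRune is an assumed definition; the real one is not shown in this diff.
func isTokenRune(r rune) bool {
	return unicode.IsLetter(r) || unicode.IsDigit(r) || r == '_'
}

// needQuote reproduces only the loop from the patched needQuoteToken.
// After the change, '-' is no longer exempt, so tokens containing it are quoted.
func needQuote(s string) bool {
	for _, r := range s {
		if !isTokenRune(r) && r != '.' {
			return true
		}
	}
	return false
}

func main() {
	for _, s := range []string{"buggy_app", "log.level", "trace-id.foo.bar", "exact-foo"} {
		fmt.Printf("%s needs quotes: %v\n", s, needQuote(s))
	}
}
```

This matches the updated test expectations in the next file, where `trace-id.foo.bar:baz` is now serialized as `"trace-id.foo.bar":baz`. Quoting such tokens keeps the serialized query unambiguous now that a leading `-` means `NOT`.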

View file

@ -638,7 +638,9 @@ func TestParseQuerySuccess(t *testing.T) {
f(`"":foo`, "foo") f(`"":foo`, "foo")
f(`"" bar`, `"" bar`) f(`"" bar`, `"" bar`)
f(`!''`, `!""`) f(`!''`, `!""`)
f(`-''`, `!""`)
f(`foo:""`, `foo:""`) f(`foo:""`, `foo:""`)
f(`-foo:""`, `!foo:""`)
f(`!foo:""`, `!foo:""`) f(`!foo:""`, `!foo:""`)
f(`not foo:""`, `!foo:""`) f(`not foo:""`, `!foo:""`)
f(`not(foo)`, `!foo`) f(`not(foo)`, `!foo`)
@ -648,6 +650,8 @@ func TestParseQuerySuccess(t *testing.T) {
f("_msg:foo", "foo") f("_msg:foo", "foo")
f("'foo:bar'", `"foo:bar"`) f("'foo:bar'", `"foo:bar"`)
f("'!foo'", `"!foo"`) f("'!foo'", `"!foo"`)
f("'-foo'", `"-foo"`)
f(`'{a="b"}'`, `"{a=\"b\"}"`)
f("foo 'and' and bar", `foo "and" bar`) f("foo 'and' and bar", `foo "and" bar`)
f("foo bar", "foo bar") f("foo bar", "foo bar")
f("foo and bar", "foo bar") f("foo and bar", "foo bar")
@ -656,10 +660,13 @@ func TestParseQuerySuccess(t *testing.T) {
f("foo OR bar", "foo or bar") f("foo OR bar", "foo or bar")
f("not foo", "!foo") f("not foo", "!foo")
f("! foo", "!foo") f("! foo", "!foo")
f("- foo", "!foo")
f("not !`foo bar`", `"foo bar"`) f("not !`foo bar`", `"foo bar"`)
f("not -`foo bar`", `"foo bar"`)
f("foo or bar and not baz", "foo or bar !baz") f("foo or bar and not baz", "foo or bar !baz")
f("'foo bar' !baz", `"foo bar" !baz`) f("'foo bar' !baz", `"foo bar" !baz`)
f("foo:!bar", `!foo:bar`) f("foo:!bar", `!foo:bar`)
f("foo:-bar", `!foo:bar`)
f(`foo and bar and baz or x or y or z and zz`, `foo bar baz or x or y or z zz`) f(`foo and bar and baz or x or y or z and zz`, `foo bar baz or x or y or z zz`)
f(`foo and bar and (baz or x or y or z) and zz`, `foo bar (baz or x or y or z) zz`) f(`foo and bar and (baz or x or y or z) and zz`, `foo bar (baz or x or y or z) zz`)
f(`(foo or bar or baz) and x and y and (z or zz)`, `(foo or bar or baz) x y (z or zz)`) f(`(foo or bar or baz) and x and y and (z or zz)`, `(foo or bar or baz) x y (z or zz)`)
@ -778,50 +785,50 @@ func TestParseQuerySuccess(t *testing.T) {
// reserved functions // reserved functions
f("exact", `"exact"`) f("exact", `"exact"`)
f("exact:a", `"exact":a`) f("exact:a", `"exact":a`)
f("exact-foo", `exact-foo`) f("exact-foo", `"exact-foo"`)
f("a:exact", `a:"exact"`) f("a:exact", `a:"exact"`)
f("a:exact-foo", `a:exact-foo`) f("a:exact-foo", `a:"exact-foo"`)
f("exact-foo:b", `exact-foo:b`) f("exact-foo:b", `"exact-foo":b`)
f("i", `"i"`) f("i", `"i"`)
f("i-foo", `i-foo`) f("i-foo", `"i-foo"`)
f("a:i-foo", `a:i-foo`) f("a:i-foo", `a:"i-foo"`)
f("i-foo:b", `i-foo:b`) f("i-foo:b", `"i-foo":b`)
f("in", `"in"`) f("in", `"in"`)
f("in:a", `"in":a`) f("in:a", `"in":a`)
f("in-foo", `in-foo`) f("in-foo", `"in-foo"`)
f("a:in", `a:"in"`) f("a:in", `a:"in"`)
f("a:in-foo", `a:in-foo`) f("a:in-foo", `a:"in-foo"`)
f("in-foo:b", `in-foo:b`) f("in-foo:b", `"in-foo":b`)
f("ipv4_range", `"ipv4_range"`) f("ipv4_range", `"ipv4_range"`)
f("ipv4_range:a", `"ipv4_range":a`) f("ipv4_range:a", `"ipv4_range":a`)
f("ipv4_range-foo", `ipv4_range-foo`) f("ipv4_range-foo", `"ipv4_range-foo"`)
f("a:ipv4_range", `a:"ipv4_range"`) f("a:ipv4_range", `a:"ipv4_range"`)
f("a:ipv4_range-foo", `a:ipv4_range-foo`) f("a:ipv4_range-foo", `a:"ipv4_range-foo"`)
f("ipv4_range-foo:b", `ipv4_range-foo:b`) f("ipv4_range-foo:b", `"ipv4_range-foo":b`)
f("len_range", `"len_range"`) f("len_range", `"len_range"`)
f("len_range:a", `"len_range":a`) f("len_range:a", `"len_range":a`)
f("len_range-foo", `len_range-foo`) f("len_range-foo", `"len_range-foo"`)
f("a:len_range", `a:"len_range"`) f("a:len_range", `a:"len_range"`)
f("a:len_range-foo", `a:len_range-foo`) f("a:len_range-foo", `a:"len_range-foo"`)
f("len_range-foo:b", `len_range-foo:b`) f("len_range-foo:b", `"len_range-foo":b`)
f("range", `"range"`) f("range", `"range"`)
f("range:a", `"range":a`) f("range:a", `"range":a`)
f("range-foo", `range-foo`) f("range-foo", `"range-foo"`)
f("a:range", `a:"range"`) f("a:range", `a:"range"`)
f("a:range-foo", `a:range-foo`) f("a:range-foo", `a:"range-foo"`)
f("range-foo:b", `range-foo:b`) f("range-foo:b", `"range-foo":b`)
f("re", `"re"`) f("re", `"re"`)
f("re-bar", `re-bar`) f("re-bar", `"re-bar"`)
f("a:re-bar", `a:re-bar`) f("a:re-bar", `a:"re-bar"`)
f("re-bar:a", `re-bar:a`) f("re-bar:a", `"re-bar":a`)
f("seq", `"seq"`) f("seq", `"seq"`)
f("seq-a", `seq-a`) f("seq-a", `"seq-a"`)
f("x:seq-a", `x:seq-a`) f("x:seq-a", `x:"seq-a"`)
f("seq-a:x", `seq-a:x`) f("seq-a:x", `"seq-a":x`)
f("string_range", `"string_range"`) f("string_range", `"string_range"`)
f("string_range-a", `string_range-a`) f("string_range-a", `"string_range-a"`)
f("x:string_range-a", `x:string_range-a`) f("x:string_range-a", `x:"string_range-a"`)
f("string_range-a:x", `string_range-a:x`) f("string_range-a:x", `"string_range-a":x`)
// exact filter // exact filter
f("exact(foo)", `=foo`) f("exact(foo)", `=foo`)
@ -932,11 +939,11 @@ func TestParseQuerySuccess(t *testing.T) {
f("1.2.3.4 or ip:5.6.7.9", "1.2.3.4 or ip:5.6.7.9") f("1.2.3.4 or ip:5.6.7.9", "1.2.3.4 or ip:5.6.7.9")
// '-' and '.' chars in field name and search phrase // '-' and '.' chars in field name and search phrase
f("trace-id.foo.bar:baz", `trace-id.foo.bar:baz`) f("trace-id.foo.bar:baz", `"trace-id.foo.bar":baz`)
f(`custom-Time:2024-01-02T03:04:05+08:00 fooBar OR !baz:xxx`, `custom-Time:"2024-01-02T03:04:05+08:00" fooBar or !baz:xxx`) f(`custom-Time:2024-01-02T03:04:05+08:00 fooBar OR !baz:xxx`, `"custom-Time":"2024-01-02T03:04:05+08:00" fooBar or !baz:xxx`)
f("foo-bar+baz*", `"foo-bar+baz"*`) f("foo-bar+baz*", `"foo-bar+baz"*`)
f("foo- bar", `foo- bar`) f("foo- bar", `"foo-" bar`)
f("foo -bar", `foo -bar`) f("foo -bar", `foo !bar`)
f("foo!bar", `foo !bar`) f("foo!bar", `foo !bar`)
f("foo:aa!bb:cc", `foo:aa !bb:cc`) f("foo:aa!bb:cc", `foo:aa !bb:cc`)
f(`foo:bar:baz`, `foo:"bar:baz"`) f(`foo:bar:baz`, `foo:"bar:baz"`)