lib/logstorage: allow using - instead of ! as a shorthand for NOT operator in LogsQL

Aliaksandr Valialkin 2024-09-27 13:12:34 +02:00
parent b60cb98377
commit 1a6313ca68
6 changed files with 67 additions and 55 deletions

View file

@ -17,13 +17,14 @@ according to [these docs](https://docs.victoriametrics.com/victorialogs/quicksta
* FEATURE: [web UI](https://docs.victoriametrics.com/victorialogs/querying/#web-ui): keep selected columns in table view on page reloads. Before, selected columns were reset on each update. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7016).
* FEATURE: allow skipping `_stream:` prefix in [stream filters](https://docs.victoriametrics.com/victorialogs/logsql/#stream-filter). This simplifies writing queries with stream filters. Now `{foo="bar"}` is the recommended format for stream filters over the `_stream:{foo="bar"}` format.
* FEATURE: allow using `-` instead of `!` as `NOT` operator shorthand in [logical filters](https://docs.victoriametrics.com/victorialogs/logsql/#logical-filter). For example, `-info -warn` query is equivalent to `!info !warn`. This simplifies transition from other query languages with full-text search support, which usually use `-` as `NOT` operator.
## [v0.30.1](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v0.30.1-victorialogs)
Released at 2024-09-27
* BUGFIX: consistently return matching log streams sorted by time from [`stream_context` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#stream_context-pipe). Previously log streams could be returned in arbitrary order with every request. This could complicate using `stream_context` pipe.
* BUGFIX: add missing `_msg="---"` delimiter between stream contexts belonging to different [log streams](https://docs.victoriametrics.com/victorialogs/keyconcepts/#stream-fields). This should simplify investigating `stream_context` output for multiple matching log streams.
* BUGFIX: [`stream_context` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#stream_context-pipe): add missing `_msg="---"` delimiter between stream contexts belonging to different [log streams](https://docs.victoriametrics.com/victorialogs/keyconcepts/#stream-fields). This should simplify investigating `stream_context` output for multiple matching log streams.
## [v0.30.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v0.30.0-victorialogs)

View file

@ -115,30 +115,30 @@ Then the following query removes all the logs from the buggy app, allowing us pa
_time:5m error NOT buggy_app
```
This query uses `NOT` [operator](#logical-filter) for removing log lines from the buggy app. The `NOT` operator is used frequently, so it can be substituted with `!` char
(the `!` char is used instead of `-` char as a shorthand for `NOT` operator because it nicely combines with [`=`](https://docs.victoriametrics.com/victorialogs/logsql/#exact-filter)
This query uses `NOT` [operator](#logical-filter) for removing log lines from the buggy app. The `NOT` operator is used frequently, so it can be substituted with `-` or `!` char
(the `!` must be used instead of `-` in front of [`=`](https://docs.victoriametrics.com/victorialogs/logsql/#exact-filter)
and [`~`](https://docs.victoriametrics.com/victorialogs/logsql/#regexp-filter) filters like `!=` and `!~`).
The following query is equivalent to the previous one:
```logsql
_time:5m error !buggy_app
_time:5m error -buggy_app
```
Suppose another buggy app starts pushing invalid error logs to VictoriaLogs - it adds `foobar` [word](#word) to every emitted log line.
No problems - just add `!foobar` to the query in order to remove these buggy logs:
No problems - just add `-foobar` to the query in order to remove these buggy logs:
```logsql
_time:5m error !buggy_app !foobar
_time:5m error -buggy_app -foobar
```
This query can be rewritten as a clearer query with the `OR` [operator](#logical-filter) inside parentheses:
```logsql
_time:5m error !(buggy_app OR foobar)
_time:5m error -(buggy_app OR foobar)
```
The parentheses are **required** here, since otherwise the query won't return the expected results.
The query `error !buggy_app OR foobar` is interpreted as `(error AND NOT buggy_app) OR foobar` according to [priorities for AND, OR and NOT operator](#logical-filters).
The query `error -buggy_app OR foobar` is interpreted as `(error AND NOT buggy_app) OR foobar` according to [priorities for AND, OR and NOT operator](#logical-filters).
This query returns logs with the `foobar` [word](#word), even if they do not contain the `error` word or contain the `buggy_app` word.
So it is recommended to wrap the needed query parts in explicit parentheses if you are unsure about the priority rules.
As an additional bonus, explicit parentheses make queries easier to read and maintain.
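The effect of the missing parentheses can be reproduced with ordinary boolean logic. The sketch below is only an illustration: Go's `!`, `&&` and `||` happen to follow the same priority order as `NOT`, `AND` and `OR` in LogsQL, and the `hasWord` helper is a hypothetical, simplified word matcher (whitespace-separated tokens), not the tokenization VictoriaLogs actually uses.

```go
package main

import (
	"fmt"
	"strings"
)

// hasWord is a hypothetical helper: it reports whether line contains word
// as a whitespace-separated token. Real LogsQL word matching is richer.
func hasWord(line, word string) bool {
	for _, w := range strings.Fields(line) {
		if w == word {
			return true
		}
	}
	return false
}

// withoutParens mirrors `error -buggy_app OR foobar`,
// which is evaluated as `(error AND NOT buggy_app) OR foobar`.
func withoutParens(line string) bool {
	return hasWord(line, "error") && !hasWord(line, "buggy_app") || hasWord(line, "foobar")
}

// withParens mirrors `error -(buggy_app OR foobar)`.
func withParens(line string) bool {
	return hasWord(line, "error") && !(hasWord(line, "buggy_app") || hasWord(line, "foobar"))
}

func main() {
	line := "info foobar started"
	fmt.Println(withoutParens(line)) // true: the line matches via the `OR foobar` branch
	fmt.Println(withParens(line))    // false: no `error` word, and `foobar` is excluded
}
```

A log line without the `error` word is still matched by the version without parentheses, which is the surprising behavior described above.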
@ -148,26 +148,26 @@ If this word is stored in other [field](https://docs.victoriametrics.com/victori
in front of the `error` word:
```logsql
_time:5m log.level:error !(buggy_app OR foobar)
_time:5m log.level:error -(buggy_app OR foobar)
```
The field name can be wrapped into quotes if it contains special chars or keywords, which may clash with LogsQL syntax.
Any [word](#word) also can be wrapped into quotes. So the following query is equivalent to the previous one:
```logsql
"_time":"5m" "log.level":"error" !("buggy_app" OR "foobar")
"_time":"5m" "log.level":"error" -("buggy_app" OR "foobar")
```
What if the application identifier - such as `buggy_app` and `foobar` - is stored in the `app` field? Correct - just add `app:` prefix in front of `buggy_app` and `foobar`:
```logsql
_time:5m log.level:error !(app:buggy_app OR app:foobar)
_time:5m log.level:error -(app:buggy_app OR app:foobar)
```
The query can be simplified by moving the `app:` prefix outside the parentheses:
```logsql
_time:5m log.level:error !app:(buggy_app OR foobar)
_time:5m log.level:error -app:(buggy_app OR foobar)
```
The `app` field uniquely identifies the application instance if a single instance runs per each unique `app`.
@ -1239,8 +1239,11 @@ Simpler LogsQL [filters](#filters) can be combined into more complex filters wit
- `NOT q` - returns all the log entries except those which match `q`. For example, `NOT info` returns all the
[log messages](https://docs.victoriametrics.com/victorialogs/keyconcepts/#message-field),
which do not contain `info` [word](#word). The `NOT` operation is frequently used in LogsQL queries, so it is allowed substituting `NOT` with `!` in queries.
For example, `!info` is equivalent to `NOT info`.
which do not contain `info` [word](#word). The `NOT` operation is frequently used in LogsQL queries, so it is allowed substituting `NOT` with `-` and `!` in queries.
For example, `-info` and `!info` are equivalent to `NOT info`.
The `!` must be used instead of `-` in front of [`=`](https://docs.victoriametrics.com/victorialogs/logsql/#exact-filter)
and [`~`](https://docs.victoriametrics.com/victorialogs/logsql/#regexp-filter) filters like `!=` and `!~`.
The `NOT` operation has the highest priority, `AND` has the middle priority and `OR` has the lowest priority.
The priority order can be changed with parentheses. For example, `NOT info OR debug` is interpreted as `(NOT info) OR debug`,

View file

@ -58,7 +58,7 @@ to the query. For example, the following query selects logs with `error` [word](
which do not contain `kubernetes` [word](https://docs.victoriametrics.com/victorialogs/logsql/#word), over the last hour:
```logsql
error !kubernetes _time:1h
error -kubernetes _time:1h
```
The logs are returned in arbitrary order for performance reasons. Add [`sort` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#sort-pipe)
@ -86,14 +86,14 @@ Use [`NOT` logical filter](https://docs.victoriametrics.com/victorialogs/logsql/
without the `INFO` [word](https://docs.victoriametrics.com/victorialogs/logsql/#word) in the [log message](https://docs.victoriametrics.com/victorialogs/keyconcepts/#message-field):
```logsql
!INFO
-INFO
```
If the number of returned logs is too big, then add [`_time` filter](https://docs.victoriametrics.com/victorialogs/logsql/#time-filter)
for limiting the time range for the selected logs. For example, the following query returns matching logs over the last hour:
```logsql
!INFO _time:1h
-INFO _time:1h
```
If the number of returned logs is still too big, then consider adding more specific [filters](https://docs.victoriametrics.com/victorialogs/logsql/#filters)
@ -101,7 +101,7 @@ to the query. For example, the following query selects logs without `INFO` [word
which contain `error` [word](https://docs.victoriametrics.com/victorialogs/logsql/#word), over the last hour:
```logsql
!INFO error _time:1h
-INFO error _time:1h
```
The logs are returned in arbitrary order for performance reasons. Add [`sort` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#sort-pipe)
@ -109,7 +109,7 @@ for sorting logs by the needed [fields](https://docs.victoriametrics.com/victori
sorts the selected logs by [`_time` field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#time-field):
```logsql
!INFO _time:1h | sort by (_time)
-INFO _time:1h | sort by (_time)
```
See also:
@ -191,7 +191,7 @@ to the query. For example, the following query selects logs without `error`, `ER
which do not contain `kubernetes` [word](https://docs.victoriametrics.com/victorialogs/logsql/#word), over the last hour:
```logsql
(error or ERROR or Error) !kubernetes _time:1h
(error or ERROR or Error) -kubernetes _time:1h
```
The logs are returned in arbitrary order for performance reasons. Add [`sort` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#sort-pipe)

View file

@ -11,9 +11,10 @@ func (fn *filterNot) String() string {
s := fn.f.String()
switch fn.f.(type) {
case *filterAnd, *filterOr:
s = "(" + s + ")"
return "!(" + s + ")"
default:
return "!" + s
}
return "!" + s
}
func (fn *filterNot) updateNeededFields(neededFields fieldsSet) {
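
The hunk above keeps `!` as the canonical prefix when a `NOT` filter is printed, whichever shorthand the query used, and wraps `AND`/`OR` sub-filters in parentheses. The standalone sketch below illustrates that formatting rule. The `filter` interface, `filterWord` and the simplified `filterOr` are hypothetical stand-ins for the package's real types, which carry more state.

```go
package main

import (
	"fmt"
	"strings"
)

// filter is a hypothetical, minimal stand-in for the package's filter interface.
type filter interface {
	String() string
}

// filterWord is a hypothetical single-word filter.
type filterWord struct{ word string }

func (f *filterWord) String() string { return f.word }

// filterOr is a simplified stand-in for the real OR filter.
type filterOr struct{ filters []filter }

func (f *filterOr) String() string {
	parts := make([]string, len(f.filters))
	for i, sub := range f.filters {
		parts[i] = sub.String()
	}
	return strings.Join(parts, " or ")
}

// filterNot negates the wrapped filter.
type filterNot struct{ f filter }

// String always emits "!" and adds parentheses around composite sub-filters,
// so that the printed query parses back with the same meaning.
func (fn *filterNot) String() string {
	s := fn.f.String()
	switch fn.f.(type) {
	case *filterOr:
		return "!(" + s + ")"
	default:
		return "!" + s
	}
}

func main() {
	single := &filterNot{f: &filterWord{word: "info"}}
	composite := &filterNot{f: &filterOr{filters: []filter{
		&filterWord{word: "buggy_app"},
		&filterWord{word: "foobar"},
	}}}
	fmt.Println(single.String())    // !info
	fmt.Println(composite.String()) // !(buggy_app or foobar)
}
```

This matches the test expectations later in the commit, where inputs such as `-foo:""` and `- foo` are printed back with the `!` prefix.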

View file

@ -856,7 +856,7 @@ func parseGenericFilter(lex *lexer, fieldName string) (filter, error) {
return parseFilterTilda(lex, fieldName)
case lex.isKeyword("!~"):
return parseFilterNotTilda(lex, fieldName)
case lex.isKeyword("not", "!"):
case lex.isKeyword("not", "!", "-"):
return parseFilterNot(lex, fieldName)
case lex.isKeyword("exact"):
return parseFilterExact(lex, fieldName)
@ -2149,7 +2149,7 @@ func needQuoteToken(s string) bool {
return true
}
for _, r := range s {
if !isTokenRune(r) && r != '.' && r != '-' {
if !isTokenRune(r) && r != '.' {
return true
}
}
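
The `needQuoteToken` change above stops treating `-` as a safe character, so tokens containing `-` are now quoted when a query is printed. The sketch below compares the old and new rune checks in isolation. It is a simplification under stated assumptions: `isTokenRune` is approximated here as "letter, digit or underscore", and the earlier checks in the real function (the part that ends in the first `return true`) are omitted.

```go
package main

import (
	"fmt"
	"unicode"
)

// isTokenRune is an assumed approximation of the package-level helper.
func isTokenRune(r rune) bool {
	return unicode.IsLetter(r) || unicode.IsDigit(r) || r == '_'
}

// needQuoteOld mirrors the rune loop before the change: '.' and '-' are allowed unquoted.
func needQuoteOld(s string) bool {
	for _, r := range s {
		if !isTokenRune(r) && r != '.' && r != '-' {
			return true
		}
	}
	return false
}

// needQuoteNew mirrors the rune loop after the change: only '.' is allowed unquoted,
// since '-' can now start a NOT filter.
func needQuoteNew(s string) bool {
	for _, r := range s {
		if !isTokenRune(r) && r != '.' {
			return true
		}
	}
	return false
}

func main() {
	for _, s := range []string{"log.level", "trace-id.foo.bar"} {
		fmt.Printf("%s old=%v new=%v\n", s, needQuoteOld(s), needQuoteNew(s))
	}
	// log.level old=false new=false
	// trace-id.foo.bar old=false new=true
}
```

This is why the updated tests expect `"trace-id.foo.bar":baz` and `"exact-foo"` in the printed form of queries that previously round-tripped without quotes.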

View file

@ -638,7 +638,9 @@ func TestParseQuerySuccess(t *testing.T) {
f(`"":foo`, "foo")
f(`"" bar`, `"" bar`)
f(`!''`, `!""`)
f(`-''`, `!""`)
f(`foo:""`, `foo:""`)
f(`-foo:""`, `!foo:""`)
f(`!foo:""`, `!foo:""`)
f(`not foo:""`, `!foo:""`)
f(`not(foo)`, `!foo`)
@ -648,6 +650,8 @@ func TestParseQuerySuccess(t *testing.T) {
f("_msg:foo", "foo")
f("'foo:bar'", `"foo:bar"`)
f("'!foo'", `"!foo"`)
f("'-foo'", `"-foo"`)
f(`'{a="b"}'`, `"{a=\"b\"}"`)
f("foo 'and' and bar", `foo "and" bar`)
f("foo bar", "foo bar")
f("foo and bar", "foo bar")
@ -656,10 +660,13 @@ func TestParseQuerySuccess(t *testing.T) {
f("foo OR bar", "foo or bar")
f("not foo", "!foo")
f("! foo", "!foo")
f("- foo", "!foo")
f("not !`foo bar`", `"foo bar"`)
f("not -`foo bar`", `"foo bar"`)
f("foo or bar and not baz", "foo or bar !baz")
f("'foo bar' !baz", `"foo bar" !baz`)
f("foo:!bar", `!foo:bar`)
f("foo:-bar", `!foo:bar`)
f(`foo and bar and baz or x or y or z and zz`, `foo bar baz or x or y or z zz`)
f(`foo and bar and (baz or x or y or z) and zz`, `foo bar (baz or x or y or z) zz`)
f(`(foo or bar or baz) and x and y and (z or zz)`, `(foo or bar or baz) x y (z or zz)`)
@ -778,50 +785,50 @@ func TestParseQuerySuccess(t *testing.T) {
// reserved functions
f("exact", `"exact"`)
f("exact:a", `"exact":a`)
f("exact-foo", `exact-foo`)
f("exact-foo", `"exact-foo"`)
f("a:exact", `a:"exact"`)
f("a:exact-foo", `a:exact-foo`)
f("exact-foo:b", `exact-foo:b`)
f("a:exact-foo", `a:"exact-foo"`)
f("exact-foo:b", `"exact-foo":b`)
f("i", `"i"`)
f("i-foo", `i-foo`)
f("a:i-foo", `a:i-foo`)
f("i-foo:b", `i-foo:b`)
f("i-foo", `"i-foo"`)
f("a:i-foo", `a:"i-foo"`)
f("i-foo:b", `"i-foo":b`)
f("in", `"in"`)
f("in:a", `"in":a`)
f("in-foo", `in-foo`)
f("in-foo", `"in-foo"`)
f("a:in", `a:"in"`)
f("a:in-foo", `a:in-foo`)
f("in-foo:b", `in-foo:b`)
f("a:in-foo", `a:"in-foo"`)
f("in-foo:b", `"in-foo":b`)
f("ipv4_range", `"ipv4_range"`)
f("ipv4_range:a", `"ipv4_range":a`)
f("ipv4_range-foo", `ipv4_range-foo`)
f("ipv4_range-foo", `"ipv4_range-foo"`)
f("a:ipv4_range", `a:"ipv4_range"`)
f("a:ipv4_range-foo", `a:ipv4_range-foo`)
f("ipv4_range-foo:b", `ipv4_range-foo:b`)
f("a:ipv4_range-foo", `a:"ipv4_range-foo"`)
f("ipv4_range-foo:b", `"ipv4_range-foo":b`)
f("len_range", `"len_range"`)
f("len_range:a", `"len_range":a`)
f("len_range-foo", `len_range-foo`)
f("len_range-foo", `"len_range-foo"`)
f("a:len_range", `a:"len_range"`)
f("a:len_range-foo", `a:len_range-foo`)
f("len_range-foo:b", `len_range-foo:b`)
f("a:len_range-foo", `a:"len_range-foo"`)
f("len_range-foo:b", `"len_range-foo":b`)
f("range", `"range"`)
f("range:a", `"range":a`)
f("range-foo", `range-foo`)
f("range-foo", `"range-foo"`)
f("a:range", `a:"range"`)
f("a:range-foo", `a:range-foo`)
f("range-foo:b", `range-foo:b`)
f("a:range-foo", `a:"range-foo"`)
f("range-foo:b", `"range-foo":b`)
f("re", `"re"`)
f("re-bar", `re-bar`)
f("a:re-bar", `a:re-bar`)
f("re-bar:a", `re-bar:a`)
f("re-bar", `"re-bar"`)
f("a:re-bar", `a:"re-bar"`)
f("re-bar:a", `"re-bar":a`)
f("seq", `"seq"`)
f("seq-a", `seq-a`)
f("x:seq-a", `x:seq-a`)
f("seq-a:x", `seq-a:x`)
f("seq-a", `"seq-a"`)
f("x:seq-a", `x:"seq-a"`)
f("seq-a:x", `"seq-a":x`)
f("string_range", `"string_range"`)
f("string_range-a", `string_range-a`)
f("x:string_range-a", `x:string_range-a`)
f("string_range-a:x", `string_range-a:x`)
f("string_range-a", `"string_range-a"`)
f("x:string_range-a", `x:"string_range-a"`)
f("string_range-a:x", `"string_range-a":x`)
// exact filter
f("exact(foo)", `=foo`)
@ -932,11 +939,11 @@ func TestParseQuerySuccess(t *testing.T) {
f("1.2.3.4 or ip:5.6.7.9", "1.2.3.4 or ip:5.6.7.9")
// '-' and '.' chars in field name and search phrase
f("trace-id.foo.bar:baz", `trace-id.foo.bar:baz`)
f(`custom-Time:2024-01-02T03:04:05+08:00 fooBar OR !baz:xxx`, `custom-Time:"2024-01-02T03:04:05+08:00" fooBar or !baz:xxx`)
f("trace-id.foo.bar:baz", `"trace-id.foo.bar":baz`)
f(`custom-Time:2024-01-02T03:04:05+08:00 fooBar OR !baz:xxx`, `"custom-Time":"2024-01-02T03:04:05+08:00" fooBar or !baz:xxx`)
f("foo-bar+baz*", `"foo-bar+baz"*`)
f("foo- bar", `foo- bar`)
f("foo -bar", `foo -bar`)
f("foo- bar", `"foo-" bar`)
f("foo -bar", `foo !bar`)
f("foo!bar", `foo !bar`)
f("foo:aa!bb:cc", `foo:aa !bb:cc`)
f(`foo:bar:baz`, `foo:"bar:baz"`)