wip

2025-01-20 15:16:42 +00:00 · 2024-05-22 17:17:59 +02:00 · 2024-05-22 17:17:59 +02:00 · 79787ce25a
commit 79787ce25a
parent 93a645dcfc
7 changed files with 251 additions and 180 deletions
--- a/docs/VictoriaLogs/CHANGELOG.md
+++ b/docs/VictoriaLogs/CHANGELOG.md
@@ -19,6 +19,7 @@ according to [these docs](https://docs.victoriametrics.com/VictoriaLogs/QuickSta

 ## tip

+* FEATURE: add ability to generate output fields according to the provided format string. See [these docs](https://docs.victoriametrics.com/victorialogs/logsql/#format-pipe).
 * FEATURE: add ability to extract fields with [`extract` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#extract-pipe) only if the given condition is met. See [these docs](https://docs.victoriametrics.com/victorialogs/logsql/#conditional-extract).
 * FEATURE: add ability to unpack JSON fields with [`unpack_json` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#unpack_json-pipe) only if the given condition is met. See [these docs](https://docs.victoriametrics.com/victorialogs/logsql/#conditional-unpack_json).
 * FEATURE: add ability to unpack [logfmt](https://brandur.org/logfmt) fields with [`unpack_logfmt` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#unpack_logfmt-pipe) only if the given condition is met. See [these docs](https://docs.victoriametrics.com/victorialogs/logsql/#conditional-unpack_logfmt).
--- a/docs/VictoriaLogs/LogsQL.md
+++ b/docs/VictoriaLogs/LogsQL.md
@@ -1056,6 +1056,7 @@ LogsQL supports the following pipes:
 - [`field_names`](#field_names-pipe) returns all the names of [log fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model).
 - [`fields`](#fields-pipe) selects the given set of [log fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model).
 - [`filter`](#filter-pipe) applies additional [filters](#filters) to results.
+- [`format`](#format-pipe) formats the output field from input [log fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model).
 - [`limit`](#limit-pipe) limits the number selected logs.
 - [`offset`](#offset-pipe) skips the given number of selected logs.
 - [`rename`](#rename-pipe) renames [log fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model).
@@ -1110,21 +1111,21 @@ See also:

 ### extract pipe

-`| extract from field_name "pattern"` [pipe](#pipes) allows extracting additional fields specified in the `pattern` from the given
-`field_name` [log field](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model). Existing log fields remain unchanged
-after the `| extract ...` pipe.
+`| extract "pattern" from field_name` [pipe](#pipes) allows extracting arbitrary text into output fields according to the [`pattern`](#format-for-extract-pipe-pattern) from the given
+[`field_name`](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model). Existing log fields remain unchanged after the `| extract ...` pipe.

-`| extract ...` pipe can be useful for extracting additional fields needed for further data processing with other pipes such as [`stats` pipe](#stats-pipe) or [`sort` pipe](#sort-pipe).
+`| extract ...` can be useful for extracting additional fields needed for further data processing with other pipes such as [`stats` pipe](#stats-pipe) or [`sort` pipe](#sort-pipe).

 For example, the following query selects logs with the `error` [word](#word) for the last day,
 extracts ip address from [`_msg` field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#message-field) into `ip` field and then calculates top 10 ip addresses
 with the biggest number of logs:

 ```logsql
-_time:1d error | extract from _msg "ip=<ip> " | stats by (ip) count() logs | sort by (logs) desc limit 10
+_time:1d error | extract "ip=<ip> " from _msg | stats by (ip) count() logs | sort by (logs) desc limit 10
 ```

-It is expected that `_msg` field contains `ip=...` substring, which ends with space. For example, `error ip=1.2.3.4 from user_id=42`.
+It is expected that `_msg` field contains `ip=...` substring ending with space. For example, `error ip=1.2.3.4 from user_id=42`.
+If there is no such substring in the current `_msg` field, then the `ip` output field will be empty.

 If the `| extract ...` pipe is applied to [`_msg` field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#message-field), then the `from _msg` part can be omitted.
 For example, the following query is equivalent to the previous one:
@@ -1133,6 +1134,12 @@ For example, the following query is equivalent to the previous one:
 _time:1d error | extract "ip=<ip> " | stats by (ip) count() logs | sort by (logs) desc limit 10
 ```

+If the `pattern` contains double quotes, then it can be enclosed in single quotes. For example, the following query extracts `ip` from the corresponding JSON field:
+
+```logsql
+_time:5m | extract '"ip":"<ip>"'
+```
+
 See also:

 - [Format for extract pipe pattern](#format-for-extract-pipe-pattern)
@@ -1140,23 +1147,27 @@ See also:
 - [`unpack_json` pipe](#unpack_json-pipe)
 - [`unpack_logfmt` pipe](#unpack_logfmt-pipe)

-#### Conditional extract
-
-If some log entries must be skipped from [`extract` pipe](#extract-pipe), then add `if (<filters>)` filter to the end of `| extract ...` pipe.
-The `<filters>` can contain arbitrary [filters](#filters). For example, the following query extracts `ip` field only
-if the input [log entry](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model) doesn't contain `ip` field or this field is empty:
-
-```logsql
-_time:5m | extract "ip=<ip> " if (ip:"")
-```
-
 #### Format for extract pipe pattern

-The `pattern` part from [`| extract from src_field "pattern"` pipe](#extract-pipes) may contain arbitrary text, which matches as is to the `src_field` value.
-Additionally to arbitrary text, the `pattern` may contain placeholders in the form `<...>`, which match any strings, including empty strings.
-Placeholders may be named, such as `<ip>`, or anonymous, such as `<_>`. Named placeholders extract the matching text into
-the corresponding [log field](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model).
-Anonymous placeholders are useful for skipping arbitrary text during pattern matching.
+The `pattern` part of the [`extract` pipe](#extract-pipe) has the following format:
+
+```
+text1<field1>text2<field2>...textN<fieldN>textN+1
+```
+
+Here `text1`, ..., `textN+1` are arbitrary non-empty text fragments, which are matched as is against the input text.
+
+The `field1`, ..., `fieldN` are placeholders, which match a substring of any length (including zero length) in the input text until the next `textX`.
+Placeholders can be anonymous or named. Anonymous placeholders are written as `<_>`. They are used for convenience when some input text
+must be skipped until the next `textX`. Named placeholders are written as `<some_name>`, where `some_name` is the name of the log field
+that stores the corresponding matching substring.
+
+The matching starts from the first occurrence of `text1` in the input text. If the `pattern` starts with `<field1>` and doesn't contain `text1`,
+then the matching starts from the beginning of the input text. Matching is performed sequentially according to the `pattern`. If some `textX` isn't found
+in the remaining input text, then the remaining named placeholders receive empty string values and the matching finishes prematurely.
+
+Matching finishes successfully when `textN+1` is found in the input text.
+If the `pattern` ends with `<fieldN>` and doesn't contain `textN+1`, then the `<fieldN>` matches the remaining input text.

 For example, if [`_msg` field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#message-field) contains the following text:

@@ -1164,34 +1175,44 @@ For example, if [`_msg` field](https://docs.victoriametrics.com/victorialogs/key
 1.2.3.4 GET /foo/bar?baz 404 "Mozilla  foo bar baz" some tail here
 ```

-Then the following `| extract ...` [pipe](#pipes) can be used for extracting `ip`, `path` and `user_agent` fields from it:
+Then the following `pattern` can be used for extracting `ip`, `path` and `user_agent` fields from it:

 ```
-| extract '<ip> <_> <path> <_> "<user_agent>"'
+<ip> <_> <path> <_> "<user_agent>"
 ```

 Note that the user-agent part of the log message is in double quotes. This means that it may contain special chars, including escaped double quote, e.g. `\"`.
 This may break proper matching of the string in double quotes.

-VictoriaLogs automatically detects the whole string in quotes and automatically decodes it if the first char in the placeholder is double quote or backtick.
-So it is better to use the following `pattern` for proper matching of quoted strings:
+VictoriaLogs automatically detects quoted strings and automatically unquotes them if the first matching char in the placeholder is double quote or backtick.
+So it is better to use the following `pattern` for proper matching of quoted `user_agent` string:

 ```
-| extract "<ip> <_> <path> <_> <user_agent>"
+<ip> <_> <path> <_> <user_agent>
 ```

-Note that the `user_agent` now matches double quotes, but VictoriaLogs automatically unquotes the matching string before storing it in the `user_agent` field.
-This is useful for extracting JSON strings. For example, the following `pattern` properly extracts the `message` JSON string into `msg` field:
+This is useful for extracting JSON strings. For example, the following `pattern` properly extracts the `message` JSON string into `msg` field, even if it contains special chars:

 ```
-| extract '"message":<msg>'
+"message":<msg>
 ```

 If some special chars such as `<` must be matched by the `pattern`, then they can be [html-escaped](https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references).
-For example, the following `pattern` properly matches `a < 123.456` text:
+For example, the following `pattern` properly matches `a < b` text by extracting `a` into `left` field and `b` into `right` field:

 ```
-| extract "<left> &lt; <right>"
+<left> &lt; <right>
+```
+
+#### Conditional extract
+
+If some log entries must be skipped by the [`extract` pipe](#extract-pipe), then add an `if (<filters>)` filter after the `extract` word.
+The `<filters>` can contain arbitrary [filters](#filters). For example, the following query extracts `ip` field
+from [`_msg` field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model) only
+if the input [log entry](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model) doesn't contain `ip` field or this field is empty:
+
+```logsql
+_time:5m | extract if (ip:"") "ip=<ip> "
 ```

 ### field_names pipe
@@ -1249,6 +1270,49 @@ See also:
 - [`stats` pipe](#stats-pipe)
 - [`sort` pipe](#sort-pipe)

+### format pipe
+
+`| format "pattern" as result_field` [pipe](#format-pipe) combines [log fields](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model)
+according to the `pattern` and stores the result in the `result_field`.
+
+For example, the following query stores `request from <ip>:<port>` text into [`_msg` field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#message-field),
+substituting `<ip>` and `<port>` with the values of the corresponding [log fields](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model):
+
+```logsql
+_time:5m | format "request from <ip>:<port>" as _msg
+```
+
+If the result of the `format` pattern is stored into [`_msg` field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#message-field),
+then `as _msg` part can be omitted. The following query is equivalent to the previous one:
+
+```logsql
+_time:5m | format "request from <ip>:<port>"
+```
+
+If some field values must be put into double quotes before formatting, then add `:q` after the corresponding field name.
+For example, the following query generates a properly encoded JSON object from `_msg` and `stacktrace` [log fields](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model)
+and stores it into `my_json` output field:
+
+```logsql
+_time:5m | format '{"_msg":<_msg:q>,"stacktrace":<stacktrace:q>}' as my_json
+```
+
+See also:
+
+- [Conditional format](#conditional-format)
+- [`extract` pipe](#extract-pipe)
+
+#### Conditional format
+
+If the [`format` pipe](#format-pipe) mustn't be applied to every [log entry](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model),
+then add `if (<filters>)` just after the `format` word.
+The `<filters>` can contain arbitrary [filters](#filters). For example, the following query stores the formatted result to `message` field
+only if `ip` and `host` [fields](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model) aren't empty:
+
+```logsql
+_time:5m | format if (ip:* and host:*) "request from <ip>:<host>" as message
+```
+
 ### limit pipe

 If only a subset of selected logs must be processed, then `| limit N` [pipe](#pipes) can be used, where `N` can contain any [supported integer numeric value](#numeric-values).
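The matching rules documented for the `extract` pattern in the LogsQL.md changes above can be sketched in Go roughly as follows. This is a simplified illustration, not the VictoriaLogs implementation: the names `step`, `parsePattern` and `extract` are hypothetical, and the sketch deliberately skips the automatic unquoting of quoted strings and the html-unescaping of pattern text.

```go
package main

import (
	"fmt"
	"strings"
)

// step is one "text<field>" unit of a pattern (hypothetical type).
type step struct {
	prefix string // literal text that must precede the placeholder
	field  string // placeholder name; "_" means anonymous
}

// parsePattern splits "text1<f1>text2<f2>...textN+1" into steps plus the trailing text.
func parsePattern(p string) (steps []step, tail string, err error) {
	for {
		open := strings.IndexByte(p, '<')
		if open < 0 {
			break
		}
		end := strings.IndexByte(p[open:], '>')
		if end < 0 {
			return nil, "", fmt.Errorf("missing '>' in %q", p)
		}
		if open == 0 && len(steps) > 0 {
			return nil, "", fmt.Errorf("adjacent placeholders need text between them")
		}
		steps = append(steps, step{prefix: p[:open], field: p[open+1 : open+end]})
		p = p[open+end+1:]
	}
	if len(steps) == 0 {
		return nil, "", fmt.Errorf("pattern must contain at least one placeholder")
	}
	return steps, p, nil
}

// extract applies the steps to s. Every named placeholder gets a key in the
// result; placeholders that could not be matched keep an empty value.
func extract(steps []step, tail, s string) map[string]string {
	out := make(map[string]string)
	for _, st := range steps {
		if st.field != "_" {
			out[st.field] = ""
		}
	}
	// Matching starts at the first occurrence of text1 (or at offset 0 if text1 is empty).
	n := strings.Index(s, steps[0].prefix)
	if n < 0 {
		return out
	}
	s = s[n+len(steps[0].prefix):]
	for i, st := range steps {
		// The placeholder ends at the next literal text, or at the trailing text.
		next := tail
		if i+1 < len(steps) {
			next = steps[i+1].prefix
		}
		value := s // a final placeholder without trailing text takes the rest
		if next != "" {
			n := strings.Index(s, next)
			if n < 0 {
				return out // remaining placeholders stay empty
			}
			value, s = s[:n], s[n+len(next):]
		}
		if st.field != "_" {
			out[st.field] = value
		}
	}
	return out
}

func main() {
	steps, tail, err := parsePattern(`<ip> <_> <path> <_> "<user_agent>"`)
	if err != nil {
		panic(err)
	}
	fmt.Println(extract(steps, tail, `1.2.3.4 GET /foo/bar?baz 404 "Mozilla  foo bar baz" some tail here`))
}
```

The sketch keeps the values that were matched before a missing `textX`, which is how the documentation's "remaining named placeholders receive empty string values" rule is read here.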
--- a/lib/logstorage/parser_test.go
+++ b/lib/logstorage/parser_test.go
@@ -1001,11 +1001,11 @@ func TestParseQuerySuccess(t *testing.T) {

 	// extract pipe
 	f(`* | extract "foo<bar>baz"`, `* | extract "foo<bar>baz"`)
-	f(`* | extract from _msg "foo<bar>baz"`, `* | extract "foo<bar>baz"`)
-	f(`* | extract from '' 'foo<bar>baz'`, `* | extract "foo<bar>baz"`)
-	f("* | extract from x `foo<bar>baz`", `* | extract from x "foo<bar>baz"`)
-	f("* | extract from x foo<bar>baz", `* | extract from x "foo<bar>baz"`)
-	f("* | extract from x foo<bar>baz if (a:b)", `* | extract from x "foo<bar>baz" if (a:b)`)
+	f(`* | extract "foo<bar>baz" from _msg`, `* | extract "foo<bar>baz"`)
+	f(`* | extract 'foo<bar>baz' from ''`, `* | extract "foo<bar>baz"`)
+	f("* | extract `foo<bar>baz` from x", `* | extract "foo<bar>baz" from x`)
+	f("* | extract foo<bar>baz from x", `* | extract "foo<bar>baz" from x`)
+	f("* | extract if (a:b) foo<bar>baz from x", `* | extract if (a:b) "foo<bar>baz" from x`)

 	// unpack_json pipe
 	f(`* | unpack_json`, `* | unpack_json`)
@@ -1625,10 +1625,10 @@ func TestQueryGetNeededColumns(t *testing.T) {
 	f(`* | format "foo<f1>" as s1`, `*`, `s1`)
 	f(`* | format "foo<s1>" as s1`, `*`, ``)

-	f(`* | format "foo" if (x1:y) as s1`, `*`, `s1`)
-	f(`* | format "foo<f1>" if (x1:y) as s1`, `*`, `s1`)
-	f(`* | format "foo<f1>" if (s1:y) as s1`, `*`, ``)
-	f(`* | format "foo<s1>" if (x1:y) as s1`, `*`, ``)
+	f(`* | format if (x1:y) "foo" as s1`, `*`, `s1`)
+	f(`* | format if (x1:y) "foo<f1>" as s1`, `*`, `s1`)
+	f(`* | format if (s1:y) "foo<f1>" as s1`, `*`, ``)
+	f(`* | format if (x1:y) "foo<s1>" as s1`, `*`, ``)

 	f(`* | format "foo" as s1 | fields f1`, `f1`, ``)
 	f(`* | format "foo" as s1 | fields s1`, ``, ``)
@@ -1638,8 +1638,8 @@ func TestQueryGetNeededColumns(t *testing.T) {
 	f(`* | format "foo<s1>" as s1 | fields f1`, `f1`, ``)
 	f(`* | format "foo<s1>" as s1 | fields s1`, `s1`, ``)

-	f(`* | format "foo" if (f1:x) as s1 | fields s1`, `f1`, ``)
-	f(`* | format "foo" if (f1:x) as s1 | fields s2`, `s2`, ``)
+	f(`* | format if (f1:x) "foo" as s1 | fields s1`, `f1`, ``)
+	f(`* | format if (f1:x) "foo" as s1 | fields s2`, `s2`, ``)

 	f(`* | format "foo" as s1 | rm f1`, `*`, `f1,s1`)
 	f(`* | format "foo" as s1 | rm s1`, `*`, `s1`)
@@ -1649,52 +1649,52 @@ func TestQueryGetNeededColumns(t *testing.T) {
 	f(`* | format "foo<s1>" as s1 | rm f1`, `*`, `f1`)
 	f(`* | format "foo<s1>" as s1 | rm s1`, `*`, `s1`)

-	f(`* | format "foo" if (f1:x) as s1 | rm s1`, `*`, `s1`)
-	f(`* | format "foo" if (f1:x) as s1 | rm f1`, `*`, `s1`)
-	f(`* | format "foo" if (f1:x) as s1 | rm f2`, `*`, `f2,s1`)
+	f(`* | format if (f1:x) "foo" as s1 | rm s1`, `*`, `s1`)
+	f(`* | format if (f1:x) "foo" as s1 | rm f1`, `*`, `s1`)
+	f(`* | format if (f1:x) "foo" as s1 | rm f2`, `*`, `f2,s1`)

-	f(`* | extract from s1 "<f1>x<f2>"`, `*`, `f1,f2`)
-	f(`* | extract from s1 "<f1>x<f2>" if (f3:foo)`, `*`, `f1,f2`)
-	f(`* | extract from s1 "<f1>x<f2>" if (f1:foo)`, `*`, `f2`)
-	f(`* | extract from s1 "<f1>x<f2>" | fields foo`, `foo`, ``)
-	f(`* | extract from s1 "<f1>x<f2>" if (x:bar) | fields foo`, `foo`, ``)
-	f(`* | extract from s1 "<f1>x<f2>" | fields foo,s1`, `foo,s1`, ``)
-	f(`* | extract from s1 "<f1>x<f2>" if (x:bar) | fields foo,s1`, `foo,s1`, ``)
-	f(`* | extract from s1 "<f1>x<f2>" | fields foo,f1`, `foo,s1`, ``)
-	f(`* | extract from s1 "<f1>x<f2>" if (x:bar) | fields foo,f1`, `foo,s1,x`, ``)
-	f(`* | extract from s1 "<f1>x<f2>" | fields foo,f1,f2`, `foo,s1`, ``)
-	f(`* | extract from s1 "<f1>x<f2>" if (x:bar) | fields foo,f1,f2`, `foo,s1,x`, ``)
-	f(`* | extract from s1 "<f1>x<f2>" | rm foo`, `*`, `f1,f2,foo`)
-	f(`* | extract from s1 "<f1>x<f2>" if (x:bar) | rm foo`, `*`, `f1,f2,foo`)
-	f(`* | extract from s1 "<f1>x<f2>" | rm foo,s1`, `*`, `f1,f2,foo`)
-	f(`* | extract from s1 "<f1>x<f2>" if (x:bar) | rm foo,s1`, `*`, `f1,f2,foo`)
-	f(`* | extract from s1 "<f1>x<f2>" | rm foo,f1`, `*`, `f1,f2,foo`)
-	f(`* | extract from s1 "<f1>x<f2>" if (x:bar) | rm foo,f1`, `*`, `f1,f2,foo`)
-	f(`* | extract from s1 "<f1>x<f2>" | rm foo,f1,f2`, `*`, `f1,f2,foo,s1`)
-	f(`* | extract from s1 "<f1>x<f2>" if (x:bar) | rm foo,f1,f2`, `*`, `f1,f2,foo,s1`)
+	f(`* | extract "<f1>x<f2>" from s1`, `*`, `f1,f2`)
+	f(`* | extract if (f3:foo) "<f1>x<f2>" from s1`, `*`, `f1,f2`)
+	f(`* | extract if (f1:foo) "<f1>x<f2>" from s1`, `*`, `f2`)
+	f(`* | extract "<f1>x<f2>" from s1 | fields foo`, `foo`, ``)
+	f(`* | extract if (x:bar) "<f1>x<f2>" from s1 | fields foo`, `foo`, ``)
+	f(`* | extract "<f1>x<f2>" from s1| fields foo,s1`, `foo,s1`, ``)
+	f(`* | extract if (x:bar) "<f1>x<f2>" from s1 | fields foo,s1`, `foo,s1`, ``)
+	f(`* | extract "<f1>x<f2>" from s1 | fields foo,f1`, `foo,s1`, ``)
+	f(`* | extract if (x:bar) "<f1>x<f2>" from s1 | fields foo,f1`, `foo,s1,x`, ``)
+	f(`* | extract "<f1>x<f2>" from s1 | fields foo,f1,f2`, `foo,s1`, ``)
+	f(`* | extract if (x:bar) "<f1>x<f2>" from s1 | fields foo,f1,f2`, `foo,s1,x`, ``)
+	f(`* | extract "<f1>x<f2>" from s1 | rm foo`, `*`, `f1,f2,foo`)
+	f(`* | extract if (x:bar) "<f1>x<f2>" from s1 | rm foo`, `*`, `f1,f2,foo`)
+	f(`* | extract "<f1>x<f2>" from s1 | rm foo,s1`, `*`, `f1,f2,foo`)
+	f(`* | extract if (x:bar) "<f1>x<f2>" from s1 | rm foo,s1`, `*`, `f1,f2,foo`)
+	f(`* | extract "<f1>x<f2>" from s1 | rm foo,f1`, `*`, `f1,f2,foo`)
+	f(`* | extract if (x:bar) "<f1>x<f2>" from s1 | rm foo,f1`, `*`, `f1,f2,foo`)
+	f(`* | extract "<f1>x<f2>" from s1 | rm foo,f1,f2`, `*`, `f1,f2,foo,s1`)
+	f(`* | extract if (x:bar) "<f1>x<f2>" from s1 | rm foo,f1,f2`, `*`, `f1,f2,foo,s1`)

-	f(`* | extract from s1 "x<s1>y"`, `*`, ``)
-	f(`* | extract from s1 "x<s1>y" if (x:foo)`, `*`, ``)
-	f(`* | extract from s1 "x<s1>y" if (s1:foo)`, `*`, ``)
-	f(`* | extract from s1 "x<f1>y" if (s1:foo)`, `*`, `f1`)
+	f(`* | extract "x<s1>y" from s1 `, `*`, ``)
+	f(`* | extract if (x:foo) "x<s1>y" from s1`, `*`, ``)
+	f(`* | extract if (s1:foo) "x<s1>y" from s1`, `*`, ``)
+	f(`* | extract if (s1:foo) "x<f1>y" from s1`, `*`, `f1`)

-	f(`* | extract from s1 "x<s1>y" | fields s2`, `s2`, ``)
-	f(`* | extract from s1 "x<s1>y" | fields s1`, `s1`, ``)
-	f(`* | extract from s1 "x<s1>y" if (x:foo) | fields s1`, `s1,x`, ``)
-	f(`* | extract from s1 "x<s1>y" if (x:foo) | fields s2`, `s2`, ``)
-	f(`* | extract from s1 "x<s1>y" if (s1:foo) | fields s1`, `s1`, ``)
-	f(`* | extract from s1 "x<s1>y" if (s1:foo) | fields s2`, `s2`, ``)
-	f(`* | extract from s1 "x<f1>y" if (s1:foo) | fields s1`, `s1`, ``)
-	f(`* | extract from s1 "x<f1>y" if (s1:foo) | fields s2`, `s2`, ``)
+	f(`* | extract "x<s1>y" from s1 | fields s2`, `s2`, ``)
+	f(`* | extract "x<s1>y" from s1 | fields s1`, `s1`, ``)
+	f(`* | extract if (x:foo) "x<s1>y" from s1 | fields s1`, `s1,x`, ``)
+	f(`* | extract if (x:foo) "x<s1>y" from s1 | fields s2`, `s2`, ``)
+	f(`* | extract if (s1:foo) "x<s1>y" from s1 | fields s1`, `s1`, ``)
+	f(`* | extract if (s1:foo) "x<s1>y" from s1 | fields s2`, `s2`, ``)
+	f(`* | extract if (s1:foo) "x<f1>y" from s1 | fields s1`, `s1`, ``)
+	f(`* | extract if (s1:foo) "x<f1>y" from s1 | fields s2`, `s2`, ``)

-	f(`* | extract from s1 "x<s1>y" | rm s2`, `*`, `s2`)
-	f(`* | extract from s1 "x<s1>y" | rm s1`, `*`, `s1`)
-	f(`* | extract from s1 "x<s1>y" if (x:foo) | rm s1`, `*`, `s1`)
-	f(`* | extract from s1 "x<s1>y" if (x:foo) | rm s2`, `*`, `s2`)
-	f(`* | extract from s1 "x<s1>y" if (s1:foo) | rm s1`, `*`, `s1`)
-	f(`* | extract from s1 "x<s1>y" if (s1:foo) | rm s2`, `*`, `s2`)
-	f(`* | extract from s1 "x<f1>y" if (s1:foo) | rm s1`, `*`, `f1`)
-	f(`* | extract from s1 "x<f1>y" if (s1:foo) | rm s2`, `*`, `f1,s2`)
+	f(`* | extract "x<s1>y" from s1 | rm s2`, `*`, `s2`)
+	f(`* | extract "x<s1>y" from s1 | rm s1`, `*`, `s1`)
+	f(`* | extract if (x:foo) "x<s1>y" from s1 | rm s1`, `*`, `s1`)
+	f(`* | extract if (x:foo) "x<s1>y" from s1 | rm s2`, `*`, `s2`)
+	f(`* | extract if (s1:foo) "x<s1>y" from s1 | rm s1`, `*`, `s1`)
+	f(`* | extract if (s1:foo) "x<s1>y" from s1 | rm s2`, `*`, `s2`)
+	f(`* | extract if (s1:foo) "x<f1>y" from s1 | rm s1`, `*`, `f1`)
+	f(`* | extract if (s1:foo) "x<f1>y" from s1 | rm s2`, `*`, `f1,s2`)

 	f(`* | unpack_json`, `*`, ``)
 	f(`* | unpack_json from s1`, `*`, ``)
--- a/lib/logstorage/pipe_extract.go
+++ b/lib/logstorage/pipe_extract.go
@@ -4,7 +4,7 @@ import (
 	"fmt"
 )

-// pipeExtract processes '| extract from <field> <pattern>' pipe.
+// pipeExtract processes '| extract ...' pipe.
 //
 // See https://docs.victoriametrics.com/victorialogs/logsql/#extract-pipe
 type pipeExtract struct {
@@ -19,13 +19,13 @@ type pipeExtract struct {

 func (pe *pipeExtract) String() string {
 	s := "extract"
-	if !isMsgFieldName(pe.fromField) {
-		s += " from " + quoteTokenIfNeeded(pe.fromField)
-	}
-	s += " " + quoteTokenIfNeeded(pe.patternStr)
 	if pe.iff != nil {
 		s += " " + pe.iff.String()
 	}
+	s += " " + quoteTokenIfNeeded(pe.patternStr)
+	if !isMsgFieldName(pe.fromField) {
+		s += " from " + quoteTokenIfNeeded(pe.fromField)
+	}
 	return s
 }

@@ -90,14 +90,14 @@ func parsePipeExtract(lex *lexer) (*pipeExtract, error) {
 	}
 	lex.nextToken()

-	fromField := "_msg"
-	if lex.isKeyword("from") {
-		lex.nextToken()
-		f, err := parseFieldName(lex)
+	// parse optional if (...)
+	var iff *ifFilter
+	if lex.isKeyword("if") {
+		f, err := parseIfFilter(lex)
 		if err != nil {
-			return nil, fmt.Errorf("cannot parse 'from' field name: %w", err)
+			return nil, err
 		}
-		fromField = f
+		iff = f
 	}

 	// parse pattern
@@ -110,19 +110,22 @@ func parsePipeExtract(lex *lexer) (*pipeExtract, error) {
 		return nil, fmt.Errorf("cannot parse 'pattern' %q: %w", patternStr, err)
 	}

+	// parse optional 'from ...' part
+	fromField := "_msg"
+	if lex.isKeyword("from") {
+		lex.nextToken()
+		f, err := parseFieldName(lex)
+		if err != nil {
+			return nil, fmt.Errorf("cannot parse 'from' field name: %w", err)
+		}
+		fromField = f
+	}
+
 	pe := &pipeExtract{
 		fromField:  fromField,
 		ptn:        ptn,
 		patternStr: patternStr,
-	}
-
-	// parse optional if (...)
-	if lex.isKeyword("if") {
-		iff, err := parseIfFilter(lex)
-		if err != nil {
-			return nil, err
-		}
-		pe.iff = iff
+		iff:        iff,
 	}

 	return pe, nil
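The clause order that this change settles on for the `extract` pipe (`extract`, optional `if (...)`, pattern, optional `from field`) can be illustrated with a small standalone sketch. The `extractPipe` type is hypothetical, and `strconv.Quote` is used only as a stand-in for the quoting helper in the real code, which quotes tokens only when needed.

```go
package main

import (
	"fmt"
	"strconv"
)

// extractPipe is a hypothetical, simplified stand-in for pipeExtract.
type extractPipe struct {
	ifFilter  string // filter text without the surrounding "if (...)"; empty means unconditional
	pattern   string
	fromField string // "" or "_msg" means the default message field
}

// String renders the pipe in the order: extract [if (...)] "pattern" [from field].
func (p extractPipe) String() string {
	s := "extract"
	if p.ifFilter != "" {
		s += " if (" + p.ifFilter + ")"
	}
	s += " " + strconv.Quote(p.pattern)
	// The 'from' part is omitted for the default message field.
	if p.fromField != "" && p.fromField != "_msg" {
		s += " from " + p.fromField
	}
	return s
}

func main() {
	fmt.Println(extractPipe{ifFilter: "a:b", pattern: "foo<bar>baz", fromField: "x"})
	fmt.Println(extractPipe{pattern: "foo<bar>baz", fromField: "_msg"})
}
```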
--- a/lib/logstorage/pipe_extract_test.go
+++ b/lib/logstorage/pipe_extract_test.go
@@ -11,8 +11,8 @@ func TestParsePipeExtractSuccess(t *testing.T) {
 	}

 	f(`extract "foo<bar>"`)
-	f(`extract from x "foo<bar>"`)
-	f(`extract from x "foo<bar>" if (y:in(a:foo bar | uniq by (qwe) limit 10))`)
+	f(`extract "foo<bar>" from x`)
+	f(`extract if (x:y) "foo<bar>" from baz`)
 }

 func TestParsePipeExtractFailure(t *testing.T) {
@@ -23,11 +23,10 @@ func TestParsePipeExtractFailure(t *testing.T) {

 	f(`extract`)
 	f(`extract from`)
+	f(`extract from x`)
+	f(`extract from x "y<foo>"`)
 	f(`extract if (x:y)`)
-	f(`extract if (x:y) "a<b>"`)
-	f(`extract "a<b>" if`)
-	f(`extract "a<b>" if (foo`)
-	f(`extract "a<b>" if "foo"`)
+	f(`extract "a<b>" if (x:y)`)
 	f(`extract "a"`)
 	f(`extract "<a><b>"`)
 	f(`extract "<*>foo<_>bar"`)
@@ -64,7 +63,7 @@ func TestPipeExtract(t *testing.T) {
 	})

 	// single row, extract from non-existing field
-	f(`extract from x "foo=<bar>"`, [][]Field{
+	f(`extract "foo=<bar>" from x`, [][]Field{
 		{
 			{"_msg", `foo=bar`},
 		},
@@ -76,7 +75,7 @@ func TestPipeExtract(t *testing.T) {
 	})

 	// single row, pattern mismatch
-	f(`extract from x "foo=<bar>"`, [][]Field{
+	f(`extract "foo=<bar>" from x`, [][]Field{
 		{
 			{"x", `foobar`},
 		},
@@ -88,7 +87,7 @@ func TestPipeExtract(t *testing.T) {
 	})

 	// single row, partial partern match
-	f(`extract from x "foo=<bar> baz=<xx>"`, [][]Field{
+	f(`extract "foo=<bar> baz=<xx>" from x`, [][]Field{
 		{
 			{"x", `a foo="a\"b\\c" cde baz=aa`},
 		},
@@ -101,7 +100,7 @@ func TestPipeExtract(t *testing.T) {
 	})

 	// single row, overwirte existing column
-	f(`extract from x "foo=<bar> baz=<xx>"`, [][]Field{
+	f(`extract "foo=<bar> baz=<xx>" from x`, [][]Field{
 		{
 			{"x", `a foo=cc baz=aa b`},
 			{"bar", "abc"},
@@ -115,7 +114,7 @@ func TestPipeExtract(t *testing.T) {
 	})

 	// single row, if match
-	f(`extract from x "foo=<bar> baz=<xx>" if (x:baz)`, [][]Field{
+	f(`extract if (x:baz) "foo=<bar> baz=<xx>" from x`, [][]Field{
 		{
 			{"x", `a foo=cc baz=aa b`},
 			{"bar", "abc"},
@@ -129,7 +128,7 @@ func TestPipeExtract(t *testing.T) {
 	})

 	// single row, if mismatch
-	f(`extract from x "foo=<bar> baz=<xx>" if (bar:"")`, [][]Field{
+	f(`extract if (bar:"") "foo=<bar> baz=<xx>" from x`, [][]Field{
 		{
 			{"x", `a foo=cc baz=aa b`},
 			{"bar", "abc"},
@@ -142,7 +141,7 @@ func TestPipeExtract(t *testing.T) {
 	})

 	// multiple rows with distinct set of labels
-	f(`extract "ip=<ip> " if (!ip:keep)`, [][]Field{
+	f(`extract if (!ip:keep) "ip=<ip> "`, [][]Field{
 		{
 			{"foo", "bar"},
 			{"_msg", "request from ip=1.2.3.4 xxx"},
@@ -201,44 +200,44 @@ func TestPipeExtractUpdateNeededFields(t *testing.T) {
 	}

 	// all the needed fields
-	f("extract from x '<foo>'", "*", "", "*", "foo")
-	f("extract from x '<foo>' if (foo:bar)", "*", "", "*", "")
+	f("extract '<foo>' from x", "*", "", "*", "foo")
+	f("extract if (foo:bar) '<foo>' from x", "*", "", "*", "")

 	// unneeded fields do not intersect with pattern and output fields
-	f("extract from x '<foo>'", "*", "f1,f2", "*", "f1,f2,foo")
-	f("extract from x '<foo>' if (f1:x)", "*", "f1,f2", "*", "f2,foo")
-	f("extract from x '<foo>' if (foo:bar f1:x)", "*", "f1,f2", "*", "f2")
+	f("extract '<foo>' from x", "*", "f1,f2", "*", "f1,f2,foo")
+	f("extract if (f1:x) '<foo>' from x", "*", "f1,f2", "*", "f2,foo")
+	f("extract if (foo:bar f1:x) '<foo>' from x", "*", "f1,f2", "*", "f2")

 	// unneeded fields intersect with pattern
-	f("extract from x '<foo>'", "*", "f2,x", "*", "f2,foo")
-	f("extract from x '<foo>' if (f1:abc)", "*", "f2,x", "*", "f2,foo")
-	f("extract from x '<foo>' if (f2:abc)", "*", "f2,x", "*", "foo")
+	f("extract '<foo>' from x", "*", "f2,x", "*", "f2,foo")
+	f("extract if (f1:abc) '<foo>' from x", "*", "f2,x", "*", "f2,foo")
+	f("extract if (f2:abc) '<foo>' from x", "*", "f2,x", "*", "foo")

 	// unneeded fields intersect with output fields
-	f("extract from x '<foo>x<bar>'", "*", "f2,foo", "*", "bar,f2,foo")
-	f("extract from x '<foo>x<bar>' if (f1:abc)", "*", "f2,foo", "*", "bar,f2,foo")
-	f("extract from x '<foo>x<bar>' if (f2:abc foo:w)", "*", "f2,foo", "*", "bar")
+	f("extract '<foo>x<bar>' from x", "*", "f2,foo", "*", "bar,f2,foo")
+	f("extract if (f1:abc) '<foo>x<bar>' from x", "*", "f2,foo", "*", "bar,f2,foo")
+	f("extract if (f2:abc foo:w) '<foo>x<bar>' from x", "*", "f2,foo", "*", "bar")

 	// unneeded fields intersect with all the output fields
-	f("extract from x '<foo>x<bar>'", "*", "f2,foo,bar", "*", "bar,f2,foo,x")
-	f("extract from x '<foo>x<bar> if (a:b f2:q x:y foo:w)'", "*", "f2,foo,bar", "*", "bar,f2,foo,x")
+	f("extract '<foo>x<bar>' from x", "*", "f2,foo,bar", "*", "bar,f2,foo,x")
+	f("extract if (a:b f2:q x:y foo:w) '<foo>x<bar>' from x", "*", "f2,foo,bar", "*", "bar,f2,foo,x")

 	// needed fields do not intersect with pattern and output fields
-	f("extract from x '<foo>x<bar>'", "f1,f2", "", "f1,f2", "")
-	f("extract from x '<foo>x<bar>' if (a:b)", "f1,f2", "", "f1,f2", "")
-	f("extract from x '<foo>x<bar>' if (f1:b)", "f1,f2", "", "f1,f2", "")
+	f("extract '<foo>x<bar>' from x", "f1,f2", "", "f1,f2", "")
+	f("extract if (a:b) '<foo>x<bar>' from x", "f1,f2", "", "f1,f2", "")
+	f("extract if (f1:b) '<foo>x<bar>' from x", "f1,f2", "", "f1,f2", "")

 	// needed fields intersect with pattern field
-	f("extract from x '<foo>x<bar>'", "f2,x", "", "f2,x", "")
-	f("extract from x '<foo>x<bar>' if (a:b)", "f2,x", "", "f2,x", "")
+	f("extract '<foo>x<bar>' from x", "f2,x", "", "f2,x", "")
+	f("extract if (a:b) '<foo>x<bar>' from x", "f2,x", "", "f2,x", "")

 	// needed fields intersect with output fields
-	f("extract from x '<foo>x<bar>'", "f2,foo", "", "f2,x", "")
-	f("extract from x '<foo>x<bar>' if (a:b)", "f2,foo", "", "a,f2,x", "")
+	f("extract '<foo>x<bar>' from x", "f2,foo", "", "f2,x", "")
+	f("extract if (a:b) '<foo>x<bar>' from x", "f2,foo", "", "a,f2,x", "")

 	// needed fields intersect with pattern and output fields
-	f("extract from x '<foo>x<bar>'", "f2,foo,x,y", "", "f2,x,y", "")
-	f("extract from x '<foo>x<bar>' if (a:b foo:q)", "f2,foo,x,y", "", "a,f2,foo,x,y", "")
+	f("extract '<foo>x<bar>' from x", "f2,foo,x,y", "", "f2,x,y", "")
+	f("extract if (a:b foo:q) '<foo>x<bar>' from x", "f2,foo,x,y", "", "a,f2,foo,x,y", "")
 }

 func expectParsePipeFailure(t *testing.T, pipeStr string) {
--- a/lib/logstorage/pipe_format.go
+++ b/lib/logstorage/pipe_format.go
@@ -21,11 +21,14 @@ type pipeFormat struct {
 }

 func (pf *pipeFormat) String() string {
-	s := "format " + quoteTokenIfNeeded(pf.formatStr)
+	s := "format"
 	if pf.iff != nil {
 		s += " " + pf.iff.String()
 	}
+	s += " " + quoteTokenIfNeeded(pf.formatStr)
+	if !isMsgFieldName(pf.resultField) {
 		s += " as " + quoteTokenIfNeeded(pf.resultField)
+	}
 	return s
 }

@@ -150,16 +153,6 @@ func parsePipeFormat(lex *lexer) (*pipeFormat, error) {
 	}
 	lex.nextToken()

-	// parse format
-	formatStr, err := getCompoundToken(lex)
-	if err != nil {
-		return nil, fmt.Errorf("cannot read 'format': %w", err)
-	}
-	steps, err := parsePatternSteps(formatStr)
-	if err != nil {
-		return nil, fmt.Errorf("cannot parse 'pattern' %q: %w", formatStr, err)
-	}
-
 	// parse optional if (...)
 	var iff *ifFilter
 	if lex.isKeyword("if") {
@@ -170,15 +163,26 @@ func parsePipeFormat(lex *lexer) (*pipeFormat, error) {
 		iff = f
 	}

-	// parse resultField
-	if !lex.isKeyword("as") {
-		return nil, fmt.Errorf("missing 'as' keyword after 'format %q'", formatStr)
+	// parse format
+	formatStr, err := getCompoundToken(lex)
+	if err != nil {
+		return nil, fmt.Errorf("cannot read 'format': %w", err)
 	}
+	steps, err := parsePatternSteps(formatStr)
+	if err != nil {
+		return nil, fmt.Errorf("cannot parse 'pattern' %q: %w", formatStr, err)
+	}
+
+	// parse optional 'as ...' part
+	resultField := "_msg"
+	if lex.isKeyword("as") {
 		lex.nextToken()
-	resultField, err := parseFieldName(lex)
+		field, err := parseFieldName(lex)
 		if err != nil {
 			return nil, fmt.Errorf("cannot parse result field after 'format %q as': %w", formatStr, err)
 		}
+		resultField = field
+	}

 	pf := &pipeFormat{
 		formatStr:   formatStr,
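The substitution that the `format` pipe performs on its pattern, including the `:q` suffix described in the docs, might look roughly like the following sketch. It is not the VictoriaLogs implementation: `formatFields` is a hypothetical helper, missing fields are assumed to render as empty strings, and `strconv.Quote` is an assumed approximation of the quoting applied by `:q` (it agrees with JSON encoding for ordinary text, but the real encoder may differ for unusual characters).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// formatFields replaces every <name> in pattern with fields[name] and every
// <name:q> with the quoted value (hypothetical helper).
func formatFields(pattern string, fields map[string]string) string {
	var b strings.Builder
	for {
		open := strings.IndexByte(pattern, '<')
		if open < 0 {
			b.WriteString(pattern)
			return b.String()
		}
		end := strings.IndexByte(pattern[open:], '>')
		if end < 0 {
			// No closing '>': treat the rest as literal text in this sketch.
			b.WriteString(pattern)
			return b.String()
		}
		b.WriteString(pattern[:open])
		name := pattern[open+1 : open+end]
		quote := strings.HasSuffix(name, ":q")
		name = strings.TrimSuffix(name, ":q")
		value := fields[name] // assumption: a missing field yields an empty string
		if quote {
			value = strconv.Quote(value)
		}
		b.WriteString(value)
		pattern = pattern[open+end+1:]
	}
}

func main() {
	fields := map[string]string{"ip": "1.2.3.4", "port": "80", "_msg": `say "hi"`}
	fmt.Println(formatFields("request from <ip>:<port>", fields))
	fmt.Println(formatFields(`{"_msg":<_msg:q>}`, fields))
}
```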
--- a/lib/logstorage/pipe_format_test.go
+++ b/lib/logstorage/pipe_format_test.go
@@ -10,13 +10,14 @@ func TestParsePipeFormatSuccess(t *testing.T) {
 		expectParsePipeSuccess(t, pipeStr)
 	}

+	f(`format "foo<bar>"`)
 	f(`format "" as x`)
 	f(`format "<>" as x`)
 	f(`format foo as x`)
-	f(`format "<foo>" as _msg`)
-	f(`format "<foo>bar<baz>" as _msg`)
-	f(`format "bar<baz><xyz>bac" as _msg`)
-	f(`format "bar<baz><xyz>bac" if (x:y) as _msg`)
+	f(`format "<foo>"`)
+	f(`format "<foo>bar<baz>"`)
+	f(`format "bar<baz><xyz>bac"`)
+	f(`format if (x:y) "bar<baz><xyz>bac"`)
 }

 func TestParsePipeFormatFailure(t *testing.T) {
@@ -26,9 +27,8 @@ func TestParsePipeFormatFailure(t *testing.T) {
 	}

 	f(`format`)
-	f(`format foo`)
+	f(`format if`)
 	f(`format foo bar`)
-	f(`format foo as`)
 	f(`format foo if`)
 	f(`format foo as x if (x:y)`)
 }
@@ -108,7 +108,7 @@ func TestPipeFormat(t *testing.T) {
 	})

 	// conditional format over multiple rows
-	f(`format "a: <a>, b: <b>, x: <a>" if (!c:*) as c`, [][]Field{
+	f(`format if (!c:*) "a: <a>, b: <b>, x: <a>" as c`, [][]Field{
 		{
 			{"b", "bar"},
 			{"a", "foo"},
@@ -147,41 +147,41 @@ func TestPipeFormatUpdateNeededFields(t *testing.T) {
 	// all the needed fields
 	f(`format "foo" as x`, "*", "", "*", "x")
 	f(`format "<f1>foo" as x`, "*", "", "*", "x")
-	f(`format "<f1>foo" if (f2:z) as x`, "*", "", "*", "x")
+	f(`format if (f2:z) "<f1>foo" as x`, "*", "", "*", "x")

 	// unneeded fields do not intersect with pattern and output field
 	f(`format "foo" as x`, "*", "f1,f2", "*", "f1,f2,x")
 	f(`format "<f3>foo" as x`, "*", "f1,f2", "*", "f1,f2,x")
-	f(`format "<f3>foo" if (f4:z) as x`, "*", "f1,f2", "*", "f1,f2,x")
-	f(`format "<f3>foo" if (f1:z) as x`, "*", "f1,f2", "*", "f2,x")
+	f(`format if (f4:z) "<f3>foo" as x`, "*", "f1,f2", "*", "f1,f2,x")
+	f(`format if (f1:z) "<f3>foo" as x`, "*", "f1,f2", "*", "f2,x")

 	// unneeded fields intersect with pattern
 	f(`format "<f1>foo" as x`, "*", "f1,f2", "*", "f2,x")
-	f(`format "<f1>foo" if (f4:z) as x`, "*", "f1,f2", "*", "f2,x")
-	f(`format "<f1>foo" if (f2:z) as x`, "*", "f1,f2", "*", "x")
+	f(`format if (f4:z) "<f1>foo" as x`, "*", "f1,f2", "*", "f2,x")
+	f(`format if (f2:z) "<f1>foo" as x`, "*", "f1,f2", "*", "x")

 	// unneeded fields intersect with output field
 	f(`format "<f1>foo" as x`, "*", "x,y", "*", "x,y")
-	f(`format "<f1>foo" if (f2:z) as x`, "*", "x,y", "*", "x,y")
-	f(`format "<f1>foo" if (y:z) as x`, "*", "x,y", "*", "x,y")
+	f(`format if (f2:z) "<f1>foo" as x`, "*", "x,y", "*", "x,y")
+	f(`format if (y:z) "<f1>foo" as x`, "*", "x,y", "*", "x,y")

 	// needed fields do not intersect with pattern and output field
 	f(`format "<f1>foo" as f2`, "x,y", "", "x,y", "")
-	f(`format "<f1>foo" if (f3:z) as f2`, "x,y", "", "x,y", "")
-	f(`format "<f1>foo" if (x:z) as f2`, "x,y", "", "x,y", "")
+	f(`format if (f3:z) "<f1>foo" as f2`, "x,y", "", "x,y", "")
+	f(`format if (x:z) "<f1>foo" as f2`, "x,y", "", "x,y", "")

 	// needed fields intersect with pattern field
 	f(`format "<f1>foo" as f2`, "f1,y", "", "f1,y", "")
-	f(`format "<f1>foo" if (f3:z) as f2`, "f1,y", "", "f1,y", "")
-	f(`format "<f1>foo" if (x:z) as f2`, "f1,y", "", "f1,y", "")
+	f(`format if (f3:z) "<f1>foo" as f2`, "f1,y", "", "f1,y", "")
+	f(`format if (x:z) "<f1>foo" as f2`, "f1,y", "", "f1,y", "")

 	// needed fields intersect with output field
 	f(`format "<f1>foo" as f2`, "f2,y", "", "f1,y", "")
-	f(`format "<f1>foo" if (f3:z) as f2`, "f2,y", "", "f1,f3,y", "")
-	f(`format "<f1>foo" if (x:z or y:w) as f2`, "f2,y", "", "f1,x,y", "")
+	f(`format if (f3:z) "<f1>foo" as f2`, "f2,y", "", "f1,f3,y", "")
+	f(`format if (x:z or y:w) "<f1>foo" as f2`, "f2,y", "", "f1,x,y", "")

 	// needed fields intersect with pattern and output fields
 	f(`format "<f1>foo" as f2`, "f1,f2,y", "", "f1,y", "")
-	f(`format "<f1>foo" if (f3:z) as f2`, "f1,f2,y", "", "f1,f3,y", "")
-	f(`format "<f1>foo" if (x:z or y:w) as f2`, "f1,f2,y", "", "f1,x,y", "")
+	f(`format if (f3:z) "<f1>foo" as f2`, "f1,f2,y", "", "f1,f3,y", "")
+	f(`format if (x:z or y:w) "<f1>foo" as f2`, "f1,f2,y", "", "f1,x,y", "")
 }