VictoriaMetrics/docs/VictoriaLogs/querying/README.md
Aliaksandr Valialkin ca79687500
wip
2024-05-20 04:46:34 +02:00

18 KiB

sort title weight menu aliases
4 Querying 4
docs
identifier parent weight
victorialogs-querying victorialogs 4
/VictoriaLogs/querying/
/VictoriaLogs/querying/index.html

Querying

VictoriaLogs can be queried with LogsQL via the following ways:

HTTP API

VictoriaLogs can be queried at the /select/logsql/query HTTP endpoint. The LogsQL query must be passed via query argument. For example, the following query returns all the log entries with the error word:

curl http://localhost:9428/select/logsql/query -d 'query=error'

The response by default contains all the log fields. See how to query specific fields.

The query argument can be passed either in the request url itself (aka HTTP GET request) or via request body with the x-www-form-urlencoded encoding (aka HTTP POST request). The HTTP POST is useful for sending long queries when they do not fit the maximum url length of the used clients and proxies.

See LogsQL docs for details on what can be passed to the query arg. The query arg must be properly encoded with percent encoding when passing it to curl or similar tools.

By default the /select/logsql/query returns all the log entries matching the given query. The response size can be limited in the following ways:

  • By closing the response stream at any time. In this case VictoriaLogs stops query execution and frees all the resources occupied by the request.
  • By specifying the maximum number of log entries, which can be returned in the response via limit query arg. For example, the following request returns up to 10 matching log entries:
    curl http://localhost:9428/select/logsql/query -d 'query=error' -d 'limit=10'
    
  • By adding limit pipe to the query. For example:
    curl http://localhost:9428/select/logsql/query -d 'query=error | limit 10'
    
  • By adding _time filter. The time range for the query can be specified via optional start and end query ars formatted according to these docs.
  • By adding other filters to the query.

The /select/logsql/query endpoint returns a stream of JSON lines, where each line contains JSON-encoded log entry in the form {field1="value1",...,fieldN="valueN"}. Example response:

{"_msg":"error: disconnect from 19.54.37.22: Auth fail [preauth]","_stream":"{}","_time":"2023-01-01T13:32:13Z"}
{"_msg":"some other error","_stream":"{}","_time":"2023-01-01T13:32:15Z"}

The matching lines are sent to the response stream as soon as they are found in VictoriaLogs storage. This means that the returned response may contain billions of lines for queries matching too many log entries. The response can be interrupted at any time by closing the connection to VictoriaLogs server. This allows post-processing the returned lines at the client side with the usual Unix commands such as grep, jq, less, head, etc. See these docs for more details.

The returned lines aren't sorted, since sorting disables the ability to send matching log entries to response stream as soon as they are found. Query results can be sorted either at VictoriaLogs side according to these docs or at client side with the usual sort command according to these docs.

By default the (AccountID=0, ProjectID=0) tenant is queried. If you need querying other tenant, then specify the needed tenant via http request headers. For example, the following query searches for log messages at (AccountID=12, ProjectID=34) tenant:

curl http://localhost:9428/select/logsql/query -H 'AccountID: 12' -H 'ProjectID: 34' -d 'query=error'

The number of requests to /select/logsql/query can be monitored with vl_http_requests_total{path="/select/logsql/query"} metric.

Querying hits stats

VictoriaMetrics provides /select/logsql/hits?query=<query>&start=<start>&end=<end>&step=<step> HTTP endpoint, which returns the number of matching log entries for the given <query> LogsQL query on the given [<start> ... <end>] time range grouped by <step> buckets. The returned results are sorted by time.

The <start> and <end> args can contain values in any supported format. If <start> is missing, then it equals to the minimum timestamp across logs stored in VictoriaLogs. If <end> is missing, then it equals to the maximum timestamp across logs stored in VictoriaLogs.

The <step> arg can contain values in the format specified here. If <step> is missing, then it equals to 1d (one day).

For example, the following command returns per-hour number of log messages with the error word over logs for the last 3 hours:

curl http://localhost:9428/select/logsql/hits -d 'query=error' -d 'start=3h' -d 'step=1h'

Below is an example JSON output returned from this endpoint:

{
  "hits": [
    {
      "fields": {},
      "timestamps": [
        "2024-01-01T00:00:00Z",
        "2024-01-01T01:00:00Z",
        "2024-01-01T02:00:00Z"
      ],
      "values": [
        410339,
        450311,
        899506
      ]
    }
  ]
}

Additionally, the offset=<offset> arg can be passed to /select/logsql/hits in order to group buckets according to the given timezone offset. The <offset> can contain values in the format specified here. For example, the following command returns per-day number of logs with error word over the last week in New York time zone (-4h):

curl http://localhost:9428/select/logsql/hits -d 'query=error' -d 'start=1w' -d 'step=1d' -d 'offset=-4h'

Additionally, any number of field=<field_name> args can be passed to /select/logsql/hits for grouping hits buckets by the mentioned <field_name> fields. For example, the following query groups hits by level field additionally to the provided step:

curl http://localhost:9428/select/logsql/hits -d 'query=*' -d 'start=3h' -d 'step=1h' -d 'field=level'

The grouped fields are put inside "fields" object:

{
  "hits": [
    {
      "fields": {
        "level": "error"
      },
      "timestamps": [
        "2024-01-01T00:00:00Z",
        "2024-01-01T01:00:00Z",
        "2024-01-01T02:00:00Z"
      ],
      "values": [
        25,
        20,
        15
      ]
    },
    {
      "fields": {
        "level": "info"
      },
      "timestamps": [
        "2024-01-01T00:00:00Z",
        "2024-01-01T01:00:00Z",
        "2024-01-01T02:00:00Z"
      ],
      "values": [
        25625,
        35043,
        25230
      ]
    }
  ]
}

See also:

Querying field names

VictoriaLogs provides /select/logsql/field_names?query=<query>&start=<start>&end=<end> HTTP endpoint, which returns field names from results of the given <query> LogsQL query on the given [<start> ... <end>] time range.

The <start> and <end> args can contain values in any supported format. If <start> is missing, then it equals to the minimum timestamp across logs stored in VictoriaLogs. If <end> is missing, then it equals to the maximum timestamp across logs stored in VictoriaLogs.

For example, the following command returns field names across logs with the error word for the last 5 minutes:

curl http://localhost:9428/select/logsql/field_names -d 'query=error' -d 'start=5m'

Below is an example JSON output returned from this endpoint:

{
  "names": [
    "_msg",
    "_stream",
    "_time",
    "host",
    "level",
    "location"
  ]
}

See also:

Querying field values

VictoriaLogs provides /select/logsql/field_values?query=<query>&field_name=<fieldName>&start=<start>&end=<end> HTTP endpoint, which returns unique values for the given <fieldName> field from results of the given <query> LogsQL query on the given [<start> ... <end>] time range.

The <start> and <end> args can contain values in any supported format. If <start> is missing, then it equals to the minimum timestamp across logs stored in VictoriaLogs. If <end> is missing, then it equals to the maximum timestamp across logs stored in VictoriaLogs.

For example, the following command returns unique values for host field across logs with the error word for the last 5 minutes:

curl http://localhost:9428/select/logsql/field_values -d 'query=error' -d 'field_name=host' -d 'start=5m'

Below is an example JSON output returned from this endpoint:

{
  "values": [
    "host_0",
    "host_1",
    "host_10",
    "host_100",
    "host_1000"
  ]
}

The /select/logsql/field_names endpoint supports optional limit=N query arg, which allows limiting the number of returned values to N. The endpoint returns arbitrary subset of values if their number exceeds N, so limit=N cannot be used for pagination over big number of field values.

See also:

Web UI

VictoriaLogs provides a simple Web UI for logs querying and exploration at http://localhost:9428/select/vmui. The UI allows exploring query results:

There are three modes of displaying query results:

  • Group - results are displayed as a table with rows grouped by stream and fields for filtering.
  • Table - displays query results as a table.
  • JSON - displays raw JSON response from HTTP API.

This is the first version that has minimal functionality. It comes with the following limitations:

  • The number of query results is always limited to 1000 lines. Iteratively add more specific filters to the query in order to get full response with less than 1000 lines.
  • Queries are always executed against tenant 0.

These limitations will be removed in future versions.

To get around the current limitations, you can use an alternative - the command line interface.

Command-line

VictoriaLogs integrates well with curl and other command-line tools during querying because of the following features:

  • VictoriaLogs sends the matching log entries to the response stream as soon as they are found. This allows forwarding the response stream to arbitrary Unix pipes.
  • VictoriaLogs automatically adjusts query execution speed to the speed of the client, which reads the response stream. For example, if the response stream is piped to less command, then the query is suspended until the less command reads the next block from the response stream.
  • VictoriaLogs automatically cancels query execution when the client closes the response stream. For example, if the query response is piped to head command, then VictoriaLogs stops executing the query when the head command closes the response stream.

These features allow executing queries at command-line interface, which potentially select billions of rows, without the risk of high resource usage (CPU, RAM, disk IO) at VictoriaLogs server.

For example, the following query can return very big number of matching log entries (e.g. billions) if VictoriaLogs contains many log messages with the error word:

curl http://localhost:9428/select/logsql/query -d 'query=error'

If the command returns "never-ending" response, then just press ctrl+C at any time in order to cancel the query. VictoriaLogs notices that the response stream is closed, so it cancels the query and instantly stops consuming CPU, RAM and disk IO for this query.

Then just use head command for investigating the returned log messages and narrowing down the query:

curl http://localhost:9428/select/logsql/query -d 'query=error' | head -10

The head -10 command reads only the first 10 log messages from the response and then closes the response stream. This automatically cancels the query at VictoriaLogs side, so it stops consuming CPU, RAM and disk IO resources.

Sometimes it may be more convenient to use less command instead of head during the investigation of the returned response:

curl http://localhost:9428/select/logsql/query -d 'query=error' | less

The less command reads the response stream on demand, when the user scrolls down the output. VictoriaLogs suspends query execution when less stops reading the response stream. It doesn't consume CPU and disk IO resources during this time. It resumes query execution when the less continues reading the response stream.

Suppose that the initial investigation of the returned query results helped determining that the needed log messages contain cannot open file phrase. Then the query can be narrowed down to error AND "cannot open file" (see these docs about AND operator). Then run the updated command in order to continue the investigation:

curl http://localhost:9428/select/logsql/query -d 'query=error AND "cannot open file"' | head

Note that the query arg must be properly encoded with percent encoding when passing it to curl or similar tools.

The pipe the query to "head" or "less" -> investigate the results -> refine the query iteration can be repeated multiple times until the needed log messages are found.

The returned VictoriaLogs query response can be post-processed with any combination of Unix commands, which are usually used for log analysis - grep, jq, awk, sort, uniq, wc, etc.

For example, the following command uses wc -l Unix command for counting the number of log messages with the error word received from streams with app="nginx" field during the last 5 minutes:

curl http://localhost:9428/select/logsql/query -d 'query=_stream:{app="nginx"} AND _time:5m AND error' | wc -l

See these docs about _stream filter, these docs about _time filter and these docs about AND operator.

The following example shows how to sort query results by the _time field:

curl http://localhost:9428/select/logsql/query -d 'query=error' | jq -r '._time + " " + ._msg' | sort | less

This command uses jq for extracting _time and _msg fields from the returned results, and piping them to sort command.

Note that the sort command needs to read all the response stream before returning the sorted results. So the command above can take non-trivial amounts of time if the query returns too many results. The solution is to narrow down the query before sorting the results. See these tips on how to narrow down query results.

The following example calculates stats on the number of log messages received during the last 5 minutes grouped by log.level field:

curl http://localhost:9428/select/logsql/query -d 'query=_time:5m log.level:*' | jq -r '."log.level"' | sort | uniq -c 

The query selects all the log messages with non-empty log.level field via "any value" filter, then pipes them to jq command, which extracts the log.level field value from the returned JSON stream, then the extracted log.level values are sorted with sort command and, finally, they are passed to uniq -c command for calculating the needed stats.

See also: