app/vmselect: ignore empty series for limit_offset (#3178)

* app/vmselect: ignore empty series for `limit_offset`

VictoriaMetrics doesn't return empty series (series with all NaN values) to
the user. But such series are filtered out only after transform functions are applied.
It means `limit_offset` counts empty series as well.

For example, consider the following data set:
```
time series:
foo{label="v1"} NaN, NaN, NaN, NaN // empty series
foo{label="v2"} 1, 2, 3, 4
foo{label="v3"} 4, 3, 2, 1
```

When a user requests all series for the metric `foo`, the empty series
is filtered out:
```
/query=foo:
foo{label="v2"} 1, 2, 3, 4
foo{label="v3"} 4, 3, 2, 1
```

But `limit_offset(1, 1, foo)` is applied to the original series, before empty series are filtered out.
So it skips the empty series, which is first in the list, and returns `foo{label="v2"}`:
```
/query=limit_offset(1, 1, foo):
foo{label="v2"} 1, 2, 3, 4
```

The expected behavior is to apply `limit_offset` to the already filtered list,
so that the result is `foo{label="v3"}`:
```
/query=limit_offset(1, 1, foo):
foo{label="v3"} 4, 3, 2, 1
```

The change does exactly that: it filters out empty series before applying `limit_offset`.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* app/vmselect: ignore empty series for `limit_offset`

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Roman Khavronenko 2022-09-30 07:20:34 +02:00 committed by Aliaksandr Valialkin
parent 7543bdfd54
commit f68333a8ce
2 changed files with 24 additions and 1 deletions

@@ -2280,6 +2280,27 @@ func TestExecSuccess(t *testing.T) {
 		resultExpected := []netstorage.Result{r}
 		f(q, resultExpected)
 	})
+	t.Run(`limit_offset NaN`, func(t *testing.T) {
+		t.Parallel()
+		// q returns 3 time series, where foo=3 contains only NaN values
+		// limit_offset is supposed to apply offset for non-NaN series only
+		q := `limit_offset(1, 1, sort_by_label_desc((
+			label_set(time()*1, "foo", "1"),
+			label_set(time()*2, "foo", "2"),
+			label_set(time()*3, "foo", "3"),
+		) < 3000, "foo"))`
+		r := netstorage.Result{
+			MetricName: metricNameExpected,
+			Values:     []float64{1000, 1200, 1400, 1600, 1800, 2000},
+			Timestamps: timestampsExpected,
+		}
+		r.MetricName.Tags = []storage.Tag{{
+			Key:   []byte("foo"),
+			Value: []byte("1"),
+		}}
+		resultExpected := []netstorage.Result{r}
+		f(q, resultExpected)
+	})
 	t.Run(`sum(label_graphite_group)`, func(t *testing.T) {
 		t.Parallel()
 		q := `sort(sum by (__name__) (

@@ -1848,7 +1848,9 @@ func transformLimitOffset(tfa *transformFuncArg) ([]*timeseries, error) {
 	if err != nil {
 		return nil, fmt.Errorf("cannot obtain offset arg: %w", err)
 	}
-	rvs := args[2]
+	// removeEmptySeries so offset will be calculated after empty series
+	// were filtered out.
+	rvs := removeEmptySeries(args[2])
 	if len(rvs) >= offset {
 		rvs = rvs[offset:]
 	}