Lateral Subqueries
Lateral subqueries provide a powerful means to apply a Zed query to each subsequence of values generated from an outer sequence of values. The inner query may be any Zed query and may refer to values from the outer sequence.
Lateral subqueries are created using the scoped form of the
over
operator and may be nested to arbitrary depth.
For example,
echo '{s:"foo",a:[1,2]} {s:"bar",a:[3]}' | zq -z 'over a with name=s => (yield {name,elem:this})' -
produces
{name:"foo",elem:1}
{name:"foo",elem:2}
{name:"bar",elem:3}
Here the lateral scope, described below, creates a subquery
yield {name,elem:this}
for each subsequence of values derived from each outer input value. In the example above, there are two input values:
{s:"foo",a:[1,2]}
{s:"bar",a:[3]}
which imply two subqueries derived from the over
operator traversing a
.
The first subquery thus operates on the input values 1, 2
with the variable
name
set to "foo" assigning 1
and then 2
to this
, thereby emitting
{name:"foo",elem:1}
{name:"foo",elem:2}
and the second subquery operators on the input value 3
with the variable
name
set to "bar", emitting
{name:"bar",elem:3}
You can also import a parent-scope field reference into the inner scope by simply referring to its name without assignment, e.g.,
echo '{s:"foo",a:[1,2]} {s:"bar",a:[3]}' | zq -z 'over a with s => (yield {s,elem:this})' -
produces
{s:"foo",elem:1}
{s:"foo",elem:2}
{s:"bar",elem:3}
Lateral Scope
A lateral scope has the form => ( <query> )
and currently appears
only the context of an over
operator,
as illustrated above, and has the form:
over ... with <elem> [, <elem> ...] => ( <query> )
where <elem>
has either an assignment form
<var>=<expr>
or a field reference form
<field>
For each input value to the outer scope, the assignment form creates a binding
between each <expr>
evaluated in the outer scope and each <var>
, which
represents a new symbol in the inner scope of the <query>
.
In the field reference form, a single identifier <field>
refers to a field
in the parent scope and makes that field's value available in the lateral scope
with the same name.
The <query>
, which may be any Zed query, is evaluated once per outer value
on the sequence generated by the over
expression. In the lateral scope,
the value this
refers to the inner sequence generated from the over
expressions.
This query runs to completion for each inner sequence and emits
each subquery result as each inner sequence traversal completes.
This structure is powerful because any Zed query can appear in the body of
the lateral scope. In contrast to the yield
example, a sort could be
applied to each subsequence in the subquery, where sort
reads all values of the subsequence, sorts them, emits them, then
repeats the process for the next subsequence. For example,
echo '[3,2,1] [4,1,7] [1,2,3]' | zq -z 'over this => (sort this | collect(this))' -
produces
[1,2,3]
[1,4,7]
[1,2,3]
Lateral Expressions
Lateral subqueries can also appear in expression context using the parenthesized form:
( over <expr> [, <expr>...] [with <var>=<expr> [, ... <var>[=<expr>]] | <lateral> )
Note that the parentheses disambiguate a lateral expression from a lateral dataflow operator.
This form must always include a lateral scope as indicated by <lateral>
,
which can be any dataflow operator sequence excluding from
operators.
As with the over
operator, values from the outer scope can be brought into
the lateral scope using the with
clause.
The lateral expression is evaluated by evaluating each <expr>
and feeding
the results as inputs to the <lateral>
dataflow operators. Each time the
lateral expression is evaluated, the lateral operators are run to completion,
e.g.,
echo '[3,2,1] [4,1,7] [1,2,3]' | zq -z 'yield (over this | sum(this))' -
produces
6
12
6
This structure generalizes to any more complicated expression context, e.g., we can embed multiple lateral expressions inside of a record literal and use the spread operator to tighten up the output:
echo '[3,2,1] [4,1,7] [1,2,3]' | zq -z '{...(over this | sort this | sorted:=collect(this)),...(over this | sum:=sum(this))}' -
produces
{sorted:[1,2,3],sum:6}
{sorted:[1,4,7],sum:12}
{sorted:[1,2,3],sum:6}