Zed Language Overview
The Zed language is a query language for search, analytics, and transformation inspired by the pipeline pattern of the traditional Unix shell. Like a Unix pipeline, a query is expressed as a data source followed by a number of commands:
command | command | command | ...
However, in Zed, the entities that transform data are called "operators" instead of "commands" and unlike Unix pipelines, the streams of data in a Zed query are typed data sequences that adhere to the Zed data model. Moreover, Zed sequences can be forked and joined:
operator
| operator
| fork (
=> operator | ...
=> operator | ...
)
| join | ...
Here, Zed programs can include multiple data sources and splitting operations where multiple paths run in parallel and paths can be combined (in an undefined order), merged (in a defined order) by one or more sort keys, or joined using relational-style join logic.
Generally speaking, a flow graph
defines a directed acyclic graph (DAG) composed
of data sources and operator nodes. The Zed syntax leverages "fat arrows",
i.e., =>
, to indicate the start of a parallel leg of the data flow.
That said, the Zed language is declarative and the Zed compiler optimizes the data flow computation — e.g., often implementing a Zed program differently than the flow implied by the pipeline yet reaching the same result — much as a modern SQL engine optimizes a declarative SQL query.
Search and Analytics
Zed is also intended to provide a seamless transition from a simple search experience
(e.g., typed into a search bar or as the query argument of the zq
command-line
tool) to more a complex analytics experience composed of complex joins and aggregations
where the Zed language source text would typically be authored in a editor and
managed under source-code control.
Like an email or Web search, a simple keyword search is just the word itself, e.g.,
example.com
is a search for the string "example.com" and
example.com urgent
is a search for values with both the strings "example.com" and "urgent" present.
Unlike typical log search systems, the Zed language operators are uniform: you can specify an operator including keyword search terms, Boolean predicates, etc. using the same search expression syntax at any point in the pipeline.
For example,
the predicate message_length > 100
can simply be tacked onto the keyword search
from above, e.g.,
example.com urgent message_length > 100
finds all values containing the string "example.com" and "urgent" somewhere in them
provided further that the field message_length
is a numeric value greater than 100.
A related query that performs an aggregation could be more formally
written as follows:
search "example.com" AND "urgent"
| where message_length > 100
| summarize kinds:=union(type) by net:=network_of(srcip)
which computes an aggregation table of different message types (e.g.,
from a hypothetical field called type
) into a new, aggregated field
called kinds
and grouped by the network of all the source IP addresses
in the input
(e.g., from a hypothetical field called srcip
) as a derived field called net
.
The short-hand query from above might be typed into a search box while the latter query might be composed in a query editor or in Zed source files maintained in GitHub. Both forms are valid Zed queries.
Comments
To further ease the maintenance and readability of source files, comments
beginning with //
may appear in Zed.
// This includes a search with boolean logic, an expression, and an aggregation.
search "example.com" AND "urgent"
| where message_length > 100 // We only care about long messages
| summarize kinds:=union(type) by net:=network_of(srcip)
What's Next?
The following sections continue describing the Zed language.
- The Dataflow Model
- Data Types
- Const, Func, Operator, and Type Statements
- Expressions
- Search Expressions
- Lateral Subqueries
- Shaping and Type Fusion
You may also be interested in the detailed reference materials on operators, functions, and aggregate functions, as well as the conventions for how they're described.