Skip to main content
Version: Next

fuse

Operator

fuse coerce all input values into a merged type

Synopsis

fuse

Description

The fuse operator reads all of its input, computes an "intelligent merge" of varied types in the input, then adjusts each output value to conform to the merged type.

The merged type is constructed intelligently in the sense that type {a:string} and {b:string} is fused into type {a:string,b:string} instead of the Zed union type ({a:string},{b:string}).

TBD: document the algorithm here in more detail. The operator takes no paramters but we are experimenting with ways to control how field with the same name but different types are merged especially in light of complex types like arrays, sets, and so forth.

Because all values of the input must be read to compute the union, fuse may spill its input to disk when memory limits are exceeded.

Fuse is not normally needed for Zed data as the Zed data model supports heterogenous sequences of values. However, fuse can be quite useful during data exploration when sampling or filtering data to look at slices of raw data that are fused together. Fuse is also useful for transforming arbitrary Zed data to prepare it for formats that require a uniform schema like Parquet or a tabular structure like CSV. Unfortunately, when data leaves the Zed format using fuse to accomplish this, the original data must be altered to fit into the rigid structure of these output formats.

A fused type over many heterogeneous values also represents a typical design pattern of a data warehouse where a single very-wide schema defines slots for all possible input values where the columns are sparsely populated by each row value as the missing columns are set to null. Zed data is super-structured, and fortunately, does not require such a structure.

Examples

Fuse two records

echo '{a:1}{b:2}' | zq -z fuse -

=>

{a:1,b:null(int64)}
{a:null(int64),b:2}

Fuse records with type variation

echo '{a:1}{a:"foo"}' | zq -z fuse -

=>

{a:1((int64,string))}
{a:"foo"((int64,string))}

Fuse records with complex type variation

echo '{a:[1,2]}{a:["foo","bar"],b:10.0.0.1}' | zq -z fuse -

=>

{a:[1,2]([(int64,string)]),b:null(ip)}
{a:["foo","bar"]([(int64,string)]),b:10.0.0.1}

The table format clarifies what fuse does

echo '{a:1}{b:2}{a:3}' | zq -f table fuse -

=>

a b
1 -
- 2
3 -