uniq
Operator
uniq — deduplicate adjacent values
Synopsis
uniq [-c]
Description
Inspired by the traditional Unix shell command of the same name, the uniq operator copies its input to its output but removes duplicate values that are adjacent to one another.
This operator is most often used with cut and sort to find and eliminate duplicate values.
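The adjacency rule can be illustrated outside of zq. The following is a minimal Python sketch of the behavior described above, not the operator's actual implementation; the function name uniq_adjacent is hypothetical and chosen only for this illustration.

```python
from itertools import groupby


def uniq_adjacent(values):
    """Yield one value per run of adjacent equal values (illustrative only)."""
    # groupby with no key function groups consecutive elements that compare equal.
    for value, _run in groupby(values):
        yield value


words = ["hello", "world", "goodbye", "world", "hello", "again"]

# Adjacent duplicates collapse into a single value.
print(list(uniq_adjacent([1, 2, 2, 3])))

# No two equal words are adjacent here, so nothing is removed.
print(list(uniq_adjacent(words)))

# Sorting first makes equal words adjacent, so every duplicate is removed.
print(list(uniq_adjacent(sorted(words))))
```

The second call shows why the operator is usually paired with sort: duplicates that are not next to each other pass through untouched.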
When run with the -c option, each value is output as a record with the type signature {value:any,count:uint64}, where the value field contains the unique value and the count field indicates the number of consecutive occurrences of that value in the input.
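To make the shape of the -c output concrete, here is a small Python sketch that models each output record as a dict with value and count keys. It is an approximation under stated assumptions: the function name uniq_count is hypothetical, and Python's int stands in for the uint64 type of the count field.

```python
from itertools import groupby


def uniq_count(values):
    """Yield a {value, count} dict per run of adjacent equal values (illustrative only)."""
    for value, run in groupby(values):
        # count is the length of the run, so a value with no repeats gets count 1.
        yield {"value": value, "count": sum(1 for _ in run)}


for record in uniq_count([1, 2, 2, 3]):
    print(record)
```

Because counts are per run rather than per distinct value, an unsorted input such as 2 1 2 would produce three records, each with a count of 1.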
Examples
Simple deduplication
echo '1 2 2 3' | zq -z uniq -
=>
1
2
3
Simple deduplication with -c
echo '1 2 2 3' | zq -z 'uniq -c' -
=>
{value:1,count:1(uint64)}
{value:2,count:2(uint64)}
{value:3,count:1(uint64)}
Use sort to deduplicate non-adjacent values
echo '"hello" "world" "goodbye" "world" "hello" "again"' |
zq -z 'sort | uniq' -
=>
"again"
"goodbye"
"hello"
"world"
Complex values must match in full to be considered duplicates (e.g., every field/value pair in adjacent records must be equal)
echo '{ts:2024-09-10T21:12:33Z, action:"start"}
{ts:2024-09-10T21:12:34Z, action:"running"}
{ts:2024-09-10T21:12:34Z, action:"running"}
{ts:2024-09-10T21:12:35Z, action:"running"}
{ts:2024-09-10T21:12:36Z, action:"stop"}' |
zq -z 'uniq' -
=>
{ts:2024-09-10T21:12:33Z,action:"start"}
{ts:2024-09-10T21:12:34Z,action:"running"}
{ts:2024-09-10T21:12:35Z,action:"running"}
{ts:2024-09-10T21:12:36Z,action:"stop"}