---
title: Parametric Aggregate Functions reference
meta:
    description: Parametric Aggregate Functions
headingMaxLevels: 2
---

# Parametric aggregate functions

Some aggregate functions accept parameters in addition to their arguments. These parameters are constants used for initialization or configuration. The syntax for these functions uses two pairs of brackets: the first for parameters, and the second for arguments.

## histogram

Calculates an approximate histogram for a set of numerical values. This function does not guarantee precise results, as it uses an adaptive algorithm.

### Syntax

```sql
histogram(number_of_bins)(values)
```

### Arguments

- `values`: An expression representing the input numerical values for which to build the histogram.

### Parameters

- `number_of_bins`: UInt. The desired upper limit for the number of bins in the histogram. The function attempts to create this many bins but may use fewer if necessary.

### Returns

An array of tuples, where each tuple represents a bin in the histogram. Each tuple contains three elements: `(lower_bound, upper_bound, height)`.
Type: `Array(Tuple(Float64, Float64, Float64))`

### Example

```sql
SELECT histogram(5)(number + 1)
FROM (
    SELECT number
    FROM system.numbers
    LIMIT 20
)
```

Result:

```result
┌─histogram(5)(plus(number, 1))───────────────────────────────────────────┐
│ [(1,4.5,4),(4.5,8.5,4),(8.5,12.75,4.125),(12.75,17,4.625),(17,20,3.25)] │
└─────────────────────────────────────────────────────────────────────────┘
```

You can visualize a histogram with the `bar` function:

```sql
WITH histogram(5)(rand() % 100) AS hist
SELECT
    arrayJoin(hist).3 AS height,
    bar(height, 0, 6, 5) AS bar
FROM
(
    SELECT number
    FROM system.numbers
    LIMIT 20
)
```

Result:

```result
┌─height─┬─bar───┐
│  2.125 │ █▋    │
│   3.25 │ ██▌   │
│  5.625 │ ████▏ │
│  5.625 │ ████▏ │
│  3.375 │ ██▌   │
└────────┴───────┘
```

## sequenceMatch

Determines if a sequence of events matches a specified pattern. It returns 1 if a match is found, and 0 otherwise.

### Syntax

```sql
sequenceMatch(pattern)(timestamp, cond1, cond2, ...)
```

{% callout type="info" %}
Events that occur at the same second may be processed in an undefined order, which can affect the result.
{% /callout %}

### Arguments

- `timestamp`: Date, DateTime, or UInt. A column containing time data, used to order events.
- `cond1`, `cond2`, ...: UInt8. Boolean conditions that define the events in the sequence. Up to 32 conditions can be provided. Events not described by any condition are skipped.

### Parameters

- `pattern`: String. A string defining the event sequence pattern to match. See [Pattern syntax](#pattern-syntax) below.

### Returns

- 1 if the pattern is matched.
- 0 if the pattern is not matched.
Type: `UInt8`.

### Pattern syntax

- `(?N)`: Matches the N-th condition argument (e.g., `(?1)` matches `cond1`). Conditions are numbered from 1 to 32.
- `.*`: Matches any number of events between specified conditions.
- `(?t operator value)`: Specifies a time constraint between two events. `operator` can be `>=`, `>`, `<`, `<=`, `==`, and `value` is a time in seconds. For example, `(?1)(?t>1800)(?2)` matches `cond1` followed by `cond2` with more than 1800 seconds between them.

### Example

Consider a dataset of events with `time` and `number` values.

```sql
SELECT sequenceMatch('(?1)(?2)')(time, number = 1, number = 2)
FROM (
    SELECT 1 AS time, 1 AS number UNION ALL
    SELECT 2 AS time, 3 AS number UNION ALL
    SELECT 3 AS time, 2 AS number
) AS t
```

Result:

```result
┌─sequenceMatch('(?1)(?2)')(time, equals(number, 1), equals(number, 2))─┐
│                                                                     1 │
└───────────────────────────────────────────────────────────────────────┘
```

## sequenceCount

Counts the number of non-overlapping event chains that match a specified pattern. The function searches for the next chain immediately after a current chain is matched.

### Syntax

```sql
sequenceCount(pattern)(timestamp, cond1, cond2, ...)
```

{% callout type="info" %}
Events that occur at the same second may be processed in an undefined order, which can affect the result.
{% /callout %}

### Arguments

- `timestamp`: Date, DateTime, or UInt. A column containing time data, used to order events.
- `cond1`, `cond2`, ...: UInt8. Boolean conditions that define the events in the sequence. Up to 32 conditions can be provided. Events not described by any condition are skipped.

### Parameters

- `pattern`: String. A string defining the event sequence pattern to match. See [Pattern syntax](#pattern-syntax) for `sequenceMatch`.

### Returns

The number of non-overlapping event chains that match the pattern.
Type: `UInt64`.

### Example

Count how many times `number = 2` occurs after `number = 1`, allowing any number of other events in between.

```sql
SELECT sequenceCount('(?1).*(?2)')(time, number = 1, number = 2)
FROM (
    SELECT 1 AS time, 1 AS number UNION ALL
    SELECT 2 AS time, 3 AS number UNION ALL
    SELECT 3 AS time, 2 AS number UNION ALL
    SELECT 4 AS time, 1 AS number UNION ALL
    SELECT 5 AS time, 3 AS number UNION ALL
    SELECT 6 AS time, 2 AS number
) AS t
```

Result:

```result
┌─sequenceCount('(?1).*(?2)')(time, equals(number, 1), equals(number, 2))─┐
│                                                                       2 │
└─────────────────────────────────────────────────────────────────────────┘
```

### See Also

- [sequenceMatch](#sequencematch)

## sequenceMatchEvents

Returns an array of timestamps corresponding to the events in the longest matched event chain for a given pattern.

### Syntax

```sql
sequenceMatchEvents(pattern)(timestamp, cond1, cond2, ...)
```

{% callout type="info" %}
Events that occur at the same second may be processed in an undefined order, which can affect the result.
{% /callout %}

### Arguments

- `timestamp`: Date, DateTime, or UInt. A column containing time data, used to order events.
- `cond1`, `cond2`, ...: UInt8. Boolean conditions that define the events in the chain. Up to 32 conditions can be provided. Events not described by any condition are skipped.

### Parameters

- `pattern`: String. A string defining the event sequence pattern to match. See [Pattern syntax](#pattern-syntax) for `sequenceMatch`.

### Returns

An array of timestamps for the matched condition arguments (`?N`) from the longest event chain. The position in the array corresponds to the position of the condition argument in the pattern.
Type: `Array(TimestampType)`.

### Example

Return timestamps of events for the longest chain matching `(?1).*(?2).*(?1)(?3)`.

```sql
SELECT sequenceMatchEvents('(?1).*(?2).*(?1)(?3)')(time, number = 1, number = 2, number = 4)
FROM (
    SELECT 1 AS time, 1 AS number UNION ALL
    SELECT 2 AS time, 3 AS number UNION ALL
    SELECT 3 AS time, 2 AS number UNION ALL
    SELECT 4 AS time, 1 AS number UNION ALL
    SELECT 5 AS time, 3 AS number UNION ALL
    SELECT 6 AS time, 2 AS number
) AS t
```

Result:

```result
┌─sequenceMatchEvents('(?1).*(?2).*(?1)(?3)')(time, equals(number, 1), equals(number, 2), equals(number, 4))─┐
│ [1,3,4]                                                                                                    │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
```

### See Also

- [sequenceMatch](#sequencematch)

## windowFunnel

Analyzes event chains within a sliding time window and calculates the maximum number of consecutive conditions met from the chain.

### Syntax

```sql
windowFunnel(window, [mode, ...])(timestamp, cond1, cond2, ..., condN)
```

### Arguments

- `timestamp`: Date, DateTime, or UInt. The column containing event timestamps.
- `cond`: UInt8. Boolean conditions describing the sequence of events.

### Parameters

- `window`: UInt. The length of the sliding time window. This defines the maximum time interval (in the units of the `timestamp` column) between the first and last matched condition for a chain to be considered valid.
- `mode`: String. Optional. One or more modes can be specified to modify the funnel behavior:
    - `'strict_deduplication'`: If the same condition holds for consecutive events, further processing is interrupted.
    - `'strict_order'`: Prevents other events from intervening between matched conditions.
    - `'strict_increase'`: Only applies conditions to events with strictly increasing timestamps.
    - `'strict_once'`: Counts each event only once within a chain, even if it meets multiple conditions.

### Returns

The maximum number of consecutive conditions from the chain that were met within the sliding time window.
Type: `Integer`.

### Example

Determine the maximum funnel stage reached by users within a 7-day window (604800 seconds).

```tb {%title=trend.datasource%}
SCHEMA >
    event_date Date `json:$.event_date`,
    user_id UInt32 `json:$.user_id`,
    timestamp DateTime `json:$.timestamp`,
    eventID UInt32 `json:$.eventID`,
    product String `json:$.product`

ENGINE "MergeTree"
```

```bash
tb push datasources/trend.datasource

echo '
{"event_date": "2019-01-28", "user_id": 1, "timestamp": "2019-01-29 10:00:00", "eventID": 1003, "product": "phone"}
{"event_date": "2019-01-31", "user_id": 1, "timestamp": "2019-01-31 09:00:00", "eventID": 1007, "product": "phone"}
{"event_date": "2019-01-30", "user_id": 1, "timestamp": "2019-01-30 08:00:00", "eventID": 1009, "product": "phone"}
{"event_date": "2019-02-01", "user_id": 1, "timestamp": "2019-02-01 08:00:00", "eventID": 1010, "product": "phone"}'  > trend.ndjson

tb datasource append trend trend.ndjson
```

```sql
SELECT
    level,
    count() AS c
FROM
(
    SELECT
        user_id,
        windowFunnel(604800)(timestamp, eventID = 1003, eventID = 1009, eventID = 1007, eventID = 1010) AS level
    FROM trend
    WHERE (event_date >= '2019-01-01') AND (event_date <= '2019-02-02')
    GROUP BY user_id
)
GROUP BY level
ORDER BY level ASC
```

Result:

```result
┌─level─┬─c─┐
│     4 │ 1 │
└───────┴───┘
```

## retention

Calculates an array of boolean values indicating whether a series of conditions were met for a given event, relative to the first condition.

### Syntax

```sql
retention(cond1, cond2, ..., cond32)
```

### Arguments

- `cond`: UInt8. A boolean expression (0 or 1) representing a condition. Up to 32 conditions can be provided.

### Returns

An array of `UInt8` values. The first element indicates if `cond1` was met. Subsequent elements `N` indicate if `cond1` AND `condN` were met.
Type: `Array(UInt8)`.

### Example

Calculate user retention across three specific dates.

```sql
SELECT
    uid,
    retention(date = '2020-01-01', date = '2020-01-02', date = '2020-01-03') AS r
FROM (
    SELECT '2020-01-01' as date, number as uid FROM numbers(5)  UNION ALL
    SELECT '2020-01-02' as date, number as uid FROM numbers(10) UNION ALL
    SELECT '2020-01-03' as date, number as uid FROM numbers(15))
WHERE date IN ('2020-01-01', '2020-01-02', '2020-01-03')
GROUP BY uid
ORDER BY uid ASC
```

Result:

```result
┌─uid─┬─r───────┐
│   0 │ [1,1,1] │
│   1 │ [1,1,1] │
│   2 │ [1,1,1] │
│   3 │ [1,1,1] │
│   4 │ [1,1,1] │
│   5 │ [0,0,0] │
│   6 │ [0,0,0] │
│   7 │ [0,0,0] │
│   8 │ [0,0,0] │
│   9 │ [0,0,0] │
│  10 │ [0,0,0] │
│  11 │ [0,0,0] │
│  12 │ [0,0,0] │
│  13 │ [0,0,0] │
│  14 │ [0,0,0] │
└─────┴─────────┘
```

## uniqUpTo

Calculates the number of distinct values for an argument, up to a specified limit `N`. If the count exceeds `N`, it returns `N + 1`. Otherwise, it returns the exact count.

### Syntax

```sql
uniqUpTo(N)(x)
```

### Arguments

- `x`: The expression for which to count unique values.

### Parameters

- `N`: UInt. The maximum limit for the unique count. Recommended for small values (up to 10), with a maximum of 100.

### Returns

The number of unique values, or `N + 1` if the count exceeds `N`.
Type: `UInt64`.

### Example

Count unique numbers up to a limit of 3.

```sql
SELECT uniqUpTo(3)(number)
FROM (
    SELECT number FROM system.numbers LIMIT 5
)
```

Result:

```result
┌─uniqUpTo(3)(number)─┐
│                   4 │
└─────────────────────┘
```

## sumMapFiltered

Aggregates values from maps, similar to `sumMap`, but only for keys specified in a filter array. This is useful for high-cardinality maps where you only need to sum specific keys.

### Syntax

```sql
sumMapFiltered(keys_to_keep)(keys, values)
```

### Arguments

- `keys`: Array. An array of keys from the map.
- `values`: Array. An array of values corresponding to the keys.

### Parameters

- `keys_to_keep`: Array. An array of keys to include in the summation. Only values associated with these keys will be summed.

### Returns

A tuple containing two arrays: the filtered and sorted keys, and their corresponding summed values.
Type: `Tuple(Array(KeyType), Array(ValueType))`.

### Example

Sum values for specific keys `1`, `4`, and `8` from a map.

```sql
SELECT sumMapFiltered([1, 4, 8])(mapKeys(statusMap),mapValues(statusMap)) as summapfiltered
FROM (
  SELECT
    c1::Date as date,
    c2::DateTime as timeslot,
    (c3::Array(UInt16), c4::Array(UInt64))::Map(UInt16, UInt64) as statusMap
  FROM VALUES(
    ('2000-01-01', '2000-01-01 00:00:00', [1, 2, 3], [10, 10, 10]),
    ('2000-01-01', '2000-01-01 00:00:00', [3, 4, 5], [10, 10, 10]),
    ('2000-01-01', '2000-01-01 00:01:00', [4, 5, 6], [10, 10, 10]),
    ('2000-01-01', '2000-01-01 00:01:00', [6, 7, 8], [10, 10, 10])
  )
)
```

Result:

```result
┌─────summapfiltered───┐
│ ([1,4,8],[10,20,10]) │
└──────────────────────┘
```

## sumMapFilteredWithOverflow

Aggregates values from maps for specified keys, similar to `sumMapFiltered`, but performs summation with potential overflow. This means the return type for summed values will match the input value type, even if the sum exceeds its maximum capacity.

### Syntax

```sql
sumMapFilteredWithOverflow(keys_to_keep)(keys, values)
```

### Arguments

- `keys`: Array. An array of keys from the map.
- `values`: Array. An array of values corresponding to the keys.

### Parameters

- `keys_to_keep`: Array. An array of keys to include in the summation. Only values associated with these keys will be summed.

### Returns

A tuple containing two arrays: the filtered and sorted keys, and their corresponding summed values. The type of the summed values array will match the input `values` type.
Type: `Tuple(Array(KeyType), Array(ValueType))`.

### Example

Demonstrate `sumMapFilteredWithOverflow` where `UInt8` values might overflow, compared to `sumMapFiltered` which promotes to `UInt64` to prevent overflow.

```sql
SELECT
    sumMapFilteredWithOverflow([1, 4, 8])(mapKeys(statusMap),mapValues(statusMap)) as summap_overflow,
    toTypeName(summap_overflow)
FROM (
  SELECT
    c1::Date as date,
    c2::DateTime as timeslot,
    (c3::Array(UInt8), c4::Array(UInt8))::Map(UInt8, UInt8) as statusMap
  FROM VALUES(
    ('2000-01-01', '2000-01-01 00:00:00', [1, 2, 3], [10, 10, 10]),
    ('2000-01-01', '2000-01-01 00:00:00', [3, 4, 5], [10, 10, 10]),
    ('2000-01-01', '2000-01-01 00:01:00', [4, 5, 6], [10, 10, 10]),
    ('2000-01-01', '2000-01-01 00:01:00', [6, 7, 8], [10, 10, 10])
  )
)
```

Result:

```result
┌─summap_overflow──────┬─toTypeName(summap_overflow)───────┐
│ ([1,4,8],[10,20,10]) │ Tuple(Array(UInt8), Array(UInt8)) │
└──────────────────────┴───────────────────────────────────┘
```

Compare with `sumMapFiltered`:

```sql
SELECT
    sumMapFiltered([1, 4, 8])(mapKeys(statusMap),mapValues(statusMap)) as summap,
    toTypeName(summap)
FROM (
  SELECT
    c1::Date as date,
    c2::DateTime as timeslot,
    (c3::Array(UInt8), c4::Array(UInt8))::Map(UInt8, UInt8) as statusMap
  FROM VALUES(
    ('2000-01-01', '2000-01-01 00:00:00', [1, 2, 3], [10, 10, 10]),
    ('2000-01-01', '2000-01-01 00:00:00', [3, 4, 5], [10, 10, 10]),
    ('2000-01-01', '2000-01-01 00:01:00', [4, 5, 6], [10, 10, 10]),
    ('2000-01-01', '2000-01-01 00:01:00', [6, 7, 8], [10, 10, 10])
  )
)
```

Result:

```result
┌─summap───────────────┬─toTypeName(summap)─────────────────┐
│ ([1,4,8],[10,20,10]) │ Tuple(Array(UInt8), Array(UInt64)) │
└──────────────────────┴────────────────────────────────────┘
```

## sequenceNextNode

Identifies and returns the value of the next event in a sequence that matches a specified pattern, relative to a base event.

### Syntax

```sql
sequenceNextNode(direction, base)(timestamp, event_column, base_condition, event1, event2, event3, ...)
```

### Arguments

- `timestamp`: Date, DateTime, or UInt. The column containing event timestamps.
- `event_column`: String or Nullable(String). The column whose value should be returned for the next matched event.
- `base_condition`: UInt8. A boolean condition that the base event must fulfill.
- `event1`, `event2`, ...: UInt8. Boolean conditions describing the chain of events to match after the base event.

### Parameters

- `direction`: String. Specifies the search direction:
    - `'forward'`: Searches for events occurring after the base event.
    - `'backward'`: Searches for events occurring before the base event.
- `base`: String. Defines how the base event is determined:
    - `'head'`: The first event in the sequence that satisfies `base_condition`.
    - `'tail'`: The last event in the sequence that satisfies `base_condition`.
    - `'first_match'`: The first event that matches `event1` and satisfies `base_condition`.
    - `'last_match'`: The last event that matches `event1` and satisfies `base_condition`.

### Returns

The value from `event_column` of the next matched event, or `NULL` if no match is found or the next event does not exist.
Type: `Nullable(String)`.

### Example

Find the event that follows 'A' then 'B' in a sequence, starting from the head of the sequence.

```sql
SELECT
    id,
    sequenceNextNode('forward', 'head')(dt, page, page = 'A', page = 'A', page = 'B') AS next_flow
FROM (
    SELECT 1 AS dt, 1 AS id, 'A' AS page UNION ALL
    SELECT 2 AS dt, 1 AS id, 'B' AS page UNION ALL
    SELECT 3 AS dt, 1 AS id, 'C' AS page UNION ALL
    SELECT 4 AS dt, 1 AS id, 'D' AS page UNION ALL
    SELECT 5 AS dt, 1 AS id, 'E' AS page
) AS test_flow
GROUP BY id
```

Result:

```result
┌─id─┬─next_flow─┐
│  1 │ C         │
└────┴───────────┘
```
