---
title: "How to URL-encode query parameters in ClickHouse®"
excerpt: "Learn how to safely URL-encode query parameters in ClickHouse® using encodeURLFormComponent, including syntax, examples, and performance tips for web applications."
authors: "Cameron Archer"
categories: "AI Resources"
createdOn: "2025-09-30 17:19:22"
publishedOn: "2025-10-03 10:30:00"
updatedOn: "2025-10-03 10:30:00"
status: "published"
---

URL encoding in ClickHouse® requires a different approach than you might expect from other databases or programming languages. Instead of a generic `urlEncode` function, ClickHouse® provides `encodeURLFormComponent`, which follows web form standards and treats spaces differently than JavaScript's equivalent function.

This guide covers the syntax, behavior differences, performance considerations, and practical examples for safely encoding query parameters in your ClickHouse® workflows.

## What `encodeURLFormComponent` does

ClickHouse® doesn't have a built-in `urlEncode` function like you might expect, but it does provide `encodeURLFormComponent` to safely encode strings for URL query parameters. This function replaces characters that have special meaning in URLs with percent-encoded representations that browsers and servers can safely parse.

Without proper encoding, a query parameter like `search=coffee & tea` would break your URL structure since the ampersand has special meaning in query strings. The function handles this by converting problematic characters into their safe equivalents.

You'll find this useful when building APIs or processing user input that gets embedded in URLs. Rather than manually replacing each special character, `encodeURLFormComponent` handles the entire conversion process according to web standards.

## Syntax and basic example

The function signature is straightforward and takes any UTF-8 string as input:

```sql
encodeURLFormComponent(x: String) → String
```

### 1. Minimal select statement

Here's how the function handles various special characters in a single query:

```sql
SELECT encodeURLFormComponent('email=a+b@example.com & tags=music/rock');
```

This returns: `email%3Da%2Bb%40example.com+%26+tags%3Dmusic%2Frock`

Notice the transformation pattern: equals signs become `%3D`, plus signs become `%2B`, at symbols become `%40`, spaces become `+`, ampersands become `%26`, and forward slashes become `%2F`.

### 2. Insert-select pattern for batch encoding

You can encode entire columns during data ingestion or transformation. This approach works well when processing large datasets:

```sql
INSERT INTO target_table
SELECT
    id,
    encodeURLFormComponent(raw_param) AS encoded_param
FROM source_table;
```

Alternatively, create a new table with encoded values using a single statement:

```sql
CREATE TABLE encoded_params AS
SELECT
    id,
    encodeURLFormComponent(param) AS param_encoded
FROM source_table;
```

## How it differs from JavaScript `encodeURIComponent`

The differences between ClickHouse®'s `encodeURLFormComponent` and JavaScript's `encodeURIComponent` matter when you're migrating logic between client and server environments.

### Space character handling

The most noticeable difference involves how each function treats spaces. ClickHouse® follows the application/x-www-form-urlencoded standard, which uses plus signs for spaces. JavaScript uses percent-encoding for everything, including spaces.

### Plus sign versus percent-20

This comparison shows the key difference:

```sql
-- ClickHouse®
SELECT encodeURLFormComponent('a b+c');  -- Returns: 'a+b%2Bc'
```

```javascript
// JavaScript
encodeURIComponent('a b+c');             // Returns: 'a%20b%2Bc'
```

In ClickHouse®, spaces become `+` while literal plus signs become `%2B`. JavaScript percent-encodes both spaces and plus signs differently, using `%20` for spaces and `%2B` for plus signs.

### Reserved character set

The functions also differ in their approach to "safe" characters:

- **ClickHouse® `encodeURLFormComponent`:** Focuses on form-style query strings, encoding characters like `=`, `&`, `+`, and most non-alphanumerics while treating spaces as `+`
- **JavaScript `encodeURIComponent`:** Encodes all characters except `A-Z a-z 0-9 - _ . ! ~ * ' ( )` and uses `%20` for spaces instead of `+`

If you need percent-20 encoding for spaces in ClickHouse®, `encodeURL` might be more appropriate than `encodeURLFormComponent`.

## Decoding with `decodeURLComponent`

Use `decodeURLComponent` to reverse the encoding process. This function interprets `+` as space and converts percent escapes back to their original UTF-8 characters.

### Round-trip verification query

Test that your encoding and decoding work correctly:

```sql
SELECT
    s,
    encodeURLFormComponent(s) AS enc,
    decodeURLComponent(enc) AS round_trip
FROM
(
    SELECT 'a b+c & €' AS s
);
```

The `round_trip` column returns the original string, confirming that the encode-decode process preserves your data integrity.

## Handling arrays, UTF-8 and nulls

The function works with UTF-8 strings and integrates well with ClickHouse®'s array functions. However, you'll want to consider multibyte characters and null handling in your schemas.

### UTF-8 multibyte characters

International characters and emojis get percent-encoded as their UTF-8 byte sequences:

```sql
SELECT encodeURLFormComponent('café ☕');  -- Returns: 'caf%C3%A9+%E2%98%95'
```

Each multibyte character becomes multiple percent-encoded bytes, which follows the correct URL encoding standards.

### Null and empty string behavior

The function handles edge cases predictably:

- **NULL values:** Stay NULL if your column is `Nullable(String)`. Wrap with `coalesce` if you prefer empty strings:

  ```sql
  SELECT encodeURLFormComponent(COALESCE(col, '')) FROM table_name;
  ```

- **Empty strings:** Pass through unchanged as `''`

### Encoding an array column

Apply encoding to each array element using `arrayMap`:

```sql
SELECT arrayMap(x -> encodeURLFormComponent(x), params) AS params_encoded
FROM events;
```

This pattern works well when you have multiple parameters stored as an array that all need encoding.

## Avoiding double encoding in queries

Applying encoding multiple times corrupts your data by turning `%` into `%25` and breaking downstream consumers. The key is encoding values exactly once at the boundary where they enter URL contexts.

### Detecting pre-encoded values

Use pattern matching to check if strings might already be encoded:

```sql
SELECT
    s,
    countRegexp(s, '%[0-9A-Fa-f]{2}') AS pct_escapes,
    position(s, '+') AS has_plus
FROM sample_table;
```

High `pct_escapes` counts or unexpected `+` characters often indicate pre-encoded values.

### Safe parameter ordering pattern

Follow these practices to prevent double encoding:

- **Store raw data unencoded** in your tables
- **Encode at read-time** when constructing URLs or API responses, or encode once at ingestion in a dedicated column
- **Centralize encoding** in a single view or [materialized view](https://tinybird.co/docs/forward/work-with-data/optimize/materialized-views) to prevent accidental duplication
- **Avoid encoding** the same field both during ingestion and again in downstream queries

## Performance considerations for large batches

Encoding is CPU-bound and scales with input length and the number of characters requiring escapes. The function performs efficiently for most workloads.

### CPU cost benchmarks

Relative to other string functions, `encodeURLFormComponent` performs similarly to `replaceAll` or `lowerUTF8`. It typically costs less than heavy regex operations and processes hundreds of MB/s per core, depending on your hardware and character distribution.

Monitor actual performance using `system.query_log` to measure costs for your specific dataset and usage patterns.

### Materialized view strategy

For frequently accessed data, consider pre-encoding during ingestion to shift CPU cost away from query time:

```sql
CREATE TABLE raw_events (
    id UInt64,
    param String,
    timestamp DateTime
) ENGINE = MergeTree
ORDER BY (timestamp, id);

CREATE MATERIALIZED VIEW mv_events_encoded TO events_encoded AS
SELECT
    id,
    encodeURLFormComponent(param) AS param_encoded,
    timestamp
FROM raw_events;
```

This approach keeps queries fast by doing encoding work once during ingestion rather than repeatedly during queries.

## Building ClickHouse®-based web analytics APIs with Tinybird

Let's create a complete example that demonstrates how to build a real-time API for URL parameter encoding using [Tinybird's](https://www.tinybird.co) managed ClickHouse® platform.

This walkthrough shows how to ingest web analytics data, process URL parameters with ClickHouse® functions, and expose encoded parameter analytics through a real-time API endpoint.

### 1. Create the data source

First, create a data source to store web analytics events:

```tinybird
SCHEMA >
    `timestamp` DateTime64(3) `json:$.timestamp`,
    `user_id` String `json:$.user_id`,
    `page_url` String `json:$.page_url`,
    `search_query` String `json:$.search_query`,
    `utm_source` String `json:$.utm_source`,
    `utm_medium` String `json:$.utm_medium`

ENGINE MergeTree
ENGINE_SORTING_KEY timestamp
ENGINE_PARTITION_KEY toYYYYMM(timestamp)
```

Now build the project locally:

```bash
tb dev
```

This builds the project on [Tinybird Local](https://www.tinybird.co/docs/forward/install-tinybird/local) and creates your data source. Then ingest some sample data using the Events API:

```bash
curl -X POST "http://localhost:7181/v0/events?name=web_analytics" \
  -H "Content-Type: application/json" \
  -d '{"timestamp": "2025-01-15 14:30:15.123", "user_id": "user123", "page_url": "https://shop.example.com/search?q=coffee & tea", "search_query": "coffee & tea", "utm_source": "google", "utm_medium": "cpc"}
{"timestamp": "2025-01-15 14:30:16.456", "user_id": "user456", "page_url": "https://shop.example.com/products?category=electronics&brand=apple", "search_query": "", "utm_source": "facebook", "utm_medium": "social"}
{"timestamp": "2025-01-15 14:30:17.789", "user_id": "user789", "page_url": "https://shop.example.com/about", "search_query": "", "utm_source": "", "utm_medium": ""}'
```

### 2. Create the pipe files

Create a pipe that processes URL data and provides analytics:

```tinybird
NODE web_events
SQL >
    SELECT
        timestamp,
        user_id,
        page_url,
        search_query,
        utm_source,
        utm_medium,
        encodeURLFormComponent(search_query) AS encoded_search,
        encodeURLFormComponent(utm_source) AS encoded_utm_source,
        encodeURLFormComponent(utm_medium) AS encoded_utm_medium
    FROM web_analytics
    WHERE timestamp >= now64(3) - INTERVAL 1 HOUR

NODE url_analytics
SQL >
    %
    SELECT
        encoded_search,
        encoded_utm_source,
        encoded_utm_medium,
        COUNT(*) AS search_count,
        COUNT(DISTINCT user_id) AS unique_users,
        MIN(timestamp) AS first_seen,
        MAX(timestamp) AS last_seen
    FROM web_events
    WHERE
        {\% if defined(utm_source) %}
            encoded_utm_source = {{ String(utm_source) }}
        {\% end %}
        {\% if defined(hours_back) %}
            AND timestamp >= now64(3) - INTERVAL {{ Int32(hours_back) }} HOUR
        {\% end %}
    GROUP BY encoded_search, encoded_utm_source, encoded_utm_medium
    ORDER BY search_count DESC
    LIMIT {{ Int32(limit, 100) }}

TYPE endpoint
```

### 3. Deploy and test the API

Deploy your project to Tinybird Cloud:

```bash
tb --cloud deploy
```

Then test your API endpoint:

```bash
# Get all search analytics from the last hour
curl -H "Authorization: Bearer <your_token>" \
  "https://api.tinybird.co/v0/pipes/url_analytics.json"

# Filter by UTM source
curl -H "Authorization: Bearer <your_token>" \
  "https://api.tinybird.co/v0/pipes/url_analytics.json?utm_source=google"

# Filter by time range
curl -H "Authorization: Bearer <your_token>" \
  "https://api.tinybird.co/v0/pipes/url_analytics.json?hours_back=2"
```

### 4. Sample API response

The API returns structured data with encoded parameter analytics:

```json
{
  "meta": [
    {
      "name": "encoded_search",
      "type": "String"
    },
    {
      "name": "encoded_utm_source",
      "type": "String"
    },
    {
      "name": "encoded_utm_medium",
      "type": "String"
    },
    {
      "name": "search_count",
      "type": "UInt64"
    },
    {
      "name": "unique_users",
      "type": "UInt64"
    },
    {
      "name": "first_seen",
      "type": "DateTime64(3)"
    },
    {
      "name": "last_seen",
      "type": "DateTime64(3)"
    }
  ],
  "data": [
    {
      "encoded_search": "coffee+%26+tea",
      "encoded_utm_source": "google",
      "encoded_utm_medium": "cpc",
      "search_count": 1247,
      "unique_users": 892,
      "first_seen": "2025-01-15 14:30:15.123",
      "last_seen": "2025-01-15 15:45:22.789"
    },
    {
      "encoded_search": "",
      "encoded_utm_source": "facebook",
      "encoded_utm_medium": "social",
      "search_count": 456,
      "unique_users": 234,
      "first_seen": "2025-01-15 14:30:16.456",
      "last_seen": "2025-01-15 15:30:18.345"
    }
  ],
  "rows": 2,
  "statistics": {
    "elapsed": 0.045,
    "rows_read": 2000,
    "bytes_read": 89000
  }
}
```

## Related ClickHouse® URL functions

ClickHouse® includes several URL utilities that complement `encodeURLFormComponent` for different parts of URL handling workflows.

### `encodeURL` for full URLs

Use `encodeURL` when you need percent-encoding for entire URLs or specifically want spaces as `%20` rather than `+`. This function suits path components and non-form contexts where the plus-as-space convention doesn't apply.

### `extractURLParameter` for retrieval

Extract parameter values from complete URLs using `extractURLParameter(url, name)`. Combine it with `decodeURLComponent` to properly interpret plus-as-space and percent escapes:

```sql
SELECT decodeURLComponent(extractURLParameter(url, 'q')) AS search_term
FROM page_views;
```

This pattern lets you safely extract and decode URL parameters that were encoded with `encodeURLFormComponent`.

## Next steps for web analytics APIs with ClickHouse®

Managing ClickHouse® infrastructure for URL parameter encoding involves complexity—from optimizing storage and query performance to handling encoding edge cases and maintaining consistent data quality across distributed systems.

Tinybird's [managed ClickHouse® service](https://www.tinybird.co/clickhouse) reduces infrastructure overhead by providing a managed ClickHouse® environment that abstracts optimization, scaling, and maintenance. Tinybird exposes ClickHouse® features such as `encodeURLFormComponent()` and URL parsing without requiring direct infrastructure management, including database clusters, performance monitoring, or version upgrades.

This enables teams to focus on application development rather than database operations or configuring ClickHouse® for production workloads. Tinybird's platform supports developer efficiency through managed scaling and workflow integrations.

[Sign up for a free Tinybird account](https://cloud.tinybird.co/signup) to build and test your first ClickHouse®-based API in just a few minutes.

## Additional resources

- [encodeURLFormComponent function](https://clickhouse.com/docs/en/sql-reference/functions/url-functions#encodeurlformcomponent)
- [decodeURLComponent function](https://clickhouse.com/docs/en/sql-reference/functions/url-functions#decodeurlcomponent)
- [encodeURL function](https://clickhouse.com/docs/en/sql-reference/functions/url-functions#encodeurl)
- [extractURLParameter function](https://clickhouse.com/docs/en/sql-reference/functions/url-functions#extracturlparameter)
- [URL functions overview](https://clickhouse.com/docs/en/sql-reference/functions/url-functions)
- [How to decode URL-encoded strings in ClickHouse®](https://www.tinybird.co/blog-posts/how-to-decode-url-encoded-strings-in-clickhouse)
- [Extract query parameter in ClickHouse®](https://www.tinybird.co/blog-posts/extract-query-parameter-clickhouse)
- [Extract URL fragment in ClickHouse®](https://www.tinybird.co/blog-posts/extract-url-fragment-clickhouse)

## FAQs about URL-encoding in ClickHouse®

### Can I run `encodeURLFormComponent` inside an INSERT statement?

Yes, you can use it directly in `INSERT ... SELECT` statements to encode during ingestion, or in a materialized view to automatically populate an encoded column as new data arrives.

### Which ClickHouse® version added `encodeURLFormComponent`?

The function was introduced in ClickHouse® 21.4 and is available in all recent stable versions, including the managed ClickHouse® provided by Tinybird.
