When you're building search features on top of ClickHouse, you typically want searches to match case insensitive results. Users expect to find results for iPhones regardless of whether they type "iPhone" or "iphone". The ILIKE
operator handles this case-insensitive matching automatically, letting you search for text patterns without worrying about capitalization.
This guide covers how to perform case insensitive text matching in ClickHouse using ILIKE
, when to use ILIKE
instead of other string matching functions, and how to optimize text search query performance with specialized indexes. You'll also see how to build a production-ready search API using Tinybird's managed ClickHouse platform.
What is ILIKE in ClickHouse
The ILIKE
operator in ClickHouse performs pattern matching on strings without caring about uppercase or lowercase letters. When you search for 'apple' using ILIKE
, it will match 'Apple', 'APPLE', 'aPpLe', and any other case variation in your data.
This operator works exactly like the LIKE
operator, except it ignores case differences. You can use wildcard characters to create flexible search patterns: the percent sign %
matches any number of characters (including zero), while the underscore _
matches exactly one character.
SELECT product_name
FROM products
WHERE product_name ILIKE '%apple%';
-- "apple", "Apple", "aPpLe", "Apple MacBook Pro", "Red apples"
This query returns all product names containing "apple" regardless of how the letters are capitalized. The %
wildcards on both sides tell ClickHouse to find "apple" anywhere in the string.
ILIKE operator versus ilike function
ClickHouse gives you two ways to write the same case-insensitive search: column ILIKE pattern
or ilike(column, pattern)
. Both produce identical results, but the operator syntax reads more naturally if you're used to SQL.
-- Operator syntax
SELECT name FROM users WHERE name ILIKE 'john%';
-- Function syntax
SELECT name FROM users WHERE ilike(name, 'john%');
Most developers find the operator form easier to read, especially when you're chaining multiple conditions together. In addition, the operator is commonly used across other SQL dialects (e.g. PostgreSQL).
Operator rewrite during parsing
When you write column ILIKE pattern
, ClickHouse automatically converts it to ilike(column, pattern)
while parsing your query. This transformation happens before execution, so there's no performance difference between the two forms.
The operator syntax exists purely for readability. Under the hood, ClickHouse treats both forms identically.
Why aliases matter for readability
The operator form makes your queries cleaner when you're combining multiple search conditions. Compare these two examples:
-- Easier to scan
WHERE email ILIKE '%@gmail.com' AND name ILIKE 'sarah%'
-- Harder to parse quickly
WHERE ilike(email, '%@gmail.com') AND ilike(name, 'sarah%')
If other people will read your queries, the operator syntax saves them a few seconds of mental parsing. Those seconds add up across a large codebase.
Full syntax and wildcard rules
The ILIKE
operator takes a column and a pattern string. The pattern can include regular text characters plus two special wildcards: %
and _
.
ClickHouse reads your pattern from left to right, matching characters against the string value. When it hits a wildcard, the matching behavior changes based on which wildcard you used.
Percent and underscore wildcards
The %
wildcard matches zero or more characters of any kind. You can put it at the start, middle, or end of your pattern.
-- Matches names starting with 'john'
SELECT name FROM users WHERE name ILIKE 'john%';
-- Matches names ending with 'smith'
SELECT name FROM users WHERE name ILIKE '%smith';
-- Matches names containing 'anne' anywhere
SELECT name FROM users WHERE name ILIKE '%anne%';
The _
wildcard matches exactly one character. This works well when you know the string length but not all the characters.
-- Matches 'cat', 'car', 'can'
SELECT word FROM dictionary WHERE word ILIKE 'ca_';
-- Matches phone numbers like '555-1234'
SELECT phone FROM contacts WHERE phone ILIKE '555-____';
Escaping special characters
When your data contains literal %
or _
characters that you want to search for, put a backslash \
before them. Without the backslash, ClickHouse interprets them as wildcards.
-- Search for strings with a literal percent sign
SELECT description FROM products WHERE description ILIKE '%50\% off%';
-- Search for strings with an underscore
SELECT filename FROM files WHERE filename ILIKE '%test\_%';
Handling UTF-8 and case folding
ClickHouse performs Unicode case conversion for ILIKE
operations, which means it handles international characters correctly. The case conversion follows Unicode standards for Latin, Cyrillic, Greek, and most other alphabets.
-- Matches 'Müller', 'müller', 'MÜLLER'
SELECT name FROM customers WHERE name ILIKE 'müller';
For more precise control over case conversion in edge cases, you can still use lowerUTF8()
with the LIKE
operator instead.
-- Matches 'Müller', 'müller', 'MÜLLER'
SELECT name FROM customers WHERE lowerUTF8(name) LIKE 'müller';
ILIKE vs LIKE vs match for case-insensitive search
ClickHouse offers three main approaches for pattern matching in strings. Each has different tradeoffs in terms of flexibility and speed.
The LIKE
operator does case-sensitive matching. It runs faster than ILIKE
because it skips the case conversion step, but it only finds exact case matches.
The match()
function uses regular expressions for pattern matching. Regular expressions give you more flexibility than SQL wildcards, but they typically run slower on large datasets.
Function | Case Sensitivity | Pattern Type | Best For |
---|---|---|---|
LIKE | Sensitive | SQL wildcards | Known case patterns |
ILIKE | Insensitive | SQL wildcards | Simple flexible search |
match() | Configurable | Regex | Complex patterns |
Performance benchmarks
The LIKE
operator generally runs faster than ILIKE
because it doesn't convert case. The difference becomes noticeable mainly on tables with billions of rows.
When you combine lower()
or lowerUTF8()
with LIKE
, you're essentially doing what ILIKE
does internally. The performance is similar, but ILIKE
produces cleaner code.
-- Similar performance
WHERE lower(email) LIKE '%gmail.com'
WHERE email ILIKE '%gmail.com'
When to prefer match or multiMatch
The match()
function works better for complex patterns that SQL wildcards can't express, though regex-based functions cannot leverage ngrambf_v1 indexes. For example, validating email formats or matching multiple domain patterns at once.
-- Use match() for complex patterns
WHERE match(email, '^[a-z0-9]+@(gmail|yahoo)\.com$')
-- Use ILIKE for simple patterns
WHERE email ILIKE '%@gmail.com'
The multiMatchAny()
function checks multiple patterns in one pass, which can be faster than chaining several ILIKE
conditions with OR
.
Speeding up ILIKE with ngram and tokenbf indexes
ClickHouse provides specialized indexes for text search that can dramatically speed up ILIKE
queries on large tables. Without an index, ILIKE
scans every row even on tables with billions of entries.
The index types that work with ILIKE
pre-process your string data to enable faster substring lookups. They don't work for all query patterns, but when they do work, the speedup can be substantial.
Creating an ngrambf_v1 index
The ngrambf_v1
index type creates a bloom filter based on character n-grams. An n-gram is just a sequence of n characters, like "appl" or "pple" from "apple".
CREATE TABLE products (
id UInt32,
name String,
INDEX name_ngram name TYPE ngrambf_v1(4, 512, 2, 0) GRANULARITY 1
) ENGINE = MergeTree()
ORDER BY id;
Similar configurations have achieved 88× speedup over non-indexed searches in production workloads.
Query planner behavior
ClickHouse's query optimizer decides whether to use an n-gram index based on your pattern. The optimizer uses the index when the pattern includes enough non-wildcard characters to make index lookups worthwhile.
-- Index can help here
WHERE name ILIKE '%phone%'
-- Index can help here too
WHERE name ILIKE 'apple%'
You can check whether ClickHouse plans to use your index by running EXPLAIN
before your query.
Common mistakes that slow down ILIKE queries
Several patterns can make ILIKE
queries run slowly even with proper indexing. Knowing these pitfalls helps you write faster queries.
Leading wildcard explosion
Patterns starting with %
prevent some optimizations. The query planner can still use n-gram indexes for these patterns, but the performance gain is smaller compared to patterns with known prefixes.
-- Less efficient even with index
WHERE name ILIKE '%phone'
-- More efficient with index
WHERE name ILIKE 'phone%'
The leading wildcard prevents ClickHouse from eliminating rows based on how the string starts.
-- Slower: leading wildcard
WHERE description ILIKE '%warranty%'
-- Faster: known prefix
WHERE description ILIKE 'warranty%'
For the best performance, try to structure your queries to avoid leading wildcards when possible. If you frequently search with leading wildcards, consider storing a reversed version of the text in a separate column.
Data type mismatches
Applying ILIKE
to non-string columns forces ClickHouse to convert the data type before matching. This conversion happens for every row and adds overhead.
-- Inefficient: converts numbers to strings
WHERE user_id ILIKE '%123%'
-- Better: explicit conversion
WHERE toString(user_id) ILIKE '%123%'
-- Best: use numeric comparison
WHERE user_id = 123
When working with numeric or date columns, use the appropriate comparison operators instead of pattern matching.
Negating ILIKE incorrectly
Using NOT ILIKE
can prevent index usage and force full table scans. ClickHouse cannot optimize negative operators due to Bloom filter limitations.
-- Slower: negative condition
WHERE email NOT ILIKE '%@spam.com'
-- Consider alternatives
WHERE domain != 'spam.com'
If you need to exclude certain patterns frequently, consider adding a boolean column or separate table to track exclusions. You can check function support for different search indexes in Tinybird.
End-to-end example: build a parameterized ILIKE API with Tinybird
Tinybird is a managed ClickHouse service that lets you create APIs from SQL queries without managing infrastructure. This example shows how to build a searchable API using ILIKE
for case-insensitive text matching.
First, install the Tinybird CLI and start the local development environment:
curl -L tinybird.co | sh
tb login
tb local start
Ingest sample data source
Create a file named products.ndjson
with sample product data:
{"product_id": 1, "name": "iPhone 15 Pro", "category": "Electronics", "description": "Latest Apple smartphone"}
{"product_id": 2, "name": "MacBook Air", "category": "Electronics", "description": "Lightweight laptop from Apple"}
{"product_id": 3, "name": "Samsung Galaxy S24", "category": "Electronics", "description": "Android flagship phone"}
{"product_id": 4, "name": "Dell XPS 13", "category": "Electronics", "description": "Windows ultrabook"}
{"product_id": 5, "name": "iPad Pro", "category": "Electronics", "description": "Apple tablet with M2 chip"}
Create the data source and load the data:
tb datasource create products --file products.ndjson
tb build
tb datasource append products --file products.ndjson
Create the pipe with a pattern parameter
Create a file named search_products.pipe
that defines your search API:
TOKEN search_products_read READ
DESCRIPTION >
Search products by name or description using case-insensitive matching
NODE search_products_node
SQL >
%
SELECT
product_id,
name,
category,
description,
sum(amount) AS total_revenue
FROM products
WHERE
name ILIKE {{String(pattern, '%')}}
OR description ILIKE {{String(pattern, '%')}}
GROUP BY product_id, name, category, description
ORDER BY product_id
LIMIT {{Int32(limit, 10)}}
TYPE endpoint
The {{String(pattern)}}
syntax, part of Tinybird's templating language, creates a query parameter that API consumers can pass when calling your endpoint.
The default value %
returns all products if no pattern is provided.
Deploy and query the API
Deploy your pipe to Tinybird Cloud to create a hosted API:
tb --cloud deploy
Get the API endpoint URL:
tb --cloud endpoint url search_products --language curl
This outputs a curl command you can use to query your API:
curl -X GET "https://api.us-east.tinybird.co/v0/pipes/search_products.json?pattern=%apple25apple%25&limit=5&token=p.ey...token"
The API returns JSON with matching products:
{
"meta": [
{"name": "product_id", "type": "Int64"},
{"name": "name", "type": "String"},
{"name": "category", "type": "String"},
{"name": "description", "type": "String"}
{"name": "total_revenue", "type": "Float64"}
],
"data": [
{"product_id": 1, "name": "iPhone 15 Pro", "category": "Electronics", "description": "Latest Apple smartphone", "total_revenue": 10295921.59},
{"product_id": 2, "name": "MacBook Air", "category": "Electronics", "description": "Lightweight laptop from Apple", "total_revenue": 2859059.19},
{"product_id": 5, "name": "iPad Pro", "category": "Electronics", "description": "Apple tablet with M2 chip", "total_revenue": 7401298.10}
],
"rows": 3 ...
}
You can change the pattern
parameter to search for different terms. The ILIKE
operator handles all case variations automatically.
Next steps to ship real-time search faster
Tinybird eliminates the infrastructure complexity of running ClickHouse clusters. Instead of spending weeks setting up servers, configuring replication, and tuning performance, you can focus on building search and real-time analytics features.
The platform handles data ingestion, query optimization, and API hosting in one integrated service. This means you go from raw data to a production API in minutes instead of weeks.
Sign up for a free Tinybird account to start building case-insensitive search APIs with ClickHouse. The free tier includes enough resources to build and test real applications.
FAQs about case-insensitive search in ClickHouse
How do I perform ILIKE against multiple patterns in ClickHouse?
You can chain multiple ILIKE
conditions with OR
operators to match any of several patterns. For better performance with many patterns, consider using multiMatchAny()
with regular expressions that include case-insensitive flags.
-- Multiple ILIKE conditions
WHERE name ILIKE '%apple%' OR name ILIKE '%samsung%' OR name ILIKE '%google%'
-- Alternative with multiMatchAny
WHERE multiMatchAny(name, ['(?i)apple', '(?i)samsung', '(?i)google'])
Does ILIKE work with Cyrillic or other non-Latin alphabets in ClickHouse?
Yes, ILIKE
supports UTF-8 case folding for most Unicode characters including Cyrillic, Greek, Arabic, and other alphabets. ClickHouse automatically handles case conversion for international character sets during pattern matching.
-- Works with Cyrillic text
WHERE city ILIKE '%москва%' -- matches 'Москва', 'МОСКВА', 'москва'