ClickHouse was originally created by a web company, so you shouldn't be surprised to find that it contains many functions to help you work with URLs.
Get the protocol
You can easily extract the protocol using the built-in function
protocol(url) as protocol
│ https │
Get the domain
You can also extract the domain using the
domain(url) as protocol
│ www.tinybird.co │
If you want to excude the
www from the domain, use the `domainWithoutWWW() function.
domainWithoutWWW(url) as protocol
│ tinybird.co │
There's also a handful of functions for working with subdomains, such as:
Get URL query parameters
You might have experienced just how painful it can be to work with URL Query Parameters in other databases, but ClickHouse makes it easy with the
Imagine you have a URL like the following:
You want to extract the UTM parameters to analyse your marketing campaigns.
Here's how you can achieve that with ClickHouse:
extractURLParameter(url, 'utm_source') as utm_source,
extractURLParameter(url, 'utm_medium') as utm_medium,
extractURLParameter(url, 'utm_campaign') as utm_campaign,
extractURLParameter(url, 'utm_content') as utm_content
│ youtube │ referral │ livessessions │ description │