When your application stores dates as integers like 20241215 or strings like '20241215143022', ClickHouse treats them as regular numbers rather than actual dates. This means you can't use date functions, get poor query performance, and miss out on specialized compression that can reduce storage by 30-50%.
ClickHouse provides four specialized functions that convert numeric formats into proper date types:
YYYYMMDDToDate()
YYYYMMDDToDate32()
YYYYMMDDhhmmssToDateTime()
YYYYMMDDhhmmssToDateTime64()
We'll cover how each function works, when to use them, performance optimization strategies, and how to handle edge cases like invalid dates and timezone conversions.
Why convert numeric dates to ClickHouse date types
While ClickHouse does give you four specific functions to turn numeric values into proper date types: YYYYMMDDToDate()
, YYYYMMDDToDate32()
, YYYYMMDDhhmmssToDateTime()
, and YYYYMMDDhhmmssToDateTime64()
, you're really better off storing dates from your application as date strings rather than integers.
If your dates are stored as integers like 20241215 or strings like '20241215143022', ClickHouse treats them as regular numbers or text, which makes date operations slow and limits what you can do with them.
But, if you don't want to change the way you initial store dates from your application, then converting to proper date types unlocks ClickHouse's built-in date functions and dramatically improves query performance. You'll also get better compression—proper date columns can be 30-50% smaller than their numeric equivalents because ClickHouse can apply specialized compression algorithms.
Also, conversion gives you access to hundreds of date functions like addDays()
, dateDiff()
, and formatDateTime()
that only work with actual date types.
Choosing between YYYYMMDDToDate and YYYYMMDDToDate32
Both functions parse YYYYMMDD format, but they handle different date ranges and use different amounts of storage space. The choice between them depends on your data's date range and storage requirements.
Date range support
YYYYMMDDToDate()
covers dates from 1900-01-01 to 2299-12-31, while YYYYMMDDToDate32()
only goes from 1900-01-01 to 2106-02-07. For most applications, the regular Date type's broader range makes it the safer choice.
SELECT YYYYMMDDToDate(19001231); -- Works: 1900-12-31
SELECT YYYYMMDDToDate32(21070101); -- Fails: beyond Date32 range
Storage size and codec impact
Date uses 2 bytes per value while Date32 uses 4 bytes. When you're storing millions of date values, Date32 will consume exactly twice the storage space, which adds up quickly in large datasets.
The smaller storage footprint also means better cache utilization and faster queries, especially for analytical workloads that scan large date ranges.
Choosing between YYYYMMDDhhmmssToDateTime and YYYYMMDDhhmmssToDateTime64
When parsing datetime values in YYYYMMDDhhmmss format, your choice comes down to precision requirements and whether you need sub-second accuracy.
Precision needs in analytics
YYYYMMDDhhmmssToDateTime()
gives you second-level precision, which works well for most logging, event tracking, and business analytics. YYYYMMDDhhmmssToDateTime64()
supports microsecond precision, making it essential for high-frequency trading data, IoT sensor readings, or any application where milliseconds matter.
SELECT YYYYMMDDhhmmssToDateTime(20241215143022); -- 2024-12-15 14:30:22
SELECT YYYYMMDDhhmmssToDateTime64(20241215143022, 3); -- 2024-12-15 14:30:22.000
Timezone awareness
Both functions assume UTC by default, but you can specify a timezone as the final parameter. This becomes critical when parsing timestamps from different geographic regions.
SELECT YYYYMMDDhhmmssToDateTime(20241215143022, 'America/New_York');
SELECT YYYYMMDDhhmmssToDateTime64(20241215143022, 3, 'Europe/London');
Function signatures and accepted data types
Each function accepts specific input types and returns different ClickHouse date types. Understanding the exact syntax helps you avoid common parsing errors.
UInt32 input
All four functions work with UInt32 values as their primary input. Integer literals and UInt32 columns work directly without casting, which is the most efficient approach.
SELECT YYYYMMDDToDate(20241215);
SELECT YYYYMMDDhhmmssToDateTime(20241215143022);
String input
ClickHouse automatically converts string representations to the appropriate numeric type, but this creates performance overhead with large datasets. Converting strings to integers during data ingestion rather than query time gives better performance.
Return types
Each function returns a specific ClickHouse type: Date, Date32, DateTime, or DateTime64. The DateTime64 function accepts a precision parameter (0-9) that specifies decimal places for fractional seconds, with 3 being the most common choice for millisecond precision.
Step-by-step conversion of a column
Converting existing numeric date columns to proper date types requires careful planning to avoid data loss and minimize downtime. Here's how to approach it systematically.
1. Inspect existing schema
Start by examining your current table structure and sampling the data to understand the format and identify any edge cases or invalid values.
DESCRIBE TABLE events;
SELECT date_column, COUNT(*)
FROM events
WHERE date_column < 19000101 OR date_column > 22991231
GROUP BY date_column;
2. Backfill with ALTER TABLE UPDATE
Use ALTER TABLE UPDATE
to convert existing data, but be aware that this operation can be resource-intensive for large tables. Monitor the progress using the system.mutations
table.
ALTER TABLE events
UPDATE date_column = YYYYMMDDToDate(date_column)
WHERE date_column IS NOT NULL;
3. Add a materialized view for new inserts
Create a materialized view that automatically parses incoming numeric dates, ensuring new data gets converted without additional application changes.
CREATE MATERIALIZED VIEW events_parsed
ENGINE = MergeTree()
ORDER BY parsed_date
AS SELECT
YYYYMMDDToDate(raw_date) AS parsed_date,
event_type,
user_id
FROM events_raw;
4. Validate results
After conversion, run validation queries to check accuracy and identify any parsing failures that resulted in null values or unexpected dates.
SELECT
COUNT(*) AS total_rows,
COUNT(parsed_date) AS valid_dates,
COUNT(*) - COUNT(parsed_date) AS null_dates
FROM events_parsed;
Performance tips for bulk parses
Large-scale date parsing operations can slow down queries, but several optimization approaches can minimize this overhead.
Minimize on-the-fly casting
Avoid parsing numeric dates in WHERE clauses or SELECT statements during query execution. Instead, convert dates during data ingestion or use pre-computed columns to eliminate parsing overhead from analytical queries.
-- Slow: parsing during query
SELECT * FROM events WHERE YYYYMMDDToDate(date_int) > '2024-01-01';
-- Fast: use pre-parsed column
SELECT * FROM events WHERE date_parsed > '2024-01-01';
Use LowCardinality where possible
When your date ranges are limited (like daily partitions or monthly reports), wrap parsed dates in LowCardinality
to improve compression and query performance.
ALTER TABLE events
ADD COLUMN date_parsed LowCardinality(Date)
DEFAULT YYYYMMDDToDate(date_int);
Benchmark with system query log
Use system.query_log
to measure parsing performance and identify bottlenecks in your date conversion pipeline. Look for queries with high memory usage or long execution times.
Alternatives for ClickHouse string to datetime parsing
While the YYYYMMDD functions work well for standardized numeric formats, ClickHouse offers other parsing options for more flexible date handling.
parseDateTimeBestEffort variants
parseDateTimeBestEffort()
handles dozens of date formats automatically, making it useful when your input data contains mixed formats or when you're unsure of the exact format. However, it's slower than the specialized YYYYMMDD functions.
SELECT parseDateTimeBestEffort('2024-12-15 14:30:22');
SELECT parseDateTimeBestEffort('Dec 15, 2024');
SELECT parseDateTimeBestEffort('20241215143022');
toDateTime with custom format
For non-standard numeric layouts or when you want precise control over parsing, parseDateTime()
with format strings offers maximum flexibility at the cost of slightly reduced performance.
SELECT parseDateTime('15/12/2024 14:30:22', '%d/%m/%Y %H:%M:%S');
SELECT parseDateTime('241215', '%y%m%d');
Handling nulls timezones and invalid values
Real-world data often contains formatting errors or impossible dates, so robust date parsing requires careful error handling to prevent query failures.
Use OrNull variants
Functions like YYYYMMDDToDateOrNull()
return NULL for invalid dates instead of throwing exceptions. This approach is essential for processing data that might contain formatting errors.
SELECT
raw_date,
YYYYMMDDToDateOrNull(raw_date) AS parsed_date
FROM events
WHERE YYYYMMDDToDateOrNull(raw_date) IS NULL; -- Find invalid dates
Combine with assumeNotNull
When you've validated your data and want to maintain non-nullable column types, use assumeNotNull()
to convert nullable results back to regular types.
SELECT assumeNotNull(YYYYMMDDToDateOrNull(validated_date)) AS clean_date
FROM events
WHERE YYYYMMDDToDateOrNull(validated_date) IS NOT NULL;
The key difference between regular functions and OrNull variants:
- Regular functions: Throw exceptions on invalid input, best for clean validated data
- OrNull functions: Return NULL for invalid dates, essential for dirty data that needs validation
- assumeNotNull: Converts nullable results back to non-nullable types after validation
Building a real-time analytics API over time series data
When your goal is to integrate analytics on time series data back into your application, you need to create an API. Tinybird makes it easy to build ClickHouse-based API endpoints with built in DateTime parsing. Tinybird's managed ClickHouse platform handles the infrastructure complexity while you focus on building features. Here's an example walkthrough building an analytics API with Tinybird.
Create a data source from raw feed
Let's assume your app produces numeric datetimes. You can set up a Tinybird data source with numeric date columns, defining the schema to accept integer date values that will be parsed either during query time or through materialized transformations.
SCHEMA >
event_id UInt64,
event_date UInt32,
event_timestamp UInt64,
user_id String,
event_type String
ENGINE MergeTree
ORDER BY (event_date, user_id)
Define a pipe to parse and filter
Write a pipe that converts numeric dates and applies filters, transforming raw integer dates into proper ClickHouse date types for efficient querying and API responses.
NODE events_parsed
SQL >
SELECT
event_id,
YYYYMMDDToDate(event_date) AS parsed_date,
YYYYMMDDhhmmssToDateTime(event_timestamp) AS parsed_timestamp,
user_id,
event_type
FROM raw_events
WHERE YYYYMMDDToDateOrNull(event_date) IS NOT NULL
AND event_date >= {{Date(start_date, '20240101')}}
AND event_date <= {{Date(end_date, '20241231')}}
Materialize the parsed data for better performance
For high-frequency queries, create a materialized view that pre-computes the date parsing and stores results in an optimized data source. Create a parsed_events.datasource
file:
SCHEMA >
`event_id` String,
`parsed_date` Date,
`parsed_timestamp` DateTime,
`user_id` String,
`event_type` String
ENGINE MergeTree
ENGINE_PARTITION_KEY "toYYYYMM(parsed_date)"
ENGINE_SORTING_KEY "parsed_date, user_id"
Then create a Materialized View that continuously processes raw events and materializes the parsed results:
NODE events_parser
SQL >
SELECT
event_id,
YYYYMMDDToDate(event_date) AS parsed_date,
YYYYMMDDhhmmssToDateTime(event_timestamp) AS parsed_timestamp,
user_id,
event_type
FROM raw_events
WHERE YYYYMMDDToDateOrNull(event_date) IS NOT NULL
AND YYYYMMDDToDateTimeOrNull(event_timestamp) IS NOT NULL
TYPE MATERIALIZED
TARGET_DATASOURCE parsed_events
Deploy both the new data source and materialized view:
tb --cloud deploy
This approach pre-processes date parsing during ingestion, improving API response times by eliminating the need to parse dates at query time. The materialized view automatically maintains the parsed data source as new events arrive.
Create an optimized API from materialized data
Now create a new pipe that queries the pre-parsed data source for maximum performance:
NODE fast_events_query
SQL >
SELECT
event_id,
parsed_date,
parsed_timestamp,
user_id,
event_type
FROM parsed_events
WHERE parsed_date >= {{Date(start_date, '2024-01-01')}}
AND parsed_date <= {{Date(end_date, '2024-12-31')}}
ORDER BY parsed_timestamp DESC
LIMIT {{Int32(limit, 100)}}
Publish an endpoint
Deploy the pipe as an API endpoint with parameters, enabling applications to query parsed date data through a fast, secure REST API without managing ClickHouse infrastructure.
curl "https://api.tinybird.co/v0/pipes/events_parsed.json?start_date=20241201&end_date=20241215" \
-H "Authorization: Bearer $TINYBIRD_TOKEN"
The response returns properly formatted dates that applications can consume directly, eliminating the need for client-side date parsing and ensuring consistent date handling across your stack.
Next steps for time series analytics with Tinybird
ClickHouse is the perfect database for processing and analyzing time series data. While the approaches described above work in any ClickHouse deployment, managing the database infrastructure, optimizing performance, and scaling to handle high-throughput data ingestion reqcan requireuires significant operational overhead.
Tinybird's managed ClickHouse platform reduces operational complexity by providing optimized builds, automatic scaling, and developer-friendly APIs, enabling developers to build analytics features without direct management of database infrastructure.
You can start building immediately using Tinybird's free plan, which includes the necessary tools and resources to prototype and deploy real-time analytics APIs.
Start building with Tinybird's free plan to see how fast you can build analytics features on Tinybird's managed infrastructure.
Additional resources
- ClickHouse Date and Time functions docs
- ClickHouse Type Conversion functions docs
- ClickHouse DateTime Data Types docs
- Working with Time in Tinybird
- Materialized Views Guide
- Tinybird vs. ClickHouse
- Database Timestamps and Timezones Best Practices
FAQs about numeric date parsing in ClickHouse
Can I convert DateTime back to a numeric yyyymmddhhmmss format?
Use formatDateTime(datetime_column, '%Y%m%d%H%M%S')
to convert DateTime back to numeric string format. This approach works well when you need to export data in the original numeric format or interface with systems that expect this specific layout.
Does YYYYMMDDhhmmssToDateTime64 preserve microseconds?
No, this function only parses to second precision since the input format doesn't include microseconds. The DateTime64 precision parameter affects the fractional seconds in the output, but the source data must contain the fractional component for it to be preserved.
What ClickHouse version introduced YYYYMMDDhhmmssToDateTime64?
This function was added in ClickHouse version 21.4 along with other DateTime64 parsing functions. If you're using an older version, you'll need to upgrade or use alternative parsing methods like parseDateTimeBestEffort()
.