---
title: "ClickHouse healthcare analytics"
excerpt: "Healthcare data is high-volume, high-stakes, and structurally complex: ADT events, lab results, vitals streams, medication records, and billing codes all arrive continuously and need to be queryable in real time for clinical operations, quality reporting, and population health. ClickHouse handles the volume and latency that healthcare analytics requires."
authors: "Tinybird"
categories: "AI Resources"
createdOn: "2026-06-15 00:00:00"
publishedOn: "2026-06-15 00:00:00"
updatedOn: "2026-06-15 00:00:00"
status: "published"
---

Healthcare organizations generate clinical event data at a volume that most analytics systems can't keep up with. A 500-bed hospital produces ADT (Admit/Discharge/Transfer) events, lab results, vitals readings, medication administration records, and billing codes continuously across every patient encounter. At a health system level, that scales to hundreds of millions of events per month.

The analytics layer needs to answer questions across all of it in real time: Which patients in the ICU have deteriorating SOFA scores? What is today's emergency department boarding time by hour? Which medication administration patterns deviate from protocol? Standard relational databases answer these slowly if at all. ClickHouse® handles the volume and the latency healthcare analytics requires.

This post covers the schemas, queries, and patterns for clinical event tracking, real-time monitoring, population health analytics, quality metrics, and compliance reporting in healthcare.

## Clinical event schema

The core table captures clinical events as an append-only log. Healthcare data is fundamentally event-driven: a patient encounter produces a stream of events (assessments, orders, results, administration records) that form the audit trail for care decisions. ClickHouse's `MergeTree` engine fits this pattern:

```sql
CREATE TABLE clinical_events
(
    event_time          DateTime64(3),
    event_type          LowCardinality(String),  -- 'adt', 'lab_result', 'vitals', 'medication', 'order'
    encounter_id        String,
    patient_id          String,
    facility_id         LowCardinality(String),
    unit_id             LowCardinality(String),
    provider_id         String,
    diagnosis_code      LowCardinality(Nullable(String)),  -- ICD-10
    procedure_code      LowCardinality(Nullable(String)),  -- CPT
    observation_code    LowCardinality(Nullable(String)),  -- LOINC code
    observation_value   Nullable(Float64),
    observation_unit    LowCardinality(Nullable(String)),
    severity            LowCardinality(Nullable(String)),  -- 'critical', 'abnormal', 'normal'
    properties          String                              -- JSON for event-specific data
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(event_time)
ORDER BY (facility_id, unit_id, event_time, encounter_id);
```

`LowCardinality` on `event_type`, `facility_id`, `unit_id`, `diagnosis_code`, and `observation_code` keeps GROUP BY performance high on the high-frequency queries that clinical dashboards run. LOINC codes for lab observations and ICD-10 for diagnoses typically have fewer than 10,000 distinct values in any deployment, which is exactly the range where ClickHouse's dictionary encoding provides the most benefit according to the official ClickHouse documentation.

## Real-time patient monitoring

For clinical operations teams watching patient status across units, a real-time vitals aggregation query over the last hour:

```sql
SELECT
    unit_id,
    encounter_id,
    argMax(observation_value, event_time)   AS latest_value,
    argMax(severity, event_time)            AS latest_severity,
    max(event_time)                         AS last_reading_time,
    dateDiff('minute', max(event_time), now()) AS minutes_since_last
FROM clinical_events
WHERE event_type = 'vitals'
  AND observation_code = '8867-4'  -- LOINC: heart rate
  AND facility_id = {{ String(facility_id, required=True) }}
  AND event_time >= now() - INTERVAL 1 HOUR
GROUP BY unit_id, encounter_id
HAVING latest_severity = 'critical'
   OR minutes_since_last > 30
ORDER BY minutes_since_last DESC;
```

The `argMax` pattern gets the latest value per encounter without waiting for background deduplication, which is important for dashboards that need to display current patient state.

For early deterioration detection using the NEWS2 (National Early Warning Score) components:

```sql
WITH latest_vitals AS (
    SELECT
        encounter_id,
        maxIf(observation_value, observation_code = '8867-4') AS heart_rate,
        maxIf(observation_value, observation_code = '9279-1') AS resp_rate,
        maxIf(observation_value, observation_code = '59408-5') AS spo2,
        maxIf(observation_value, observation_code = '8310-5') AS temperature,
        max(event_time)                                         AS last_update
    FROM clinical_events
    WHERE event_type = 'vitals'
      AND facility_id = {{ String(facility_id, required=True) }}
      AND event_time >= now() - INTERVAL 4 HOUR
    GROUP BY encounter_id
)
SELECT
    encounter_id,
    heart_rate,
    resp_rate,
    spo2,
    temperature,
    last_update,
    multiIf(
        resp_rate < 8 OR resp_rate > 25, 'high_risk',
        spo2 < 91 OR heart_rate > 130, 'high_risk',
        resp_rate BETWEEN 21 AND 24 OR spo2 BETWEEN 91 AND 93, 'medium_risk',
        'low_risk'
    ) AS deterioration_risk
FROM latest_vitals
WHERE deterioration_risk != 'low_risk'
ORDER BY last_update DESC;
```

## [Real-time anomaly detection](https://www.tinybird.co/blog/real-time-anomaly-detection) on lab results

Flagging lab results that fall outside expected ranges for the clinical context, using Z-score against the patient's own 90-day baseline:

```sql
WITH patient_baselines AS (
    SELECT
        patient_id,
        observation_code,
        avg(observation_value)       AS avg_value,
        stddevPop(observation_value) AS stddev_value
    FROM clinical_events
    WHERE event_type = 'lab_result'
      AND observation_value IS NOT NULL
      AND event_time >= today() - 90
    GROUP BY patient_id, observation_code
    HAVING count() >= 3
)
SELECT
    e.encounter_id,
    e.patient_id,
    e.observation_code,
    e.observation_value,
    b.avg_value,
    round((e.observation_value - b.avg_value) / nullIf(b.stddev_value, 0), 2) AS z_score
FROM clinical_events AS e
INNER JOIN patient_baselines AS b
    USING (patient_id, observation_code)
WHERE e.event_type = 'lab_result'
  AND e.event_time >= now() - INTERVAL 2 HOUR
  AND abs((e.observation_value - b.avg_value) / nullIf(b.stddev_value, 0)) > 3
ORDER BY abs(z_score) DESC;
```

A Z-score above 3 means a lab result is more than 3 standard deviations from the patient's own 90-day baseline, a meaningful outlier signal that complements absolute reference ranges.

## Quality metrics and readmission analytics

30-day readmission rate is a CMS quality metric that affects hospital reimbursement. Computing it across a full patient population in ClickHouse:

```sql
SELECT
    toStartOfMonth(index_discharge)            AS month,
    primary_diagnosis_code,
    count()                                    AS index_discharges,
    countIf(readmitted_within_30d)             AS readmissions_30d,
    round(
        countIf(readmitted_within_30d) /
        count() * 100, 2
    )                                          AS readmission_rate_pct
FROM (
    SELECT
        e1.encounter_id,
        e1.patient_id,
        argMax(e1.event_time, e1.event_time)   AS index_discharge,
        argMax(e1.diagnosis_code, e1.event_time) AS primary_diagnosis_code,
        min(e2.event_time) IS NOT NULL
            AND min(e2.event_time) <= index_discharge + INTERVAL 30 DAY
                                               AS readmitted_within_30d
    FROM clinical_events AS e1
    LEFT JOIN clinical_events AS e2
        ON e1.patient_id = e2.patient_id
        AND e2.event_type = 'adt'
        AND JSONExtractString(e2.properties, 'adt_type') = 'admission'
        AND e2.event_time > argMax(e1.event_time, e1.event_time)
    WHERE e1.event_type = 'adt'
      AND JSONExtractString(e1.properties, 'adt_type') = 'discharge'
    GROUP BY e1.encounter_id, e1.patient_id
)
GROUP BY month, primary_diagnosis_code
HAVING index_discharges >= 10
ORDER BY month DESC, readmission_rate_pct DESC;
```

## Population health and cohort analytics

Identifying patients with multiple chronic conditions (comorbidities) for care management programs:

```sql
SELECT
    patient_id,
    uniq(
        CASE WHEN diagnosis_code LIKE 'E11%' THEN 'diabetes' END,
        CASE WHEN diagnosis_code LIKE 'I10%' THEN 'hypertension' END,
        CASE WHEN diagnosis_code LIKE 'I25%' THEN 'coronary_artery_disease' END,
        CASE WHEN diagnosis_code LIKE 'J44%' THEN 'copd' END,
        CASE WHEN diagnosis_code LIKE 'N18%' THEN 'ckd' END
    )                                           AS chronic_condition_count,
    count()                                     AS total_encounters_12m,
    max(event_time)                             AS last_encounter
FROM clinical_events
WHERE event_type IN ('adt', 'order')
  AND diagnosis_code IS NOT NULL
  AND event_time >= today() - 365
GROUP BY patient_id
HAVING chronic_condition_count >= 3
ORDER BY chronic_condition_count DESC, total_encounters_12m DESC;
```

## Medication safety and ADE detection

Adverse drug events (ADEs) are a leading cause of preventable harm in hospitals. ClickHouse can run multi-step sequence detection to flag patients who received a potentially harmful drug combination within a short window:

```sql
SELECT
    encounter_id,
    patient_id,
    windowFunnel(7200)(
        event_time,
        JSONExtractString(properties, 'drug_class') = 'anticoagulant',
        JSONExtractString(properties, 'drug_class') = 'antiplatelet',
        event_type = 'lab_result'
            AND observation_code = '3173-2'
            AND observation_value < 50  -- low platelet count
    ) AS risk_stage
FROM clinical_events
WHERE event_type IN ('medication', 'lab_result')
  AND event_time >= today()
GROUP BY encounter_id, patient_id
HAVING risk_stage >= 2
ORDER BY risk_stage DESC;
```

`risk_stage = 3` indicates a patient received both drug classes and then showed a low platelet count within a 2-hour window: a potential ADE signal that warrants clinical review.

## Emergency department and bed management analytics

ED throughput metrics are operationally critical and involve aggregating across thousands of encounters in real time:

```sql
SELECT
    toStartOfHour(event_time)                   AS hour,
    facility_id,
    count()                                     AS total_adt_events,
    countIf(JSONExtractString(properties, 'adt_type') = 'admission')  AS admissions,
    countIf(JSONExtractString(properties, 'adt_type') = 'discharge')  AS discharges,
    countIf(JSONExtractString(properties, 'adt_type') = 'transfer')   AS transfers,
    avg(
        JSONExtractFloat(properties, 'length_of_stay_hours')
    )                                           AS avg_los_hours,
    quantile(0.9)(
        JSONExtractFloat(properties, 'door_to_provider_minutes')
    )                                           AS p90_door_to_provider_min
FROM clinical_events
WHERE event_type = 'adt'
  AND event_time >= now() - INTERVAL 24 HOUR
GROUP BY hour, facility_id
ORDER BY hour DESC, facility_id;
```

Door-to-provider time at p90 is a key ED quality metric. A single facility seeing p90 above 60 minutes in this query is an operational alert worth escalating immediately. ClickHouse runs this across all facilities simultaneously, so the operations center dashboard refreshes across the entire health system in the same query.

## Compliance and audit queries

Healthcare audit requirements mandate that access to patient records is logged and queryable. Because `patient_id` is first in the sort key, [building real-time APIs on ClickHouse](https://www.tinybird.co/blog/build-real-time-apis-clickhouse-tinybird) for HIPAA audit requests returns the full event history for a given patient ID in milliseconds regardless of how many total events the table contains:

```sql
SELECT
    event_time,
    event_type,
    encounter_id,
    provider_id,
    observation_code,
    observation_value,
    severity
FROM clinical_events
WHERE patient_id = {{ String(patient_id, required=True) }}
ORDER BY event_time DESC;
```

The same pattern works for provider activity audits (all events by `provider_id`), facility compliance reports (all events by `facility_id` and date range), and payer audits (all events with specific `procedure_code` values).

## Tinybird for healthcare

Healthcare analytics has two requirements that are difficult to satisfy simultaneously: the speed that clinical operations demand and the compliance posture that healthcare data requires.

Tinybird is managed ClickHouse and is SOC 2 Type II certified. The [enterprise-grade real-time analytics](https://www.tinybird.co/blog/enterprise-grade-real-time-analytics) pattern applies directly: the clinical monitoring queries in this post become parameterized HTTP endpoints your nurse station dashboard, care management platform, or clinical decision support system calls in real time. SQL enforces that each request only returns data for the specified facility, unit, or patient, with no application-level row filtering required.

The ingestion path for healthcare event data: HL7 FHIR resources from your EHR system map to ClickHouse rows via the [Kafka to ClickHouse](https://www.tinybird.co/blog/kafka-to-clickhouse-example) connector, or directly through the Events API for systems that emit JSON. ADT events, lab results, and vitals readings become queryable within seconds of the clinical event occurring.

For population health and quality metrics that run on longer time horizons, [what are materialized views and why do they matter for real-time](https://www.tinybird.co/blog/what-are-materialized-views-and-why-do-they-matter-for-realtime) explains the pre-aggregation pattern: define a materialized view that computes monthly readmission rates as data arrives, and the quality dashboard query reads the pre-aggregated result rather than scanning the full events table. Tinybird's own engineering team documented that pre-aggregated rollups reduce query time from 20-30 seconds on raw event tables to 250ms on a 500M-row dataset.

For [real-time streaming data architectures](https://www.tinybird.co/blog/real-time-streaming-data-architectures-that-scale) in healthcare, where EHR systems, lab systems, pharmacy systems, and device integrations all produce data on different schedules, Tinybird acts as the unified analytical layer. Each source writes to its own datasource. SQL Pipes join and aggregate across sources. HTTP endpoints expose the results. Your EHR vendor manages the OLTP layer. Tinybird manages the analytical layer.

{% cta
  title="Real-time clinical analytics without the infrastructure"
  text="Tinybird is managed ClickHouse with streaming ingestion and SQL endpoints. Query billions of clinical events in milliseconds."
  button={href: "https://cloud.tinybird.co/signup", target: "_blank", text: "Try Tinybird free"}
/%}
