---
name: Data & Analytics Engineering
slug: data-engineering
description: Transform, orchestrate, and query data at scale — the modern data stack.
member_count: 4
canonical: https://releases.sh/collections/data-engineering
---

# Data & Analytics Engineering

Transform, orchestrate, and query data at scale — the modern data stack.

## Members (4)

- [dbt Labs](https://releases.sh/dbt-labs) — getdbt.com
- [Dagster](https://releases.sh/dagster) — dagster.io
- [ClickHouse](https://releases.sh/clickhouse) — clickhouse.com
- [Tinybird](https://releases.sh/tinybird) — tinybird.co

## Fetching more

Append `.md` (markdown), `.json` (raw data), or `.atom` (feed) to any URL on this page.

- Aggregated release feed: `https://releases.sh/collections/data-engineering.atom`

## Recent Releases

---
collection: data-engineering
collection_name: Data & Analytics Engineering
release_count: 20
has_more: true
canonical: https://releases.sh/collections/data-engineering
---

<Release version="1.13.8" date="June 4, 2026" published="2026-06-04T18:11:04.000Z" url="https://github.com/dagster-io/dagster/releases/tag/1.13.8" org="dagster" source="dagster">
## 1.13.8 (core) / 0.29.8 (libraries)

### New

- Special characters are now allowed in team `owners` for jobs, schedules, and sensors. (Thanks, [@dragos-pop](https://github.com/dragos-pop)!)
- Added `kinds` tags and icons for Microsoft Fabric and OneLake. (Thanks, [@MartyP233](https://github.com/MartyP233)!)
- The BigQuery, Snowflake, and DuckDB I/O managers now skip the table write when an asset returns an empty DataFrame, logging a warning instead. This prevents incorrect type inference and degenerate tables for empty partitions.
- [ui] The per-code-location "Docs" tab is now "Components", with its content available under a "Library" subtab. Existing `/docs` links continue to work via redirect.
- [dagster-aws] The EMR PySpark step launcher now honors a configured S3 job package path.
- [dagster-cloud] `code_server.*` metrics now carry a `server_instance_id` tag identifying the underlying gRPC server process. For multi-replica code locations, the tag identifies whichever replica answered the most recent metrics ping.

### Bugfixes

- `InstigationLogger` now stringifies log record attributes that are not JSON-serializable instead of failing to emit the log. (Thanks, [@jonaslb](https://github.com/jonaslb)!)
- Fixed a SQL injection vulnerability in `dagster-clickhouse`, `dagster-clickhouse-pandas`, and `dagster-clickhouse-polars` when using dynamic partition keys. ClickHouse partition queries now bind partition key values as driver parameters instead of interpolating them into SQL strings.
- Fixed an issue where tag values in component YAML files were not coerced correctly.
- [dg] `dg labs ai dispatch` no longer writes a `plan.md` artifact into version control.
- [dg] Fixed an error that occurred when using `TypedDict`-typed fields in component configuration.
- [ui] Fixed partition step status labels and squares that rendered incorrectly after the CSS Modules migration.
- [ui] Fixed backfill progress not reaching 100% until all runs had completed.
- [dagster-databricks] Fixed Databricks workspace job list pagination and added retries for rate-limit responses.
- [dagster-dbt] Fixed `DbtProject.prepare()` to always run `dbt deps` when dependency files are present.
- [dagster-dbt] Fixed YAML-based translation in `DbtProjectComponent` subclasses to also translate dependency keys.
- [dagster-dbt] Fixed an issue where dbt unit tests could be ignored unexpectedly.
- [dagster-dbt] Fixed incorrect asset key translation when generating column lineage with `DbtProjectComponent`.
- [dagster-dbt] Fixed unbounded recursive copying of the `.local_defs_state()` directory when a dbt project was located at the repository root.
- [dagster-cloud-cli] Fixed `dg plus deploy` failing with `No module named pip` in `uv`-managed environments.
- [dagster-cloud-cli] Removed an overly strict dependency-file check from the Docker deploy path.

### Documentation

- Added an integration reference for `dagster-elasticsearch`.
- Added documentation for the OpenLineage integration.
- Added documentation for `dagster-hf-datasets`.
- Added an example covering deployment strategies.
</Release>

<Release date="June 3, 2026" published="2026-06-03T00:00:00.000Z" url="https://dagster.io/blog#community-showcase-part-1" org="dagster" source="dagster-blog">
## Community Showcase Part 1

Some of the most interesting Dagster projects come from the community. This post highlights creative community-built applications ranging from public data exploration and infrastructure monitoring to research automation and internal tooling, along with why their creators chose Dagster and what building with it was like.
</Release>

<Release version="26.5" date="June 1, 2026" published="2026-06-01T19:11:30.000Z" url="https://clickhouse.com/blog/clickhouse-release-26-05" org="clickhouse" source="clickhouse-blog">
## ClickHouse Release 26.5

# ClickHouse Release 26.5

Another month goes by, which means it's time for another release!

The ClickHouse 26.5 release contains 38 new features 🌹 51 performance optimizations 🦋 224 bug fixes 🐞

This release sees a record number of performance optimizations, with highlights including ORDER BY … LIMIT pushdown through joins (up to 20× faster), a new GROUP BY … LIMIT shortcut that avoids building unnecessary groups, a new `filesystem` table function for running SQL directly against your local file system, and more!

## New contributors

A special welcome to all the new contributors in 26.5! The growth of ClickHouse's community is humbling, and we are always grateful for the contributions that have made ClickHouse so popular.

Below are the names of the new contributors:

*Abhinav Agarwal, Ahaan, Alex Kuleshov, Ashrith Bandla, Asish Kumar, Callum C, Felix Bernhard, Flavio Malavazi, Ian Rakhmatullin, Ilya Perstenev, JackFielding, Joe Redfern, Larry Snizek, Luc Leray, Rahul Nair, Roy Sindre Norangshol, Venkata Vineel, Vincent Voyer, Yue, Yue Ni, functioncrafter, ibrahim karimeddin, mohaidoss, perst20, peter15914, sayondeep, zhangzhibiao, zxuhan7*

Hint: if you're curious how we generate this list… [here](https://gist.github.com/gingerwizard/5a9a87a39ba93b422d8640d811e269e9).

You can also [view the slides from the presentation](https://presentations.clickhouse.com/2026-release-26.5).

## Push ORDER BY … LIMIT through JOIN

### Contributed by Alexey Milovidov

> "We optimize ClickHouse in every version, we optimize it more, and there is no end in optimizations" – Alexey Milovidov [during the ClickHouse release 26.5 webinar](https://www.youtube.com/live/P1IDAvsi7p8?si=5A3vFFIlNg51spxh&t=1512)

### Moving more work before joins

In recent releases, ClickHouse has been steadily moving more work before joins, so less data has to pass through them. For example, ClickHouse already [pushes down complex OR conditions in JOIN queries](https://clickhouse.com/blog/clickhouse-release-25-10#push-down-of-complex-conditions-in-joins) to filter each table earlier, before the join happens. It also supports [runtime filters](https://clickhouse.com/blog/clickhouse-release-25-10#bloom-filters-in-joins), which are created from the right-hand side of a join and applied to the left-hand side before the join runs.

This release continues that theme, but pushes down a different kind of work: not a WHERE predicate, but the ORDER BY … LIMIT clause, a pattern that appears frequently in analytical workloads.

### From "join then limit" to "limit then join"

If the outermost SELECT of a LEFT JOIN query ends with ORDER BY … LIMIT, and the sort key depends only on columns from the left table, ClickHouse can push that ORDER BY … LIMIT below the join.

The same applies to `RIGHT JOIN` queries when the sort key depends only on columns from the **right table**.

For example, this query running over TPC-H tables asks for the 100 most recent orders, enriched with customer information:

```sql
SELECT
    o_orderkey,
    o_orderdate,
    o_totalprice,
    c_name,
    c_mktsegment
FROM orders
LEFT JOIN customer ON o_custkey = c_custkey
ORDER BY
    o_orderdate DESC,
    o_orderkey DESC
LIMIT 100;
```

Here, the `ORDER BY` uses only columns from `orders`, the preserved side of the `LEFT JOIN`. That means ClickHouse does not need to join every order with its customer before applying the limit.

Without the optimization, the plan is forced to do the expensive join first:

![Blog-release-26.05.001.png](/uploads/Blog_release_26_05_001_5feb2264fa.png)

With the new optimization, ClickHouse can flip the work around: it can first find the top 100 rows from `orders`, and then join only those few rows with `customer`.

![Blog-release-26.05.002.png](/uploads/Blog_release_26_05_002_abd55ce8d9.png)

You can also see the change in the query plan obtained via [EXPLAIN](https://clickhouse.com/docs/sql-reference/statements/explain). With the optimization enabled, the plan contains a Limit and Sorting step on the orders table side, before the join with the customer table:

```shell
Join

  ...

    Limit

      Sorting

        ReadFromMergeTree (sf100.orders)

  ...

ReadFromMergeTree (sf100.customer)
```

A nice side effect is that ClickHouse already treats the pushed-down `ORDER BY … LIMIT` part as a first-class query pattern. As covered in our [dedicated Top-N optimization post](https://clickhouse.com/blog/clickhouse-top-n-queries-granule-level-data-skipping), ClickHouse has accumulated several engine-level optimizations for this pattern.

This optimization is controlled by the new [query\_plan\_top\_k\_through\_join](https://clickhouse.com/docs/operations/settings/settings#query_plan_top_k_through_join) setting, which is enabled by default.

### Benchmark: 20× faster and 175× less memory

To evaluate the impact, we created and loaded the [TPC-H schema with a scale factor of 100](https://clickhouse.com/docs/getting-started/example-datasets/tpch) on an AWS EC2 `m6i.8xlarge` instance with 32 vCPUs and 128 GiB of RAM.

First, we ran the query with the new `ORDER BY … LIMIT` pushdown disabled by setting `query_plan_top_k_through_join = 0`. We executed the query three times and used the fastest run as the baseline:

```shell
Elapsed: 2.153 sec. Processed 165.00 million rows, 3.23 GB (76.65 million rows/s., 1.50 GB/s.)
Peak memory usage: 1.87 GiB.

Elapsed: 1.878 sec. Processed 165.00 million rows, 3.23 GB (87.87 million rows/s., 1.72 GB/s.)
Peak memory usage: 1.88 GiB.

Elapsed: 2.197 sec. Processed 165.00 million rows, 3.23 GB (75.10 million rows/s., 1.47 GB/s.)
Peak memory usage: 1.87 GiB.
```

Then we ran the same query with the optimization enabled by setting `query_plan_top_k_through_join = 1`:

```shell
Elapsed: 0.093 sec. Processed 165.22 million rows, 2.18 GB (1.78 billion rows/s., 23.45 GB/s.)
Peak memory usage: 11.46 MiB.

Elapsed: 0.092 sec. Processed 165.22 million rows, 2.18 GB (1.80 billion rows/s., 23.70 GB/s.)
Peak memory usage: 13.72 MiB.

Elapsed: 0.092 sec. Processed 165.22 million rows, 2.18 GB (1.79 billion rows/s., 23.53 GB/s.)
Peak memory usage: 10.98 MiB.
```

Using the fastest run from each configuration, the difference is significant:

| Setting | Fastest runtime | Peak memory | Data read |
|---------|-----------------|-------------|-----------|
| Pushdown disabled | 1.878 sec | 1.88 GiB | 3.23 GB |
| Pushdown enabled | 0.092 sec | 10.98 MiB | 2.18 GB |
| Improvement | **20.4× faster** | **~175× less memory** | **1.5× less data read** |

> This benchmark already shows a **20.4× runtime improvement** and around **175× lower peak memory usage**.

These numbers are not a fixed ceiling. The benefit depends on the size of the input tables, the width of the joined rows, the selected columns, and the LIMIT value.

## GROUP BY … LIMIT with no ORDER BY

### Contributed by Amos Bird

### Extending Top-N optimizations to GROUP BY

ClickHouse already treats Top-N queries as a first-class query pattern. As covered in our dedicated [Top-N optimization post](https://clickhouse.com/blog/clickhouse-top-n-queries-granule-level-data-skipping), ClickHouse has accumulated several engine-level optimizations for queries with ORDER BY … LIMIT, including streaming execution, read-in-order, lazy reading, and data-skipping-based Top-N pruning.

This release extends the same idea to another shape: GROUP BY … LIMIT queries without ORDER BY.

Consider a query that groups by a key and then applies `LIMIT`, but has no `ORDER BY`, no `HAVING` clause, and no window function. In that case, the query does not ask for the smallest keys, the largest keys, the most frequent keys, or keys in any particular order. It only asks for **any N distinct grouping keys**.

For example, because we already had the TPC-H dataset loaded for the previous section's benchmark, we can reuse it here. This query asks for any 100 distinct order keys from the `lineitem` table:

```sql
SELECT l_orderkey
FROM lineitem
GROUP BY l_orderkey
LIMIT 100;
```

### From "group everything, then limit" to "keep only N groups"

In TPC-H scale factor 100, `lineitem` contains 600 million rows and 150 million distinct `l_orderkey` values.

Without the new optimization, ClickHouse treats the query like a regular `GROUP BY`: as it scans the input, every new `l_orderkey` creates a new entry in the [aggregation hash table](https://clickhouse.com/blog/clickhouse-parallel-replicas#how-clickhouse-makes-group-by-fast). Only after the aggregation result has been built does `LIMIT 100` reduce the output to 100 rows.

![Blog-release-26.05.003.png](/uploads/Blog_release_26_05_003_03f393fa9d.png)

With this release, ClickHouse recognizes this special pattern and avoids building groups that cannot affect the result. The optimization is controlled by the new [`optimize_trivial_group_by_limit_query`](https://clickhouse.com/docs/operations/settings/settings#optimize_trivial_group_by_limit_query) setting, which is enabled by default.

For eligible queries, ClickHouse internally sets the [aggregation limit](https://clickhouse.com/docs/operations/settings/settings#max_rows_to_group_by) to `LIMIT + OFFSET` and uses [`group_by_overflow_mode`](https://clickhouse.com/docs/operations/settings/settings#group_by_overflow_mode) `= 'any'`. In practice, this means that once the aggregation hash table contains the first 100 distinct `l_orderkey` values, new keys are ignored instead of being added as new groups.

![Blog-release-26.05.004.png](/uploads/Blog_release_26_05_004_0dabca7c6d.png)

The scan still processes the input, but the aggregation state in main memory stays tiny: 100 groups instead of growing toward 150 million.

### Benchmark: 11.9× faster and 185× less memory

To evaluate the impact, we ran the query again on an AWS EC2 m6i.8xlarge instance with 32 vCPUs and 128 GiB RAM. First, we disabled the optimization by setting `optimize_trivial_group_by_limit_query = 0` and used the fastest of three runs as the baseline:

```shell
Elapsed: 0.853 sec. Processed 600.04 million rows, 2.40 GB (703.29 million rows/s., 2.81 GB/s.)
Peak memory usage: 8.60 GiB.

Elapsed: 0.806 sec. Processed 600.04 million rows, 2.40 GB (744.07 million rows/s., 2.98 GB/s.)
Peak memory usage: 8.58 GiB.

Elapsed: 0.809 sec. Processed 600.04 million rows, 2.40 GB (742.06 million rows/s., 2.97 GB/s.)
Peak memory usage: 8.57 GiB.
```

Then we ran the same query with the optimization enabled by setting `optimize_trivial_group_by_limit_query = 1`:

```shell
Elapsed: 0.069 sec. Processed 600.04 million rows, 2.40 GB (8.76 billion rows/s., 35.03 GB/s.)
Peak memory usage: 47.54 MiB.

Elapsed: 0.070 sec. Processed 600.04 million rows, 2.40 GB (8.54 billion rows/s., 34.16 GB/s.)
Peak memory usage: 47.54 MiB.

Elapsed: 0.068 sec. Processed 600.04 million rows, 2.40 GB (8.79 billion rows/s., 35.17 GB/s.)
Peak memory usage: 47.55 MiB.
```

Using the fastest run from each configuration:

| Setting | Fastest runtime | Rows processed | Data read | Peak memory |
|---------|-----------------|-----------------|-----------|-------------|
| Optimization disabled | 0.806 sec | 600.04 million | 2.40 GB | 8.58 GiB |
| Optimization enabled | 0.068 sec | 600.04 million | 2.40 GB | 47.55 MiB |
| Improvement | **11.9× faster** | same | same | **~185× less memory** |

> The optimized query is **11.9× faster** and uses about **185× less peak memory**.

## The filesystem table function

### Contributed by Ilya Perstenev, Ilya Yatsishin, Alexey Milovidov

ClickHouse 25.6 also introduces the `filesystem` table function, which lets us list and analyze a directory as a queryable table.

The full schema exposed by `filesystem` covers everything you'd expect for filesystem introspection:

```sql
DESCRIBE filesystem();
```

```shell
┌─name──────────────┬─type───────────────────────────────────────────────┐
│ path              │ String                                             │
│ name              │ String                                             │
│ type              │ Enum8('none' = 0, 'not_found' = 1, 'regular' = 2, ⋯│
│ size              │ Nullable(UInt64)                                   │
│ depth             │ UInt16                                             │
│ modification_time │ Nullable(DateTime64(6))                            │
│ is_symlink        │ Bool                                               │
│ content           │ Nullable(String)                                   │
│ owner_read        │ Bool                                               │
│ owner_write       │ Bool                                               │
│ owner_exec        │ Bool                                               │
│ group_read        │ Bool                                               │
│ group_write       │ Bool                                               │
│ group_exec        │ Bool                                               │
│ others_read       │ Bool                                               │
│ others_write      │ Bool                                               │
│ others_exec       │ Bool                                               │
│ set_gid           │ Bool                                               │
│ set_uid           │ Bool                                               │
│ sticky_bit        │ Bool                                               │
│ file              │ String                                             │
└───────────────────┴────────────────────────────────────────────────────┘
```

If we call it with no arguments, using clickhouse-local, it will list files in the current directory:

```sql
SELECT path, name FROM filesystem();
```

```shell
┌─path──────────────────────────────────────────────┬─name──────────────────────┐
│ /Users/markhneedham/projects/release-posts/26.5   │ clickhouse                │
│ /Users/markhneedham/projects/release-posts/26.5   │ .claude                   │
└───────────────────────────────────────────────────┴───────────────────────────┘
```

It has access to the same parts of the file system as the user who launched ClickHouse. If you call it via ClickHouse Server, it will list the files in the `user_files` directory.

I have a lot of large video files on my machine, and I (or rather Claude!) usually have to run a bunch of Unix commands to find them. With this new function, it's as simple as the following query:

```sql
SELECT path, name, formatReadableSize(size), modification_time
FROM filesystem('/Users/markhneedham/projects/videos')
WHERE type = 'regular' AND name LIKE '%.braw'
ORDER BY size DESC
LIMIT 3
FORMAT Vertical;
```

```shell
Row 1:
──────
path:                     /Users/markhneedham/projects/videos/20260212-Sample
name:                     A001_10150625_C183 2.braw
formatReadableSize(size): 26.75 GiB
modification_time:        2025-10-15 06:25:08.529999

Row 2:
──────
path:                     /Users/markhneedham/projects/videos/20260217-AsyncInserts
name:                     A001_09290151_C176.braw
formatReadableSize(size): 21.70 GiB
modification_time:        2025-09-29 01:51:47.820000

Row 3:
──────
path:                     /Users/markhneedham/projects/videos/20260123-PGCHStack
name:                     A001_08021314_C119.braw
formatReadableSize(size): 21.54 GiB
modification_time:        2025-08-02 13:14:33.260000
```

And I've wrapped this query up into a skill that Claude can use to more quickly find files to delete to free up space.

## url_base for the url table function

### Contributed by Alexey Milovidov

If you use the `url` table function regularly, you've probably typed the same base URL dozens of times. The new `url_base` setting lets you set it once and use relative paths everywhere instead.

Working with the [Amazon customer review dataset](https://clickhouse.com/docs/getting-started/example-datasets/amazon-reviews), we could set the URL base like this:

```bash
SET url_base = 'https://datasets-documentation.s3.eu-west-3.amazonaws.com/amazon_reviews/';
```

We could then query the 2014 reviews like this:

```sql
SELECT
    count(),
    round(avg(star_rating), 2) AS stars,
    round(avg(helpful_votes), 2) AS votes
FROM url('amazon_reviews_2014.snappy.parquet')
```

```shell
┌──count()─┬─stars─┬─votes─┐
│ 44127569 │  4.23 │  0.96 │
└──────────┴───────┴───────┘
```

And if we want to query 2015:

```sql
SELECT
    count(),
    round(avg(star_rating), 2) AS stars,
    round(avg(helpful_votes), 2) AS votes
FROM url('amazon_reviews_2015.snappy.parquet')
```

```shell
┌──count()─┬─stars─┬─votes─┐
│ 41905631 │  4.25 │  0.74 │
└──────────┴───────┴───────┘
```

## Negative LIMIT BY

### Contributed by Nihal Z. Miaji

The 26.5 release also adds negative limit by, which lets us pick rows from the end of each group, rather than the beginning.

We'll use my favorite [UK property prices dataset](https://clickhouse.com/docs/getting-started/example-datasets/uk-price-paid) to demonstrate how it works, starting with the following query that finds the median price by district for all the counties that contain the term `Yorkshire`:

```sql
SELECT county, district, median(price)
FROM uk_price_paid
WHERE county ILIKE '%Yorkshire%'
GROUP BY ALL
ORDER BY median(price) DESC;
```

```shell
┌─county───────────────────┬─district─────────────────┬─median(price)─┐
│ NORTH YORKSHIRE          │ NORTH YORKSHIRE          │        263000 │
│ NORTH YORKSHIRE          │ HARROGATE                │        185000 │
│ NORTH YORKSHIRE          │ HAMBLETON                │        170000 │
│ NORTH YORKSHIRE          │ RYEDALE                  │        160000 │
│ NORTH YORKSHIRE          │ RICHMONDSHIRE            │        150000 │
│ NORTH YORKSHIRE          │ CRAVEN                   │        149250 │
│ NORTH YORKSHIRE          │ SELBY                    │        144995 │
│ EAST RIDING OF YORKSHIRE │ EAST RIDING OF YORKSHIRE │        132000 │
│ WEST YORKSHIRE           │ LEEDS                    │        129997 │
│ NORTH YORKSHIRE          │ SCARBOROUGH              │        120000 │
│ SOUTH YORKSHIRE          │ SHEFFIELD                │        115000 │
│ WEST YORKSHIRE           │ KIRKLEES                 │        114950 │
│ WEST YORKSHIRE           │ WAKEFIELD                │      112997.5 │
│ SOUTH YORKSHIRE          │ ROTHERHAM                │        102500 │
│ WEST YORKSHIRE           │ CALDERDALE               │        101000 │
│ WEST YORKSHIRE           │ BRADFORD                 │        100000 │
│ SOUTH YORKSHIRE          │ DONCASTER                │         98500 │
│ SOUTH YORKSHIRE          │ BARNSLEY                 │         95000 │
│ WEST YORKSHIRE           │ EAST YORKSHIRE           │         94950 │
└──────────────────────────┴──────────────────────────┴───────────────┘
```

We could already select the first two rows per county group, i.e., the two districts with the highest median price per county:

```sql
SELECT county, district, median(price)
FROM uk_price_paid
WHERE county ILIKE '%Yorkshire%'
GROUP BY ALL
ORDER BY median(price) DESC
LIMIT 2 BY county
```

```shell
┌─county───────────────────┬─district─────────────────┬─median(price)─┐
│ NORTH YORKSHIRE          │ NORTH YORKSHIRE          │        262000 │
│ NORTH YORKSHIRE          │ HARROGATE                │        185000 │
│ EAST RIDING OF YORKSHIRE │ EAST RIDING OF YORKSHIRE │      130972.5 │
│ WEST YORKSHIRE           │ LEEDS                    │        130000 │
│ WEST YORKSHIRE           │ KIRKLEES                 │        115000 │
│ SOUTH YORKSHIRE          │ SHEFFIELD                │        115000 │
│ SOUTH YORKSHIRE          │ ROTHERHAM                │        105000 │
└──────────────────────────┴──────────────────────────┴───────────────┘
```

But with negative limit by, we can also select the last two rows per county group, i.e., the two districts with the lowest median price per county.

```sql
SELECT county, district, median(price)
FROM uk_price_paid
WHERE county ILIKE '%Yorkshire%'
GROUP BY ALL
ORDER BY median(price) DESC
LIMIT -2 BY county;
```

```shell
┌─county───────────────────┬─district─────────────────┬─median(price)─┐
│ NORTH YORKSHIRE          │ SELBY                    │        145000 │
│ EAST RIDING OF YORKSHIRE │ EAST RIDING OF YORKSHIRE │        132500 │
│ NORTH YORKSHIRE          │ SCARBOROUGH              │        122000 │
│ SOUTH YORKSHIRE          │ DONCASTER                │         99000 │
│ WEST YORKSHIRE           │ BRADFORD                 │         97500 │
│ SOUTH YORKSHIRE          │ BARNSLEY                 │         94950 │
│ WEST YORKSHIRE           │ EAST YORKSHIRE           │         94950 │
└──────────────────────────┴──────────────────────────┴───────────────┘
```

## Multi-path SQL/JSON

### Contributed by Kevinyhzou, Alexey Milovidov

When using the `JSON_VALUE` and `JSON_QUERY` functions, we can now pass a tuple or array of paths and receive a tuple or array of strings, with JSON parsed only once.

We're going to work with a JSON string representing the Open House conference, printed out using the new `prettyPrintJSON` function:

```sql
WITH '{
  "name": "Open House 2026",
  "tagline": "The real-time database for AI conference",
  "dates": {
    "workshops": "2026-05-26",
    "conference": ["2026-05-27", "2026-05-28"]
  },
  "venue": {
    "name": "Convene 100 Stockton",
    "address": "40 O''Farrell St, San Francisco, CA 94108"
  }
}' AS conf
SELECT prettyPrintJSON(conf)FORMAT Raw;
```

```shell
{
    "name": "Open House 2026",
    "tagline": "The real-time database for AI conference",
    "dates": {
        "workshops": "2026-05-26",
        "conference": [
            "2026-05-27",
            "2026-05-28"
        ]
    },
    "venue": {
        "name": "Convene 100 Stockton",
        "address": "40 O'Farrell St, San Francisco, CA 94108"
    }
}

1 row in set. Elapsed: 0.003 sec.
```

To return strings, for example, if we want to return a tuple containing the name and venue, we use the `JSON_VALUE` function:

```sql
WITH '{
  "name": "Open House 2026",
  "tagline": "The real-time database for AI conference",
  "dates": {
    "workshops": "2026-05-26",
    "conference": ["2026-05-27", "2026-05-28"]
  },
  "venue": {
    "name": "Convene 100 Stockton",
    "address": "40 O''Farrell St, San Francisco, CA 94108"
  }
}' AS conf
SELECT JSON_VALUE(conf, ('$.name', '$.venue.name'));
```

```shell
┌─JSON_VALUE(conf, ('$.name', '$.venue.name'))─┐
│ ('Open House 2026','Convene 100 Stockton')   │
└──────────────────────────────────────────────┘
```

We can also pass in the JSON paths as an array rather than a tuple:

```sql
WITH '{
  "name": "Open House 2026",
  "tagline": "The real-time database for AI conference",
  "dates": {
    "workshops": "2026-05-26",
    "conference": ["2026-05-27", "2026-05-28"]
  },
  "venue": {
    "name": "Convene 100 Stockton",
    "address": "40 O''Farrell St, San Francisco, CA 94108"
  }
}' AS conf
SELECT JSON_VALUE(conf, ['$.name', '$.venue.name']);
```

```shell
┌─JSON_VALUE(conf, ['$.name', '$.venue.name'])─┐
│ ['Open House 2026','Convene 100 Stockton']   │
└──────────────────────────────────────────────┘
```

But `dates.conference` is an array, so if we try to retrieve that using `JSON_VALUE`, we'll return an empty string:

```sql
WITH '{
  "name": "Open House 2026",
  "tagline": "The real-time database for AI conference",
  "dates": {
    "workshops": "2026-05-26",
    "conference": ["2026-05-27", "2026-05-28"]
  },
  "venue": {
    "name": "Convene 100 Stockton",
    "address": "40 O''Farrell St, San Francisco, CA 94108"
  }
}' AS conf
SELECT JSON_VALUE(conf, ('$.name', '$.dates.conference'));
```

```shell
┌─JSON_VALUE(c⋯nference'))─┐
│ ('Open House 2026','')   │
└──────────────────────────┘
```

We can read the individual values from that array using zero-based array indices:

```sql
WITH '{
  "name": "Open House 2026",
  "tagline": "The real-time database for AI conference",
  "dates": {
    "workshops": "2026-05-26",
    "conference": ["2026-05-27", "2026-05-28"]
  },
  "venue": {
    "name": "Convene 100 Stockton",
    "address": "40 O''Farrell St, San Francisco, CA 94108"
  }
}' AS conf
SELECT JSON_QUERY(conf, ['$.name', '$.dates.conference[0]', '$.dates.conference[1]']);
```

```shell
┌─JSON_QUERY(conf, ['$.name', '$.dates.conference[0]', '$.dates.conference[1]'])─┐
│ ['Open House 2026','2026-05-27','2026-05-28']                                 │
└────────────────────────────────────────────────────────────────────────────────┘
```
</Release>

<Release date="June 1, 2026" published="2026-06-01T14:08:38.000Z" url="https://clickhouse.com/blog/executable-udfs-clickhouse-cloud-beta" org="clickhouse" source="clickhouse-blog">
## Executable UDFs are now in public beta on ClickHouse Cloud

Today we're excited to announce that **executable UDFs are now available in public beta on ClickHouse Cloud**. You can write a function in Python, upload it as a zip to your cluster, and call it from SQL like any built-in. ClickHouse manages a pool of long-lived sandboxed processes and routes rows through them at query speed. The function is callable anywhere SQL is: ad-hoc queries, joins, even materialized views that fire on every insert.

This isn't a brand-new idea. We've shipped executable UDFs in self-hosted ClickHouse for a while. [Our 2023 post on calling OpenAI from SQL](https://clickhouse.com/blog/clickhouse-open-ai-user-defined-functions-udfs) walked through the same mechanism. What's new today is that you don't need to run your own server to use it. The model code lives where the data is, runs in a managed sandbox, and the deployment surface is one upload screen in the Cloud console.

To show what this unlocks, we built a demo. A small PyTorch autoencoder scores ~6 billion equity trade ticks for anomalousness, inline with ingest. A Next.js front-end consumes the embeddings. Full source for the notebook, UDF bundle, SQL, and webapp is in [this repo](https://github.com/ClickHouse/stock-anomaly-udf).

![Anomaly dashboard with packedbubble chart and S&P 500 leaderboard](/uploads/hero_dashboard_5c8d3ce323.png)

## The problem this solves [#](/blog/executable-udfs-clickhouse-cloud-beta#the-problem-this-solves)

You have a trained model. You have a stream of data in ClickHouse. Getting them into the same room used to mean one of three options.

1. **Stand up a separate scoring service.** Now you maintain a model server, an ingest pipeline that routes rows to it, and a way to write the scores back into ClickHouse. The model is no longer near the data in any meaningful sense.
    
2. **Translate the model into pure SQL.** Workable for some tree-based models. Painful for anything with embeddings. Every retrain means regenerating thousands of lines of SQL by hand.
    
3. **Batch score offline and join later.** Loses freshness. The "anomaly" on a trade that just hit is only useful if you can react to it now.
    

Executable UDFs collapse all three into one. Write the inference code as a normal Python file. Point ClickHouse at it. Call it from SQL. The function runs inline with whatever query needs it, including inside a materialized view, which is exactly what we do here.

## What we built [#](/blog/executable-udfs-clickhouse-cloud-beta#what-we-built)

Last year we wrote ["Building StockHouse"](https://clickhouse.com/blog/building-stockhouse), showing how ClickHouse handles a continuous firehose of stock trade ticks in real time. That post stopped at the ingest and query layer. The natural next question is: what if you wanted to apply a learned model to every trade as it lands?

We picked an unsupervised anomaly-detection setup because it shows off the shape of the problem cleanly.

- A small autoencoder (~270K parameters) is trained on 50M historical trade ticks. Its inputs: a hashed ticker, 7 numeric features (price, size, exchange, etc.), and 6 cyclical-encoded temporal features.
- For each trade, the model produces a **32-dim embedding** and a **reconstruction error**. High error means the model wasn't trained on patterns like this trade. It's *anomalous in shape* compared to what's normal for that symbol's history.
- The UDF that wraps this model is `embed_trade`. It's the only ML-specific piece in the system. Everything else is plain SQL: the score aggregation, the per-symbol baselines, the views.

Here's the data flow:

```sql
            ┌───────────────────────────┐
            │  default.trades           │     ← upstream feed (e.g. Polygon)
            └──────────────┬────────────┘
                           │ INSERT
                           ▼
            ┌───────────────────────────┐
            │  trades_embeddings_mv     │     ← fires on every INSERT
            │  (calls embed_trade UDF)  │
            └──────────────┬────────────┘
                           │
                           ▼
            ┌───────────────────────────┐
            │  default.trades_embeddings│     ← same trade + 32-dim
            │                           │       embedding + recon_score
            └──────────────┬────────────┘
              ▲            │
              │            │ refresh hourly
              │            ▼
              │  ┌──────────────────────┐
              │  │ trades_baselines     │     ← per-symbol score
              │  │ trades_dim_baselines │       distribution stats
              │  └──────────────────────┘
              │
              └──── consumed by webapp queries
                    (anomalies are defined relative
                     to each symbol's own baselines)
```

Every `INSERT INTO trades` flows through the materialized view, gets scored, and lands in `trades_embeddings`. The webapp never re-runs the model. It only reads `trades_embeddings` and two cheap baseline tables. The expensive inference happens exactly once per trade, inline with ingest, and every downstream query is a normal aggregation.

## Training the autoencoder [#](/blog/executable-udfs-clickhouse-cloud-beta#training-the-autoencoder)

The model itself is small and unremarkable as ML goes, but the training pipeline is worth a quick look because it has to produce artifacts the UDF can load at runtime. The full walkthrough lives in [`notebook/train_and_deploy_udf.ipynb`](https://github.com/ClickHouse/stock-anomaly-udf/blob/main/notebook/train_and_deploy_udf.ipynb). A summary:

1. **Stream training data into Parquet chunks.** A SELECT against `default.trades` derives the 14 input features server-side (price, size, exchange, condition-code count, hashed ticker, and cyclical encodings of hour and day of week). The notebook pulls the result via `query_arrow_stream` and writes 5M-row Parquet chunks to local disk. Nothing is held in memory.
    
2. **Fit a `StandardScaler` incrementally.** Welford's algorithm via `partial_fit` gives the same mean and variance as a single `scaler.fit()` over the full dataset, with bounded memory. We fit on the 7 base numeric features only. The hashed ticker is an integer key and the cyclical features are already on a sensible scale.
    
3. **Train the autoencoder.** `TradeAutoencoderV2` is a 4-layer encoder into a 32-dim latent, with a symmetric decoder back to the numeric feature space. The sym embedding lookup happens at the input layer, `sym_idx = xxHash32(sym) % NUM_HASH_BUCKETS`. Loss is MSE on the reconstructed numeric features. Training streams rows out of the Parquet chunks via an `IterableDataset` and stops when a 200-batch moving-average loss fails to improve for 5 windows.
    
4. **Save two artifacts.** `scaler_params.pt` holds `mean_` and `scale_` as Float32 tensors. `trade_autoencoder_v2.pt` holds the model `state_dict` plus a `config` dict with the constructor kwargs. The UDF's `main.py` reads these at startup and reconstructs the model.
    
5. **Package the bundle.** A final notebook cell zips `main.py`, `requirements.txt`, and the two `.pt` files into `embed_trade.zip`, ready to upload.
    

## Deploying the UDF on Cloud [#](/blog/executable-udfs-clickhouse-cloud-beta#deploying-the-udf-on-cloud)

The deployment surface is a single upload screen in the Cloud console. You give it a name, a zip containing your code and model files, and a few runtime parameters.

![ClickHouse Cloud UDF deployment page with argument list and runtime settings](/uploads/cloud_udf_deployment_95cd9b3a7d.png)

For `embed_trade` we use:

- **Type:** `executable_pool`. Long-lived processes, hot model in memory.
- **Pool size:** `10` per replica. Each process loads the 2MB model at startup (~1.5s) and reuses it for every subsequent call.
- **Runtime:** `python3.11`. Dependencies (`torch==2.4.1`, `numpy==1.26.4`) come from the `requirements.txt` in the zip.
- **Format:** `TabSeparated`. The UDF reads one TSV line per input row on stdin and prints `(embedding, recon_score)` on stdout.
- **14 arguments**, each with an explicit ClickHouse type. The signature matches the autoencoder's training schema exactly. See `udf/cloud-deployment.md` for the full table.

The function is then callable from SQL like any built-in:

```sql
WITH
    fromUnixTimestamp64Milli(t, 'America/New_York') AS ts,
    embed_trade(
        xxHash32(sym), p, s, x, z, toUInt64(length(c)), trfi, trft,
        toUInt8(toHour(ts)), toUInt8(toDayOfWeek(ts, 1)),
        sin((toHour(ts) * 2 * pi()) / 24),
        cos((toHour(ts) * 2 * pi()) / 24),
        sin((toDayOfWeek(ts, 1) * 2 * pi()) / 7),
        cos((toDayOfWeek(ts, 1) * 2 * pi()) / 7)
    ) AS result
SELECT
    sym, i, x, p, s, c, t, q, z, trfi, trft, inserted_at,
    result.2 AS recon_score,
    result.1 AS embedding
FROM stockhouse.trades limit 10;
```

The interesting part isn't *that* you can do this. It's *where* you can put the call.

## Scoring every trade, inline with ingest [#](/blog/executable-udfs-clickhouse-cloud-beta#scoring-every-trade-inline-with-ingest)

We wire `embed_trade` into a materialized view:

```sql
CREATE MATERIALIZED VIEW trades_embeddings_mv
TO trades_embeddings
AS
WITH
    fromUnixTimestamp64Milli(t, 'America/New_York') AS ts,
    embed_trade(
        xxHash32(sym), p, s, x, z, toUInt64(length(c)), trfi, trft,
        toUInt8(toHour(ts)), toUInt8(toDayOfWeek(ts, 1)),
        sin((toHour(ts) * 2 * pi()) / 24),
        cos((toHour(ts) * 2 * pi()) / 24),
        sin((toDayOfWeek(ts, 1) * 2 * pi()) / 7),
        cos((toDayOfWeek(ts, 1) * 2 * pi()) / 7)
    ) AS result
SELECT
    sym, i, x, p, s, c, t, q, z, trfi, trft, inserted_at,
    result.2 AS recon_score,
    result.1 AS embedding
FROM trades;
```

Every `INSERT INTO trades` fires this MV. The Python pool scores the batch and lands the result in `trades_embeddings`. There's no other mover, no other service, no separate scheduler. Just SQL.

This is the part that wasn't possible before executable UDFs landed in Cloud. The equivalent service architecture would be a Kafka consumer reading from `trades`, batching rows, posting to a model server, writing the results back. Same end state, several more moving parts. Here it's one DDL statement.

The performance shape is unsurprising. Cost per row is the model forward pass (a few milliseconds on a warm pool) plus the TSV serialization. ClickHouse batches rows into the UDF in chunks. The pool runs a handful of in-flight invocations in parallel. We backfilled ~6B historical rows at ~35K rows/sec sustained over several hours on a 3-replica cluster with no manual scaling. Same UDF, same MV, same SQL.

## Making "anomalous" mean something [#](/blog/executable-udfs-clickhouse-cloud-beta#making-anomalous-mean-something)

The autoencoder gives us a raw `recon_score` per trade. That's a number between roughly 0.00002 and 1,000,000+ across the dataset. A naive "trades above 0.062 are anomalous" filter (using the global 99th percentile from the model's training distribution) sounds reasonable until you actually look at the data.

A handful of symbols, like BRK.A and LLY, score every single trade above that threshold because their share prices are unusually high. Their entire distribution sits in the right tail of the global one. A "100% anomalous" stat for those symbols is technically correct and practically useless.

So we redefine "anomaly" relative to each symbol's own history. For every symbol, we maintain its **lifetime p95 of `recon_score`**. A trade is anomalous *for that symbol* if it exceeds the symbol's own p95. About 5% of trades qualify in a typical window, by construction. When that fraction spikes well above 5%, the symbol is having a genuinely unusual window.

The per-symbol baseline lives in another ClickHouse table:

```sql
CREATE TABLE trades_baselines (
    sym         LowCardinality(String),
    p50         Float32,
    p95         Float32,
    p99         Float32,
    -- ...
    computed_at DateTime
)
ENGINE = MergeTree
ORDER BY sym;
```

A **refreshable materialized view** repopulates it every hour:

```sql
CREATE MATERIALIZED VIEW trades_baselines_mv
REFRESH EVERY 1 HOUR
TO trades_baselines
AS
SELECT
    sym,
    quantiles(0.5, 0.95, 0.99)(recon_score) AS qs,
    qs[1] AS p50, qs[2] AS p95, qs[3] AS p99,
    -- ...
FROM trades_embeddings
WHERE NOT has(c, 15) AND NOT has(c, 12)   -- exclude auction prints
GROUP BY sym;
```

Refreshable MVs atomically truncate and replace the target table on each refresh. Plain `MergeTree` is the right engine: no `FINAL`, no dedup logic, no read-time overhead.

The leaderboard query then **joins live trades against the baselines table** to count anomalies per symbol relative to their own baseline:

```sql
SELECT
    e.sym,
    countIf(e.recon_score > b.p95) AS anomaly_count,
    round(sumIf(e.s, e.recon_score > b.p95) * 100.0 / sum(e.s), 2) AS pct_of_volume
FROM stockhouse.trades_embeddings AS e
INNER JOIN stockhouse.trades_baselines AS b ON e.sym = b.sym
WHERE e.t = now() - INTERVAL 1 HOUR
GROUP BY e.sym
ORDER BY pct_of_volume DESC
LIMIT 50;
```

This query goes from ~1.7s (recomputing baselines inline as a CTE) to ~0.27s (joining against the pre-computed table). Same answer, roughly 6x faster. The expensive part is materialized exactly once an hour instead of on every page load.

## The webapp [#](/blog/executable-udfs-clickhouse-cloud-beta#the-webapp)

The webapp is a Next.js + Click UI + Highcharts demo. It consumes `trades_embeddings` and the baseline tables.

**The anomaly dashboard** ranks S&P 500 symbols by share of trading volume that exceeds their own baseline.

![Dashboard with bubble chart and detailed table](/uploads/hero_dashboard_5c8d3ce323.png)

The packed-bubble chart sizes and colors each symbol by `pct_of_volume`, the share of total trading volume in the window that came from trades above the symbol's lifetime p95. Symbols with redder, larger bubbles had unusually anomaly-heavy windows. The table on the left carries the same sort, with OHLC, max score, and the per-symbol baseline alongside.

**The symbol drilldown** zooms in on one ticker.

![Symbol drilldown showing candlesticks with volume bars and a table of anomalous trades](/uploads/symbol_drilldown_2844f51814.png)

A candlestick and volume pane sits on top. Both axes overlap a single plot area, with the price axis stretched downward to push candles into the top 65% and volume bars into the bottom 30%. Hover any row in the anomalous-trades table and the corresponding candle's volume bar fills yellow, sized to that trade's share of the bucket's total volume. Crosshairs snap to the candle center.

**The similarity search** opens as a modal over the drilldown when you click a trade.

![Similarity modal with radar chart and similar-trades table](/uploads/similarity_modal_80d0258023.gif)

The radar chart plots each trade's 13 input dimensions, normalized against the symbol's lifetime min, max, and avg per dim. Because the avg always maps to `0.5`, the baseline series renders as a perfect 13-sided polygon at the chart's midpoint. Easy to spot deviations from. Hover a similar-trade row to overlay it. The 50 most-similar trades come from `cosineDistance(embedding, target_embedding)` over the same symbol's embedding column.

**The model drift monitor** tracks the score distribution over time.

![Model drift weekly p50/p95/p99/max chart and per-symbol drift lines](/uploads/drift_monitor_f552fa5951.png)

Weekly p50, p95, p99, and max of `recon_score`, with horizontal reference lines at the static thresholds the model was originally calibrated against. If the p99 line starts climbing week over week, the market has drifted from the model's training distribution and it's time to retrain.

**The auction print monitor** is the home for the extreme tail. Opening (c=12) and closing (c=15) auction prints score in the thousands to millions because of their massive share sizes.

![Auction print monitor with top auctions table and daily counts chart](/uploads/auction_monitor_39b129720c.png)

They'd dominate every other view if we didn't filter them out everywhere else. Here they get their own page.

## One more thing: network-access UDFs (private beta) [#](/blog/executable-udfs-clickhouse-cloud-beta#one-more-thing-network-access-udfs-private-beta)

Everything you've seen so far runs on the deterministic path. `embed_trade` scores rows at ingest, baselines refresh hourly, the webapp reads pre-computed tables. No external calls anywhere on the read path. That's the shape you want for the load-bearing pieces: cheap, predictable, no upstream that can disappear on you.

But once a trade has been flagged as anomalous, the obvious next question is *why*. That answer lives outside ClickHouse — in news APIs, SEC filings, halt notices, social signals. To pull those in we need network access from the UDF.

**Network-access executable UDFs are in private beta on ClickHouse Cloud.** Once enabled, the UDF runtime can make outbound HTTPS calls to any allowed host. We added two new UDFs in this repo to use it:

### `nearby_events` [#](/blog/executable-udfs-clickhouse-cloud-beta#nearby_events)

Given `(sym, t, window_min)`, calls two external sources and returns a JSON array of events near that trade time:

1. **Massive News API** (Polygon recently rebranded as **Massive**; `api.polygon.io` endpoints still respond as before).
2. **SEC EDGAR** (free, public, no API key).

```sql
SELECT
    sym,
    JSONLength(nearby_events(sym, t, 120)) AS n_events
FROM stockhouse.trades_embeddings
WHERE recon_score > 1.0
LIMIT 5;
```

You could *almost* do this with `url()`. The differences that make it a UDF:

- **In-process composition.** Polygon's results and EDGAR's filings get deduped, sorted, and capped in a single Python call. Chaining two `url()` calls in SQL would force the same logic into a `UNION ALL` with downstream `arrayJoin`/`groupArray` plumbing — workable, but uglier.
- **Auth in env.** The Polygon API key is read from `POLYGON_API_KEY` at pool-process startup. It never appears in SQL.
- **Per-process LRU cache.** Each pool worker keeps recent results keyed by `(sym, minute, window)`. The same trade hovered twice in the UI costs one API call, not two.
- **Connection reuse.** A long-lived `requests.Session()` per process keeps HTTP connections alive for the duration of that worker, which is hours.

### `classify_trade` [#](/blog/executable-udfs-clickhouse-cloud-beta#classify_trade)

Given `(sym, t)`, fetches context via `nearby_events`'s internals, then asks **Anthropic Claude** to classify the most likely cause of the anomalous trade. Returns a typed tuple:

```sql
WITH classify_trade('LLY', 1778777944818) AS c
SELECT c.1 AS cause, c.2 AS confidence, c.3 AS summary;
```

The cause is constrained to a fixed taxonomy: `earnings`, `m_and_a`, `halt`, `rumor`, `sector_move`, `block_trade`, `no_news_found`. We enforce this via Anthropic's **tool-use** mechanism. The model is required to call a tool whose `input_schema` includes an `enum` on the `cause` field, so the response is guaranteed to be parseable and the cause is guaranteed to be one of the known values. No regex parsing of free-form prose, no "the model returned something close to 'earnings' but with extra words" follow-up logic.

Remember the similarity modal from the webapp? `classify_trade` and `nearby_events` drive a **"Why anomalous?"** panel pinned to the top of that modal. When you open a trade, the panel hits both UDFs in parallel and shows:

- A badge with the classified cause and a confidence number
- A 1–2 sentence summary written by the model
- A short list of the news headlines and filings that drove the call

![Similarity modal showing the Why-anomalous panel with cause badge, summary, and event list](/uploads/why_anomalous_ef01e5439a.png)

### Why this matters [#](/blog/executable-udfs-clickhouse-cloud-beta#why-this-matters)

`url()` has been in ClickHouse for years and it's good for ad-hoc fetches. What network-access UDFs add is the rest of the picture: stateful clients, auth lifecycle, multi-step pipelines, structured LLM output, and per-process caching. The boundary between "code that needs to run" and "data that needs to be queried" gets thinner.

You can put a 200-line Python function with three API calls and an LLM prompt into a `SELECT`. Nobody else has to learn it exists.

**Want to try it on your cluster?** Network-access UDFs are in private beta — reach out to ClickHouse Cloud support to get it enabled!

## What's interesting about this [#](/blog/executable-udfs-clickhouse-cloud-beta#whats-interesting-about-this)

Most ML-on-streaming-data architectures pay an integration tax. The model lives somewhere. The data lives somewhere else. The glue between them is its own system. The setup in this repo flattens that. There's a ClickHouse Cloud cluster, a 2MB Python file, and one DDL statement that binds them together.

Every piece of UI logic in the webapp is a SQL query. Anomaly detection is the only ML in the system, and even that's not "ML in the webapp", it's a column in a table. The "how anomalous is this symbol's last hour" calculation, the "find me similar trades by cosine distance" query, the per-symbol p95 baseline, the materialized views that keep it all fresh: standard SQL features, running against standard ClickHouse tables.

Executable UDFs in Cloud don't add new abstractions on top of ClickHouse. They give you a way to make Python part of your SQL.

## Reproduce it [#](/blog/executable-udfs-clickhouse-cloud-beta#reproduce-it)

The full project is at [https://github.com/clickhouse/stock-anomaly-udf](https://github.com/clickhouse/stock-anomaly-udf).

```bash
stock-anomaly-udf/
├── notebook/   # Train the autoencoder, export weights, package the UDF zip
├── udf/        # The deployable bundle: main.py, model weights, deployment notes
├── sql/        # Source schema, the auto-embed MV, two refreshable baseline MVs
└── web/        # Next.js demo app
```

### Quickstart [#](/blog/executable-udfs-clickhouse-cloud-beta#quickstart)

1. **Get the UDF onto your cluster.**
    
    -   Zip the contents of `udf/embed_trade/`:
        
        ```bash
        cd udf/embed_trade && zip embed_trade.zip main.py requirements.txt *.pt
        ```
        
    -   Upload via the Cloud UDF deployment UI. Configure per [`udf/cloud-deployment.md`](https://github.com/ClickHouse/stock-anomaly-udf/blob/main/udf/cloud-deployment.md).
2. **Run the SQL files in order:**
    
    ```sql
    :run sql/01_source_schema.sql
    :run sql/02_embeddings_mv.sql
    :run sql/03_score_baselines.sql
    :run sql/04_dim_baselines.sql
    ```
    
3. **Backfill historical data** (optional). Bulk INSERT into `trades_embeddings` using the same SELECT pattern as the MV, scoped to any time range. The MV in step 2 will catch every subsequent INSERT into `default.trades` automatically.
    
4. **Start the webapp:**
    
    ```bash
    cd web
    cp .env.example .env.local   # fill in CH_HOST/PORT/USER/PASS/DB
    npm install
    npm run dev
    ```
    
    Open [http://localhost:3000](http://localhost:3000).
    

The notebook in `notebook/` walks through training your own autoencoder end to end. It streams training data from `default.trades` into Parquet chunks, fits a `StandardScaler` incrementally, trains with early stopping, and zips the artifacts into a deployable bundle.

## Try executable UDFs [#](/blog/executable-udfs-clickhouse-cloud-beta#try-executable-udfs)

Public beta is live in ClickHouse Cloud today. Drop us a note if you put something interesting together with it!
</Release>

<Release date="June 1, 2026" published="2026-06-01T00:00:00.000Z" url="https://dagster.io/blog#how-dagster-compass-powers-brooklyn-data-s-self-service-anal" org="dagster" source="dagster-blog">
## How Dagster Compass Powers Brooklyn Data's Self-Service Analytics

Text-to-analytics promises self-service access to data, but adoption depends on usability, governance, and trust. In this guest post, Brooklyn Data explains how it evaluated Compass, deployed it on top of Snowflake, and enabled teams to answer operational questions directly in Slack while maintaining centralized governance and business context.
</Release>

<Release version="1.13.7" date="May 28, 2026" published="2026-05-28T17:18:25.000Z" url="https://github.com/dagster-io/dagster/releases/tag/1.13.7" org="dagster" source="dagster">
## 1.13.7 (core) / 0.29.7 (libraries)

### New

- Added an `owners` parameter to `build_schedule_from_partitioned_job` to set owners on the resulting schedule. (Thanks, [@dragos-pop](https://github.com/dragos-pop)!)
- [dagster-airlift] Added support for Python 3.12, 3.13, and 3.14.
- [dagster-fivetran] The Fivetran component now supports a `fetch_column_metadata` option to fetch column-level metadata for synced tables.
- [dagster-k8s] When `includeConfigInLaunchedRuns.enabled` is set in the Helm chart, run pods now inherit `nodeSelector`, `tolerations`, and `podSecurityContext` from user deployment configuration.

### Bugfixes

- Fixed a SQL injection vulnerability in `dagster-clickhouse`, `dagster-clickhouse-pandas`, and `dagster-clickhouse-polars` when using dynamic partition keys. ClickHouse partition queries now use driver parameter binding instead of interpolating partition key values into SQL strings.
- Fixed a race condition where readers polling for a run to reach `FAILURE` could observe the failed status before the `dagster/failure_reason` tag was written.
- Fixed a regression where a blocking asset check that did not emit a result (e.g. a dbt test that dbt declined to run under `indirect_selection: cautious` or `buildable`) could silently cause downstream assets to be skipped. Missing blocking check results now allow downstream assets to proceed and log a warning; failed blocking checks continue to fail the step and the run as before.
- Improved deduplication in `AutomationCondition.any_downstream_conditions()` to prevent recursive blowups in condition size when combined with other automation conditions.
- [ui] Fixed extra scrollbars appearing in the asset overview and asset sidebar.
- [dagster-k8s] Fixed an `AttributeError` when using `kubernetes` client v36+, which uses PEP 585 dict syntax (e.g. `dict[str, str]`) in its OpenAPI type annotations. (Thanks, [@vanika02](https://github.com/vanika02)!)
</Release>

<Release date="May 28, 2026" published="2026-05-28T08:56:59.000Z" url="https://clickhouse.com/blog/multi-stage-distributed-query-execution-clickhouse-cloud" org="clickhouse" source="clickhouse-blog">
## Introducing multi-stage distributed query execution in ClickHouse Cloud

# Introducing multi-stage distributed query execution in ClickHouse Cloud

> **TL;DR**  
> Multi-stage distributed execution gives ClickHouse Cloud a new way to scale one query across many nodes. It repartitions intermediate data between stages, removing key bottlenecks in large joins and high-cardinality aggregations.  
>   
> Early TPC-H results show up to 3.4× speedups for join-heavy queries while retaining near-linear aggregation scaling: 7.4× faster on 8 nodes than on 1 node.

## Scaling one query across many nodes

ClickHouse has always been able to scale a single query across multiple nodes. In shared-nothing deployments, users do this with physical sharding and the `Distributed` table engine. In ClickHouse Cloud, parallel replicas brought intra-query scaling to shared storage.

These mechanisms work well for many analytical queries, but they were not the final answer for modern PB-scale workloads. They could fan out work across nodes, but they could not freely repartition intermediate results between execution stages. That limited how far ClickHouse could scale high-cardinality aggregations, and especially large joins.

Multi-stage distributed query execution is the next step. It gives ClickHouse Cloud a new way to parallelize a single query across the CPU and memory of all available nodes, without the bottlenecks of the previous execution models.

In this post, we introduce the new extension of ClickHouse's query execution model and walk through how it works. We use a multi-table join as the running example because joins are among the hardest analytical workloads to scale, but the mechanism is much broader: it is a new foundation for distributed query execution in ClickHouse Cloud.

Before we look at the new mechanics, let's review what came before and why those approaches weren't enough for modern PB-scale workloads.

## Why existing distributed execution was not enough

The existing distributed execution was useful but not elastic enough for PB-scale workloads.

In shared-nothing open source deployments, ClickHouse scales a query by physically [sharding](https://clickhouse.com/docs/shards) data across nodes and querying those shards through a [Distributed table](https://clickhouse.com/docs/engines/table-engines/special/distributed). Each node processes its local slice, and the requester merges the results.

![Blog-Distributed_JOINS-introduction.001.png](/uploads/Blog_Distributed_JOINS_introduction_001_bf34c79f03.png)

That works, but capacity is tied to the shard layout.

> **Bottleneck: capacity is tied to shard layout**  
> Adding compute does not automatically make one query faster. Large tables must first be redistributed across more shards.

Large joins across physically sharded tables expose a second limitation. A join only works when matching rows meet on the same machine. With a distributed JOIN, each node keeps its local left side, fetches the missing right-side shards from the other nodes, builds a full right-side hash table, and returns its local join result to the requester.

![Blog-Distributed_JOINS-introduction.002.png](/uploads/Blog_Distributed_JOINS_introduction_002_68e4f2362c.png)

GLOBAL JOIN reduces the many-to-many network round-trip by computing the right side once and broadcasting it to every node.

![Blog-Distributed_JOINS-introduction.003.png](/uploads/Blog_Distributed_JOINS_introduction_003_69a7171fae.png)

But the core problem remains: large right sides still have to be copied across the cluster.

> **Bottleneck: large right sides are copied everywhere**  
> Distributed JOIN and GLOBAL JOIN handle network traffic in different ways, but both still make every shard join against a full right side.

[ClickHouse Cloud](https://clickhouse.com/cloud) removed the physical sharding problem by moving to [shared storage](https://clickhouse.com/blog/clickhouse-cloud-boosts-performance-with-sharedmergetree-and-lightweight-updates). Any node can access the same table data, and [parallel replicas](https://clickhouse.com/docs/deployment-guides/parallel-replicas) allow multiple nodes to participate in a single query. Nodes can be added or removed instantly, with no data copying or reshuffling.

![Blog-Distributed_JOINS-introduction.004.png](/uploads/Blog_Distributed_JOINS_introduction_004_1af1e0a189.png)

That made [intra-query scaling](https://clickhouse.com/blog/clickhouse-parallel-replicas) in ClickHouse Cloud much more elastic. But parallel replicas still had a structural limitation: they could split work across replicas, but they could not freely repartition intermediate data between execution stages.

This shows up in two places.

First, joins. On a single node, ClickHouse can [parallelize both sides of its default hash join strategy](https://clickhouse.com/blog/clickhouse-fully-supports-joins-hash-joins-part2#parallel-hash-join): it partitions rows by the join key into multiple hash tables, so both build and probe work can run across CPU cores. The same remains true inside each node when parallel replicas are used.

The limitation is one level higher. Across multiple nodes, dividing the build side itself would require a shuffle stage that repartitions both inputs by join key between nodes. Parallel replicas do not have that mechanism. The next best option is to distribute the left-side read ranges after [primary-index pruning](https://clickhouse.com/docs/primary-indexes). That parallelizes probe-side work across nodes, but those ranges are not partitioned by the join key. A row in one left-side range can match rows anywhere in the right-side table, so each node still needs the full right side to build its local hash table(s) before probing its local slice.

![Blog-Distributed_JOINS-introduction.005.png](/uploads/Blog_Distributed_JOINS_introduction_005_8a9fda944e.png)

> **Bottleneck: the build side does not scale out**  
> The left-side probe is divided across nodes, but the build side is not. Every node still builds the same hash table from the full right side, so the build step is repeated instead of divided across the cluster.

Second, aggregations. Nodes can scan and aggregate locally in parallel. But without a shuffle by the GROUP BY key, ClickHouse cannot guarantee that all rows for the same GROUP BY key end up on the same node.

![Blog-Distributed_JOINS-introduction.006.png](/uploads/Blog_Distributed_JOINS_introduction_006_72425a40cd.png)

> **Bottleneck: final aggregation is still single-node**  
> Partial groups must be merged by one coordinator. For high-cardinality GROUP BY, that final merge is bounded by one node's CPU and memory, not the cluster.

Both problems have the same root cause: there is no general way to repartition intermediate data between execution stages. That is what multi-stage distributed execution adds.

## Introducing multi-stage distributed execution

Multi-stage distributed execution adds the missing primitive: it lets ClickHouse Cloud move intermediate data between nodes while a query is running.

Instead of executing a query as one distributed fan-out plus a final merge, ClickHouse splits the query plan into stages running in parallel across worker nodes. Between stages, exchange operators move the intermediate results into the shape required by the next stage.

![Blog-Distributed_JOINS-introduction.007.png](/uploads/Blog_Distributed_JOINS_introduction_007_f6cb8aa8b8.png)

For example, data can be shuffled by a join key so each worker receives the matching slice of both join inputs. It can be shuffled by a GROUP BY key so each worker owns complete groups. Small inputs can be broadcast to all workers. Final results can be gathered by the coordinator.

> **Prior bottlenecks removed: data can move between stages**  
> Large joins no longer need every node to build the full right-side hash table. High-cardinality aggregations no longer need one coordinator to merge all partial groups.

The core abstraction is the exchange operator, a well-known building block in parallel query execution, [introduced](https://scholar.colorado.edu/concern/reports/sj139272m) for the Volcano system and used by MPP databases like Teradata and Greenplum, as well as in SQL Server.

The exchange operator redistributes data between plan stages. Multi-stage distributed execution uses three main exchange types:

1. **GatherExchange** (N-to-1): sends worker output to the coordinator, usually at the top of the plan to produce the final result.

2. **ShuffleExchange** (M-to-N): repartitions rows by a key, such as a join key or GROUP BY key. This is what lets each worker own a complete, disjoint slice of the next operation.

3. **BroadcastExchange** (1-to-N): copies a small input to every worker, useful when one side of a join is small enough to replicate cheaply.

There is also **ScatterExchange**, which spreads rows randomly amongst workers.

Those are the mechanics in the abstract. The easiest way to see why they matter is to follow one query through the stages.

### How one analytical join query avoids the old bottlenecks

Let's make this concrete with a [TPC-H](https://clickhouse.com/docs/getting-started/example-datasets/tpch)-like query that hits both bottlenecks from the previous section: a large join side that should not be copied to every worker node, and an aggregation that should not collapse into a single-node final merge.

The query computes total shipment revenue per nation: it joins lineitem to supplier, joins the result to the small nation table, groups by n_name, and sorts by revenue.

```sql
SELECT n_name, sum(l_extendedprice) AS revenue
FROM lineitem
JOIN supplier ON l_suppkey = s_suppkey
JOIN nation ON s_nationkey = n_nationkey
WHERE l_shipdate >= '1994-01-01' AND l_shipdate < '1995-01-01'
GROUP BY n_name
ORDER BY revenue DESC;
```

The distributed plan (inspected via [EXPLAIN](https://clickhouse.com/docs/sql-reference/statements/explain)) contains one BroadcastExchange, two ShuffleExchanges, and one GatherExchange:

```text
┌─explain──────────────────────────────────────────────────┐
│ Output: n_name, sum(l_extendedprice)                     │
│                                                          │
│ GatherExchange (sorted by (sum(l_extendedprice) DESC))   │
│ └──Sorting (Sorting for ORDER BY)                        │
│    └──Aggregating                                        │
│       └──ShuffleExchange (by hash([n_name]))             │
│          └──JoinLogical                                  │
│             ├──ShuffleExchange (by hash([l_suppkey]))    │
│             │  └──ReadFromMergeTree (sf100.lineitem)     │
│             └──ShuffleExchange (by hash([s_suppkey]))    │
│                └──JoinLogical                            │
│                   ├──ReadFromMergeTree (sf100.supplier)  │
│                   └──BroadcastExchange                   │
│                      └──ReadFromMergeTree (sf100.nation) │
└──────────────────────────────────────────────────────────┘
```

Read from the bottom up, the plan first builds the small supplier ⋈ nation join: nation is broadcast, supplier is read, and each worker produces an enriched supplier ⋈ nation side. That enriched side is then repartitioned by s_suppkey, while lineitem is read and repartitioned by l_suppkey, so matching rows meet on the same worker. The joined rows are then shuffled by n_name for aggregation, and the sorted final result is gathered by the coordinator.

Let's walk through those steps.

#### **Step 1: Join supplier with nation**

ClickHouse first broadcasts the tiny `nation` table to every worker and builds a small local `nation` hash table.

![Blog-Distributed_JOINS-introduction.008.png](/uploads/Blog_Distributed_JOINS_introduction_008_9f01f1a3b7.png)

Each worker then reads its slice of `supplier` …

![Blog-Distributed_JOINS-introduction.009.png](/uploads/Blog_Distributed_JOINS_introduction_009_9d34292f7b.png)

…and probes that local hash table.

The result is an enriched `supplier ⋈ nation` side.

![Blog-Distributed_JOINS-introduction.010.png](/uploads/Blog_Distributed_JOINS_introduction_010_ec2f2a7c59.png)

Nothing has been shuffled yet: each worker still keeps the rows from its original `supplier` slice.

#### **Step 2: Co-locate lineitem with enriched supplier rows**

Next, ClickHouse prepares the larger join `lineitem` ⋈ (`supplier` ⋈ `nation`).

Workers first read slices of `lineitem`…

![Blog-Distributed_JOINS-introduction.011.png](/uploads/Blog_Distributed_JOINS_introduction_011_535d471633.png)

…and both join sides are repartitioned by supplier key: `lineitem` by `l_suppkey`, and the enriched `supplier ⋈ nation` rows by `s_suppkey`.

![Blog-Distributed_JOINS-introduction.012.png](/uploads/Blog_Distributed_JOINS_introduction_012_82b5335d17.png)

After the shuffle, each worker owns a disjoint supplier-key bucket containing the matching rows from both sides, and the enriched `supplier ⋈ nation` rows from Step 1 become the build side.

> **Prior bottleneck removed: no full build-side copy**  
> Previously, each node needed the full right side of the join in memory. After the shuffle, each worker node owns only a disjoint supplier-key bucket, so it builds only its share of the hash table.

#### **Step 3: Join locally within each supplier-key bucket**

After the shuffle, each worker owns one supplier-key bucket. For that bucket, it has both sides of the join: matching `lineitem` rows and the enriched `supplier ⋈ nation` rows.

Each worker can now join locally by probing its bucket-local hash table. No worker needs the full build side, and no additional network exchange is needed for this join.

![Blog-Distributed_JOINS-introduction.013.png](/uploads/Blog_Distributed_JOINS_introduction_013_65c74d64b3.png)

#### **Step 4: Shuffle by GROUP BY key for final aggregation**

The join output is still partitioned by supplier key, not by `n_name`. So the same nation can appear on multiple workers. ClickHouse reshuffles the joined rows by the `GROUP BY` key, `n_name`, so each worker owns complete groups and can compute `sum(l_extendedprice)` independently.

![Blog-Distributed_JOINS-introduction.014.png](/uploads/Blog_Distributed_JOINS_introduction_014_1ef92c67b0.png)

> **Prior bottleneck removed: no single-node final aggregate merge**  
> Previously, nodes could produce partial groups locally, but the same GROUP BY key could appear on multiple nodes, so one coordinator had to merge all partial states. After the shuffle by the GROUP BY key, each worker owns complete groups and can compute the final aggregate for its keys locally.

Here, `n_name` has only 25 distinct values, so the final merge would be small. But for high-cardinality GROUP BY, shuffling by the grouping key avoids the single-coordinator merge bottleneck. We'll come back to this planning tradeoff at the end.

#### **Step 5: Sort locally and gather the final result**

Each worker sorts its aggregated results by revenue. The GatherExchange (line 3) combines the sorted results from all workers at the coordinator to produce the final output.

![Blog-Distributed_JOINS-introduction.015.png](/uploads/Blog_Distributed_JOINS_introduction_015_227587d6be.png)

> **Prior bottleneck avoided: the coordinator only gathers final rows**  
> The coordinator still receives the query result, but the expensive work has already happened across the workers. It gathers sorted, already-aggregated rows; it does not merge large partial groups or build a large join hash table.

The example above shows the logical data movement: shuffle here, broadcast there, gather at the end. Under the hood, ClickHouse Cloud needs a transport layer that can move those intermediate blocks efficiently between workers.

### How does data move between stages?

Exchanges can move data in two ways.

The default is **streaming exchange**. Workers send blocks directly to other workers over TCP using a custom binary protocol. Data starts moving as soon as it is produced: a worker reading lineitem can begin sending blocks into a ShuffleExchange immediately, while the receiving workers start consuming them without waiting for the full input. In other words, exchanges are pipelined rather than "write everything, then read everything."

The second mode is **persisted exchange**. Instead of sending intermediate blocks directly between workers, ClickHouse writes them to shared object storage. This is useful for future fault recovery and for spilling intermediate results when a query exceeds cluster memory.

Streaming exchange is optimized for fast interactive queries and is the default. If a worker fails, the query fails and the client retries it. For these workloads, rerunning the query is usually cheaper than checkpointing every exchange.

### Why ClickHouse Cloud makes this possible

Multi-stage distributed execution depends on workers being interchangeable. A stage can run on any worker only if that worker can access the data and metadata it needs.

#### Shared storage makes workers interchangeable

ClickHouse Cloud already has that foundation. Table data lives in shared object storage, and every node has access to the metadata needed to read it. The coordinator can therefore assign stages dynamically across the cluster: any worker can scan table data, receive shuffled rows, build its share of a hash table, aggregate its assigned groups, or sort its local result.

#### Shuffle improves memory utilization

This also improves memory utilization. When a ShuffleExchange partitions a large join across 8 workers, each worker receives roughly 1/8 of the rows and builds roughly 1/8 of the hash table. A join that would require 16 GiB of memory on one node can instead use about 2 GiB per worker across 8 nodes.

#### Shared storage can avoid some broadcasts

Shared storage also opens up future optimizations. For small tables, a worker may not need to receive a broadcast over the network at all; it can read the table directly from object storage and keep it in the local SSD cache for future reads. That is useful for dimension tables like nation or supplier, where local cached reads may be cheaper than broadcasting the table through the exchange layer.

#### Toward stateless workers

The longer-term direction is fully stateless workers: nodes that can appear on demand, pick up work for a query, read the data they need from shared storage, and disappear again when the work is done. Without fixed ownership or manual data placement. Multi-stage distributed execution is a step toward that model.

### What about single-node queries?

ClickHouse's single-node execution path is unchanged. Columnar MergeTree storage, vectorized execution, and aggressive pipeline parallelism are still the foundation of query performance.

Multi-stage distributed execution is an additional, opt-in path for queries that benefit from scaling across multiple nodes. It extends ClickHouse's execution model and does not replace the single-node engine.

## TPC-H benchmark results for multi-stage distributed query execution

TPC-H is an industry-standard benchmark for analytical query processing. It consists of 22 queries across 8 tables, ranging from simple scans to complex multi-table joins, designed to simulate real-world decision-support workloads.

We ran it at scale factor 100 (~100GB of data), where the various tables have the following row counts:

- `lineitem` (600M rows)
- `orders` (150M)
- `partsupp` (80M)
- `part` (20M)
- `supplier` (1M)
- `nation` (25)

We ran the benchmark on ClickHouse Cloud staging machines with ARM (Graviton), 8 cores, and 32 GB RAM. We're using ClickHouse's SharedMergeTree and server version 26.2.1.261

The table below shows the results from running each query on 1 node (our baseline) and 8 nodes using multi-stage distributed query execution. We run each query three times and take the best time.

| Query | 1 node | 8 nodes | Speedup | Notes |
|-------|--------|---------|---------|-------|
| Q01 | 14.36s | 1.94s | 7.4x | Full scan + agg, near-linear |
| Q02 | 1.33s | 2.31s | 0.6x | Runtime filters are not fully supported |
| Q03 | 3.67s | 1.27s | 2.9x | 3-table join |
| Q04 | 3.13s | 0.74s | 4.2x | EXISTS subquery as join |
| Q05 | 6.16s | 2.31s | 2.7x | 6-table join |
| Q06 | 0.65s | 0.16s | 4.1x | Single-table scan |
| Q07 | 3.21s | 1.24s | 2.6x | 6-table join |
| Q08 | 5.61s | 2.65s | 2.1x | 8-table join |
| Q09 | 15.42s | 4.60s | 3.4x | 6-table join |
| Q10 | 5.90s | 2.39s | 2.5x | 4-table join |
| Q11 | 1.04s | 0.58s | 1.8x | 3-table join |
| Q12 | 2.45s | 0.81s | 3.0x | 2-table join |
| Q13 | 5.18s | 1.56s | 3.3x | 2-table join, two-level agg |
| Q14 | 0.49s | 0.21s | 2.3x | 2-table join |
| Q15 | 0.07s | 0.07s | 1.0x | Already fast |
| Q16 | 1.12s | 0.58s | 1.9x | 3-table join |
| Q17 | 5.99s | 2.88s | 2.1x | 2-table join + subquery |
| Q18 | 16.07s | 16.32s | 1.0x | EXISTS subquery not distributed by rule-based planner |
| Q19 | 8.09s | 1.78s | 4.5x | 2-table join |
| Q20 | 1.54s | 1.10s | 1.4x | 4-table join |
| Q21 | 14.83s | 8.77s | 1.7x | 4-table join with EXISTS/NOT EXISTS |
| Q22 | 1.31s | 0.38s | 3.4x | 2-table join |
| **Total** | **117.6s** | **54.7s** | **2.1x** |  |

### Why is Q02 slower?

Some single-node optimizations are not yet fully supported in distributed mode, e.g., runtime filters (Bloom filter pushdowns across joins). Q02 shows a regression because of this.

### What scales well?

**Near-linear scaling for scan-dominated queries.**  
Q01 (full scan + aggregation of 600M rows) achieves 7.4x on 8 nodes.

The work is almost entirely reading and aggregating, which splits evenly across workers with minimal exchange overhead.

**Good scaling (2-5x) for multi-join queries.**  
Q19 (4.5x), Q04 (4.2x), Q06 (4.1x), Q09 (3.4x), Q22 (3.4x), Q13 (3.3x), Q12 (3.0x), Q03 (2.9x).

For these queries, there is significant shuffle overhead, as every exchange involves serializing data, network transfer, and deserialization, but it's proportionally small compared to the parallelized join computation.

### Where is there room for smarter plans?

The rule-based strategy works well for most queries, but some plans are suboptimal.

Q08 shuffles both sides of a join where one side has only 134K rows after filtering - a broadcast would save reshuffling 600M rows.

Q18's `EXISTS` subquery limits parallelism. Small tables like `supplier` (1M rows) are shuffled over the network even though every worker could read them directly from shared object storage.

These limitations are not fundamental to the execution engine. The engine can execute any plan it's given, the question is which plan to give it.

## What's next?

We are working on a cost-based optimizer for multi-stage distributed query execution, which we expect will further improve query performance.

One important area is choosing the right aggregation strategy automatically. Some queries benefit from shuffling by the GROUP BY key so each worker owns complete groups; others are better served by local partial aggregation followed by a final merge. A cost-based optimizer can choose between these strategies based on cardinality, data size, and cluster shape.

Stay tuned for a future post.

## How can I use multi-stage distributed query execution?

At the time of writing (May 2026), multi-stage distributed execution is **experimental** and only available in ClickHouse Cloud as part of a private preview program.

To request access, reach out to your ClickHouse account team or contact us at [email protected]
</Release>

<Release date="May 28, 2026" published="2026-05-28T03:59:05.000Z" url="https://clickhouse.com/blog/open-house-2026-day-1" org="clickhouse" source="clickhouse-blog">
## Open House 2026 Day 1: real-time data without lock-in and what teams can build next

# Open House 2026 Day 1: Real-time data without lock-in and what teams can build next

## Summary

- ClickHouse Open House 2026 Day 1 brought announcements across ClickHouse Cloud, Postgres, distributed query execution, AI agents, observability, and lakehouse interoperability.
- ClickHouse Postgres moves to public beta delivering over 5x more transactions per second than AWS RDS; multi-stage distributed queries cut TPC-H SF100 runtime from 117.6 seconds to 54.7 seconds; Langfuse V4 delivers 200x query performance improvements.
- ClickHouse Agents powered by Claude entered public beta with a native chat experience and no-code agent builder; write support landed for Microsoft OneLake and Unity Catalog, enabling bi-directional workflows between ClickHouse and external lakehouse systems.

Today at Open House 2026 in San Francisco, we shared a broad set of announcements across ClickHouse Cloud, AI agents, observability, Postgres, and lakehouse interoperability.

Open House has always been a chance for our users to come together and exchange ideas about where real-time data systems are heading, and this year's launches reflect many of the conversations we've been having with users, contributors, customers, and partners over the last year. This post rounds up the major announcements from Day 1, and tomorrow we'll be back with another set of announcements from Day 2.

## Improving the UX for humans and agents

ClickHouse Cloud is now serving two audiences: the teams that operate it in production and the AI agents that query it programmatically. These product updates bring improvements across resilience, observability, schema management, and developer tooling to make ClickHouse Cloud easier to operate and easier to build on.

Cross-Region Replication brings an active-passive failover architecture to ClickHouse Cloud. Data is synchronously replicated to a secondary region, with recovery time measured in minutes and recovery point measured in seconds. We will begin a limited private preview later this year.

Monitoring v2 overhauls observability with opinionated service health signals, schema exploration, query insights, and mutation tracking all in one place.

A Materialized View Pipeline Visualization will give teams a live visual map of data flowing across materialized views, one-click debugging for unhealthy refreshes, and a drag-and-drop pipeline builder for creating new pipelines.

![](/uploads/open_house_day1_may2026_image3_ed975af860.png)

*Coming soon* Schema Management and Optimization introduces a fully guided, end-to-end AI-assisted schema management experience. It includes a workload-aware recommendation engine, a dedicated sandboxing environment, automated impact analysis, and a guided blue-green deployment flow for zero-downtime schema changes.

![](/uploads/open_house_day1_may2026_image10_182df22dd5.png)

On the developer and agent side, the ClickHouse CLI is built for agentic workflows, with official ClickHouse skills installable directly into Cursor, Claude Code, and other coding agents. The same CLI manages both local development and production Cloud services.

Query API Endpoints gets a unified management pane, enterprise-grade RBAC, native IDE integrations, and MCP tooling support. The AI-Enhanced Query Builder UI lets analysts build complex queries visually and move fluidly between the visual builder and ClickHouse Assistant.

And MCP-as-a-Service enables fully managed, domain-specific remote MCP servers to be spun up directly from the ClickHouse Cloud UI, with fine-grained access control and support for custom context.

Together, these updates make ClickHouse Cloud a more complete platform for running production workloads, whether the queries come from a dashboard, a notebook, or an agent.

## Postgres announcements

AI workloads are collapsing the divide between transactional and analytical databases. Applications that once ran predictable, hard-coded queries now generate unpredictable bursts of agent-driven requests that need answers from both sides of the stack. Best-of-breed matters more than ever: Postgres for OLTP, ClickHouse for OLAP. But composing them has traditionally meant standing up each database independently and bridging the two with custom CDC pipelines, message brokers, and orchestration logic.

ClickHouse Postgres, now moving from private preview to public beta, is built to close that gap. The service runs on local NVMe storage, eliminating the primary bottleneck in transactional workloads by colocating storage with compute instead of relying on network-attached volumes.

In early benchmarks, this delivers over 5x more transactions per second than AWS RDS and 2.4x more than the next closest alternative. High availability is supported with up to two standby replicas across availability zones, synchronous commit to the fastest standby, and automatic failover. Continuous WAL archiving to S3 provides point-in-time recovery, branching, and region-survivable durability.

![](/uploads/open_house_day1_may2026_image8_904253765c.png)

On the integration side, a Postgres-native CDC pipeline streams inserts, updates, and deletes directly into ClickHouse with no intermediate infrastructure, handling both parallel initial snapshots and continuous replication.

To make the experience even simpler for developers already building on Postgres, a new open-source extension, pg_clickhouse, makes ClickHouse-backed tables queryable directly from within a standard Postgres session, transparently pushing projections, filters, and aggregations down to ClickHouse for fast analytical queries. Developers continue using familiar Postgres SQL and tooling while the right engine handles the right workload behind the scenes.

For a deeper look at the architecture, benchmarks, and roadmap, read the [dedicated blog post](https://clickhouse.com/blog/postgres-managed-by-clickhouse-beta).

## Distributed queries in ClickHouse

ClickHouse Cloud now supports multi-stage distributed query execution in private preview. The feature scales large joins and high-cardinality aggregations across many nodes by repartitioning intermediate data between execution stages.

Parallel replicas already distribute probe-side work, but every node still rebuilds the full hash table from the right side of the join. High-cardinality aggregations hit a similar wall, with final aggregation collapsing onto a single coordinator. Multi-stage execution removes both bottlenecks.

![](/uploads/open_house_day1_may2026_image1_73ff57b040.png)

Workers exchange intermediate results, re-partition by join or GROUP BY keys, and operate on independent slices of the problem.

The feature builds on ClickHouse Cloud's shared-storage architecture, where any worker can access any data without fixed shard ownership. That makes it possible to distribute query stages dynamically across available compute, pointing toward more elastic execution with interchangeable worker pools.

In early TPC-H testing at scale factor 100, the full benchmark suite ran in 54.7 seconds on eight nodes, down from 117.6 seconds on a single node. Roughly a 2x improvement, with further gains expected as the cost-based optimizer matures.

For more technical details, we recommend users read the [dedicated blog post](https://clickhouse.com/blog/multi-stage-distributed-query-execution-clickhouse-cloud).

## **Join performance and other core improvements**

We presented how JOIN performance in ClickHouse improved by over 6x in the last year through support for correlated subqueries, lazy materialization, runtime filters, and automatic join reordering. These optimizations and features now put ClickHouse on par with established data warehouses across TPC-H and other standard JOIN benchmarks, often matching or exceeding the performance of systems such as Snowflake, BigQuery, and Databricks.

![image5.png](/uploads/image5_760140b60c.png)

We also recapped some of the most recent improvements in the core database, including Full-Text Search being made [generally available](https://clickhouse.com/blog/full-text-search-ga-release), addressing one of the most common requirements for observability and AI workloads.

## Write support for Microsoft OneLake and Unity catalogs

Following our [recent announcement that ClickHouse is now Data Lake Ready](https://clickhouse.com/blog/clickhouse-is-data-lake-ready), we've continued expanding support for open data lake ecosystems and catalog integrations. ClickHouse now supports write operations to Iceberg tables via Unity Catalog. Furthermore, Microsoft users can now also write directly to Iceberg tables managed by Microsoft OneLake.

This makes it possible to support fully bi-directional workflows between ClickHouse and external lakehouse systems. As organizations increasingly adopt open table formats and shared catalog layers, it is important not only for ClickHouse to act as a fast analytical engine on top of these systems, but also to allow users to write data back in open formats to ensure data remains interoperable. Write support for additional catalogs is planned over the coming months.

## ClickPipes

We're bringing ClickPipes natively to GCP! Starting today, new ClickHouse Cloud services on GCP run ClickPipes in the same region for lower-latency ingestion, data locality guarantees, and tighter integration with GCP-native features.

This includes Private Service Connect (PSC) support for secure, direct private connections to GCP-managed services behind a VPC; before, you had to tunnel connections via an SSH bastion host.

Note: If you're using ClickPipes with existing GCP services, please reach out to your account executive to discuss a migration plan.

![](/uploads/open_house_day1_may2026_image9_9179adfb9e.gif)

To deepen our integration with the GCP ecosystem even further, we're also announcing a new ClickPipes connector for Google Cloud Pub/Sub, in Private Preview. If Pub/Sub is part of your stack, you can now subscribe to topics and stream data directly into ClickHouse Cloud with no additional infrastructure. The connector supports all common formats (JSON, Avro, Protobuf) and schema registry integration, with attribute-based message filtering, flexible seek options, and per-key ordered delivery. Like all connectors, it can also be managed programmatically via OpenAPI and the ClickHouse Terraform provider.

## Bring any agent to your real-time data, with zero lock-in

AI agents are a new kind of workload, and they're breaking the old playbook. They fire ten to a hundred times the volume of queries a human analyst would, in bursts, all at once, and most analytical engines simply weren't built for it. Closed data platforms make it worse, locking teams into limited and proprietary models, a single agent, and one rigid way of working, while costs spiral. The agentic era needs a real-time substrate that's fast, affordable, and refuses to lock anyone in.

![](/uploads/open_house_day1_may2026_image2_2a92f8d9fc.png)

Today, we're introducing ClickHouse Agents, a fully managed agentic analytics service on ClickHouse Cloud, powered by Claude and now in public beta.

Every ClickHouse Cloud user gets a native chat experience and a no-code agent builder for shipping agents grounded in their own data, with zero setup. Under the hood is a sandboxed code interpreter, shareable artifacts, skills, memory, and multi-agent workflows, all built on LibreChat - the open-source platform that [joined ClickHouse last November](https://clickhouse.com/blog/clickhouse-acquires-librechat).

Because it speaks MCP natively, context can be pulled from any MCP-compatible system. ClickHouse is purpose-fit for this agentic era, where sub-second queries on billions of rows, petabyte scale, and cost efficiency are key to running agentic analytics in production successfully. Customers don't have to choose between us and the tools they already love. Use our agents out of the box, build or bring your own, get speed, transparency, and full control with your analytics AI stack.

## Langfuse announcements

As we discussed in the previous section, AI agents are introducing a fundamentally different workload pattern for data systems, and observability platforms are feeling that shift just as strongly. A single agent run can produce hundreds of observations across LLM calls and tool calls, and the interesting signal is rarely at the top of the trace but requires deep evaluation workflows in production and in development. Langfuse solves for this problem, and the team [joined ClickHouse in January](https://clickhouse.com/blog/clickhouse-acquires-langfuse-open-source-llm-observability).

Today, we're announcing Langfuse V4, in beta on Langfuse Cloud and soon to be available for self-hosted deployments.

This represents a "simplify for scalability" rewrite that collapses the data model into a single immutable observations table with no joins and no deduplication. The result is millisecond dashboard loads, 200x query performance across many important routes, and a single project that comfortably handles billions of events, all benefiting from ClickHouse's performance. Learn more about the details of this change in this [technical blog post](https://langfuse.com/blog/2026-03-10-simplify-langfuse-for-scale).

On top of V4, we're shipping a tighter improvement loop: Experiments now ship with an improved UI, code-based evaluators, and CI/CD workflows. This means prompt and agent changes can be scored before they reach production, with LLM-as-a-judge supporting categorical, boolean, and free-text scores and alerting, ensuring teams can react to score and latency regressions at agent scale.

The platform itself is becoming agent-native, with v2 of the MCP server exposing most API routes to agents, a rewritten Langfuse CLI, and skills that wrap every Langfuse capability so the same building blocks work for humans and agents. Soon, these APIs will include fast full-text search, allowing agents to explore traces in a more free-form/semantic way.

Finally, Langfuse Cloud users can now sign in using their ClickHouse Cloud identity, making adoption seamless and removing the need to manage a separate login. Langfuse itself remains open source, OpenTelemetry-native, self-hostable, and built to scale with the realities of agent workloads.

## ClickStack announcements

### ClickStack Cloud

Today, we announced the private preview of ClickStack Cloud, a fully managed, serverless observability platform built on ClickHouse. Teams can send OpenTelemetry data to a managed OTLP endpoint, then investigate logs, metrics, and traces through the ClickStack UI without operating collectors, ingestion pipelines, or ClickHouse clusters themselves. ClickStack Cloud enters private preview with managed ingestion, a serverless query experience, and integrated observability workflows built directly on ClickHouse Cloud.

During the private preview, we're focused on two areas that matter for large-scale observability workloads: independently scaling ingestion and query infrastructure, and automatically tuning the underlying datastore based on how teams actually use their telemetry data.

This means improving systems that learn from common query patterns and automatically optimize telemetry data over time. Planned areas of automatic tuning include materializing frequently queried fields, adjusting primary keys around common filters, and adding materialized views and indexes for common dashboard and investigation workflows. These capabilities will be refined alongside early adopters and design partners throughout the private preview period.

Private preview spots are limited. If you are interested in using ClickStack Cloud, sign up for the [preview program](https://clickhouse.com/cloud/clickstack-cloud-waitlist) and tell us about your observability workload.

![](/uploads/open_house_day1_may2026_image6_0cf7b01fd7.png)

For more details, see our [dedicated blog post](https://clickhouse.com/blog/clickstack-cloud-private-preview).

### Managed ClickStack

In addition to ClickStack Cloud entering private preview, our existing Managed ClickStack offering is now generally available.

Managed ClickStack is designed for teams that want direct operational control over their observability stack, including ingestion pipelines, compute sizing, workload isolation, schema design, and datastore tuning. Users manage their own OpenTelemetry collectors and ingestion architecture while using ClickHouse Cloud as the underlying observability datastore. For many large-scale deployments, that control is essential for optimizing performance and achieving market-leading cost efficiency.

![image8.png](/uploads/image8_75f25bb9a6.png)

### PromQL support

While we believe SQL is the lingua franca of data analysis, some workloads benefit from a domain-specific language. PromQL for Prometheus-style metrics is one example.

To close out our observability announcements, we wanted to share an early preview of a major area of investment for 2026: PromQL support in ClickHouse and ClickStack.

During the demo, we showed PromQL queries running against ClickHouse through the ClickStack UI.

![promql.png](/uploads/promql_9967e55514.png)

This is still very early work. There are dragons here. The implementation is incomplete, behavior will change, and there is still a large gap between the current state and something we would consider production-ready.

For anyone curious enough to experiment, the underlying functionality is already available experimentally in ClickHouse, with integration also available in open-source ClickStack.

Most of the recent work has focused on language coverage and compatibility. The goal is straightforward: existing PromQL queries should behave the way users expect when pointed at ClickHouse. Performance work is happening alongside this, so PromQL queries can still benefit from the scale and execution speed of the ClickHouse engine.

## Partner program announcement

ClickHouse also introduced House Mates, its first formal partner community and program. It launches with a founding cohort of more than 25 technology partners and over 35 services, consulting, and channel partners across six continents. The program is organized across three tracks: Technology, Services, and Reseller, each with three tiers: Ignite, Accelerate, and Prime. Benefits scale with tier and include joint go-to-market motions, co-innovation and integration support, enablement and certifications, incentives, and a dedicated partner portal. [Read the announcement blog](https://clickhouse.com/blog/introducing-house-mates) for more details.

![image6.png](/uploads/image6_39ae0d5084.png)

## Wrap-up

Today's announcements span distributed query execution, observability, lakehouse interoperability, agentic analytics, and transactional workloads, all while keeping teams in control of the tools and architectures they use. Whether that means bringing your own agent, your own catalog, your own collectors, or continuing to work from familiar Postgres workflows, the focus remains the same: fast real-time systems without lock-in.

The performance improvements behind these launches were equally significant, including over 5x higher throughput than AWS RDS for Postgres workloads, roughly 2x faster distributed query execution at TPC-H scale factor 100, and up to 200x query performance improvements in Langfuse V4. We'll be back tomorrow with another round of announcements and deeper technical sessions for Day 2.
</Release>

<Release date="May 28, 2026" published="2026-05-28T00:00:00.000Z" url="https://dagster.io/blog#snowflake-runs-your-data-dagster-runs-everything-else" org="dagster" source="dagster-blog">
## Snowflake Runs Your Data: Dagster Runs Everything Else

Snowflake increasingly handles transformation and data freshness internally through features like Dynamic Tables and Cortex. Dagster complements Snowflake by providing orchestration, lineage, automation, and cost visibility across your broader data platform from SQL-defined assets to downstream automation and Snowflake query attribution.
</Release>

<Release date="May 27, 2026" published="2026-05-27T23:33:05.000Z" url="https://clickhouse.com/blog/observability-mcp-server-ai-notebooks" org="clickhouse" source="clickhouse-blog">
## Open House observability announcements: MCP server, AI Notebooks, and ClickStack Cloud

At Open House 2026, we announced ClickStack Cloud in private preview, AI Notebooks in beta, and the ClickStack MCP server: three updates that make observability on ClickHouse faster to get started with, easier to investigate in, and more composable with y
</Release>

<Release date="May 27, 2026" published="2026-05-27T17:00:18.000Z" url="https://clickhouse.com/blog/clickstack-cloud-private-preview" org="clickhouse" source="clickhouse-blog">
## Introducing ClickStack Cloud: Serverless observability powered by ClickHouse

Introducing ClickStack Cloud: a fully managed, serverless observability platform built on ClickHouse where you send OpenTelemetry data to a managed endpoint and immediately explore logs, metrics, and traces without operating any infrastructure.
</Release>

<Release date="May 27, 2026" published="2026-05-27T16:59:43.000Z" url="https://clickhouse.com/blog/introducing-house-mates" org="clickhouse" source="clickhouse-blog">
## Introducing House Mates: the ClickHouse partner community and program

We are launching House Mates, a partner community and program bringing together 60+ ISV, consulting, and channel partners to help customers build, scale, and buy ClickHouse solutions with confidence.
</Release>

<Release date="May 27, 2026" published="2026-05-27T16:59:21.000Z" url="https://clickhouse.com/blog/postgres-managed-by-clickhouse-beta" org="clickhouse" source="clickhouse-blog">
## Postgres managed by ClickHouse is now in beta

Postgres managed by ClickHouse is now in public beta - a fully managed, NVMe-backed Postgres service with native CDC into ClickHouse and a unified query layer via pg\_clickhouse.
</Release>

<Release date="May 27, 2026" published="2026-05-27T14:05:56.000Z" url="https://clickhouse.com/blog/costbench-data-warehouse-cost-performance" org="clickhouse" source="clickhouse-blog">
## Introducing CostBench: an open benchmark for data warehouse cost-performance

Introducing CostBench, an open benchmark that turns cloud data warehouse runtime and billing models into comparable performance-per-dollar results.
</Release>

<Release date="May 27, 2026" published="2026-05-27T14:03:09.000Z" url="https://clickhouse.com/blog/thank-you-for-building-with-us" org="clickhouse" source="clickhouse-blog">
## Thank you for building with us

Today we welcome over 500 data engineers to our annual user conference in San Francisco and announce some exciting product launches and company milestones.
</Release>

<Release date="May 27, 2026" published="2026-05-27T08:38:31.000Z" url="https://clickhouse.com/blog/alexey-apj-ai-tour" org="clickhouse" source="clickhouse-blog">
## 5 Countries, 12 Days, One ClickHouse: Inside Alexey's APJ AI Tour

5 Countries, 12 Days, One ClickHouse: Inside Alexey's APJ AI Tour
</Release>

<Release version="v26.3.12.3-lts" date="May 22, 2026" published="2026-05-22T19:24:41.000Z" url="https://github.com/ClickHouse/ClickHouse/releases/tag/v26.3.12.3-lts" org="clickhouse" source="clickhouse">
## Release v26.3.12.3-lts

</Release>

<Release version="1.13.6" date="May 22, 2026" published="2026-05-22T18:17:05.000Z" url="https://github.com/dagster-io/dagster/releases/tag/1.13.6" org="dagster" source="dagster">
## 1.13.6 (core) / 0.29.6 (libraries)

### New

- Schedule, sensor, and asset daemon ticks now dispatch instigators round-robin across code locations, preventing a single code location with many instigators from delaying launches in other code locations.
- Added `dg api run launch` command for launching runs from the CLI against the Dagster+ API.
- Improved declarative automation performance for assets sharing the same cron schedule.
- [ui] The asset catalog now shows a toggle to hide duplicate connections-defined assets when an SDA and a connection target the same table, and the asset sidebar links to matched assets.
- [dagster-dbt] Added a configurable job pool to `DbtCloudComponent` and `DbtCloudWorkspace` to work around dbt Cloud's one-concurrent-run-per-job limit.

### Bugfixes

- Fixed an issue where asset keys and partitions containing leading "/" characters or ".." characters would sometimes cause IOManagers to read and write outside of the supplied `base_path`.
- Fixed a regression where a blocking asset check that did not emit a result (e.g. a dbt test skipped under `indirect_selection: cautious` or `buildable`) would silently cause downstream assets to be skipped. Missing check results now allow downstream assets to proceed; failed checks still fail the step.
- Softened the `job_snapshot_id` mismatch invariant in execution plan persistence from a crash to a warning, preventing run launch failures when client and server compute slightly different snapshot IDs.
- Pinned `antlr4-python3-runtime` to `<4.14` to prevent potential breakage from future ANTLR minor version bumps.
- [ui] Fixed minor flex alignment issues for tags and metadata buttons on asset catalog detail pages.
- [dagster-k8s] Added support for the PEP 585 `dict[str, str]` syntax used by the `kubernetes` client v36+ when snake-casing model dictionaries, fixing an `AttributeError: module 'kubernetes.client.models' has no attribute 'dict[str, str]'` raised on `kubernetes>=36`. (Thanks, [@vanika02](https://github.com/vanika02)!)

### Documentation

- Added documentation for defining asset dependencies across code locations.
- Added `dg utils` CLI reference documentation.
- Added missing `DAGSTER_POSTGRES_HOST` environment variable to the Docker deployment documentation.
- Expanded Databricks integration documentation with Databricks Connect, serverless compute (Pipes), and UC Volumes patterns.
- [dagster-dbt] Documented the `FROM_ASSET_FAILURE` retry strategy for dbt asset jobs.
</Release>

<Release version="v26.3.11.36-lts" date="May 22, 2026" published="2026-05-22T15:30:23.000Z" url="https://github.com/ClickHouse/ClickHouse/releases/tag/v26.3.11.36-lts" org="clickhouse" source="clickhouse">
## Release v26.3.11.36-lts

</Release>

<Release version="v26.2.19.43-stable" date="May 22, 2026" published="2026-05-22T14:54:42.000Z" url="https://github.com/ClickHouse/ClickHouse/releases/tag/v26.2.19.43-stable" org="clickhouse" source="clickhouse">
## Release v26.2.19.43-stable

</Release>

<Pagination cursor="2026-05-22T14:54:42.000Z|2026-05-22T15:01:54.558Z|rel_whNwgseByLUCMDA-IXrbK" next="https://releases.sh/collections/data-engineering/releases?cursor=2026-05-22T14%3A54%3A42.000Z%7C2026-05-22T15%3A01%3A54.558Z%7Crel_whNwgseByLUCMDA-IXrbK&limit=20" />