Are ClickHouse JOINs Slow? A 2026 PR-by-PR Analysis
TL;DR
The narrative that “ClickHouse can’t do JOINs” is outdated by four years and 50+ merged PRs. From 2022 to 2026, ClickHouse’s join subsystem was rebuilt from the ground up, and the evidence is in the commit history.
We analyzed 50+ GitHub pull requests, official ClickHouse changelogs, and release blogs to trace the full evolution of JOIN support from 2022 through early 2026.
In 2020, the criticism was fair. ClickHouse had one join algorithm (hash join), no disk spilling, no cost-based optimization, and join order followed query syntax. If your right table exceeded memory, the query crashed.
By early 2026, ClickHouse ships six distinct join algorithms, cost-based global join reordering with dynamic programming, runtime bloom filters at the storage layer, parallel hash join as the default, correlated subquery decorrelation, and automatic build-side selection. None of this requires manual tuning.
The single highest-impact change is equivalence-set filter pushdown (PR #61216), which delivered 180×+ speedups by propagating predicates across join sides through column equivalence classes. PostgreSQL and Oracle’s planners use the same technique, and ClickHouse implements it natively in its columnar vectorized engine.
Grace hash join (PR #38191) eliminated OOM crashes for memory-bound joins. Parallel hash join (PR #70788) became the default and scales near-linearly across CPU cores. Neither requires configuration.
Global join reordering with column statistics (PR #86822) produces 1,450× speedups on TPC-H SF100 by automatically finding the optimal join order. The DPsize dynamic programming algorithm (PR #91002) further improves this for complex multi-table queries.
Runtime bloom filters, enabled by default since February 2026 (PR #89314), dynamically prune probe-side data at the storage scan level. The v25.10 release blog reports a 2.1× speedup and 7× memory reduction on star-schema workloads.
Verdict: the “avoid JOINs in ClickHouse” advice made sense in 2020. Repeating it in 2026 is misinformation. ClickHouse’s join engine now operates with the planning sophistication of a mature enterprise RDBMS, and it does so inside the columnar vectorized execution model that makes ClickHouse fast in the first place.
Why People Still Say “Avoid JOINs in ClickHouse”
If you’ve evaluated ClickHouse in the last few years, you’ve heard the warnings:
“Avoid JOINs in ClickHouse”
“ClickHouse doesn’t handle JOINs well”
“Denormalize everything, always use flat tables”
“JOINs are slow in ClickHouse”
“Only hash join available, limited join algorithms”
Some of these started as legitimate ClickHouse documentation circa 2019–2020 that advised caution with joins. Others were amplified by competitors who found a convenient story: ClickHouse is fast for scans, but it can’t join.
In 2020, the criticism was mostly fair. ClickHouse had a single hash join algorithm, no disk spilling, no cost-based optimizer, and join order followed query syntax. If your right table exceeded memory, the query crashed with OOM.
Then ClickHouse’s engineering team spent four years dismantling every one of those limitations. Over 50 significant pull requests merged. They added six join algorithms, a cost-based optimizer with dynamic programming, and runtime bloom filters, plus automatic algorithm selection, build-side selection, join reordering, and predicate pushdown.
This article traces that evolution with PR-level evidence. No marketing claims. No benchmarks on toy datasets. Just the commit history.
Methodology: How We Analyzed ClickHouse’s Join Commit History
We went through ClickHouse’s GitHub commit history, pull requests, changelogs, and release blogs from 2022 through early 2026. The scope covered every PR that touched the join subsystem: algorithm changes, optimizer rewrites, planner passes, correctness fixes, and default configuration changes.
Each PR was classified by category (algorithm, optimizer, parallelism, correctness), impact severity, and whether it changed default behavior. We cross-referenced PR descriptions against changelog entries and release blog benchmarks to verify the claimed improvements. Where multiple PRs addressed the same subsystem, we traced the dependency chain to understand how the incremental changes compounded.
The result is a ranked list of 50 pull requests by impact, organized into eight thematic arcs, with full provenance. Every claim in this article maps to a specific merged PR that you can verify yourself on GitHub.
This isn’t a benchmarking exercise. Benchmarks measure peak performance on controlled workloads. This analysis measures the engineering trajectory: what was built, why, and what it means for teams deciding whether to use JOINs in ClickHouse today.
ClickHouse JOIN Features in 2026: What Ships by Default
The current state, as of early 2026:
Six distinct join algorithms: hash, parallel hash, grace hash (disk-spilling), full sorting merge, direct (key-value), and paste join. Each one is optimized for a different workload shape, and the engine selects automatically.
Cost-based global join reordering: Greedy and dynamic programming algorithms find the optimal join order using column statistics. No manual query rewriting needed.
Runtime bloom filters: Build-side join keys compile into bloom filters that get pushed down to probe-side storage scans, filtering non-matching rows before they reach the join.
Equivalence-set predicate pushdown: Filters propagate transitively across multi-table join chains. WHERE t1.id = 5 pushes to t2, t3, and beyond when joined on equivalent keys.
Parallel execution by default: Parallel hash join scales near-linearly with cores. No configuration required.
Correlated subquery support: EXISTS, scalar subqueries, and projection-list subqueries are automatically decorrelated into joins.
These aren’t experimental features hidden behind flags. They’re defaults that ship with every ClickHouse installation.
ClickHouse JOIN Myths vs. Reality: A 2026 Checklist
“Only hash join available.” Reality: six algorithms since late 2022, selected automatically.
“No query optimizer for JOINs.” Reality: equivalence-set pushdown, OUTER-to-INNER conversion, and a growing planner pipeline since 2024.
“JOINs don’t scale across cores.” Reality: parallel hash join has been the default since late 2024.
“You have to manually optimize join order.” Reality: cost-based reordering with column statistics since v25.9.
“ClickHouse can’t handle star/snowflake schemas.” Reality: runtime bloom filters prune fact-table scans by default.
“ClickHouse can’t do correlated subqueries.” Reality: automatic decorrelation, enabled by default since August 2025.
Each entry is unpacked, PR by PR, in the phases below.
Phase 1 (2022): How Many Join Algorithms Does ClickHouse Support?
The FUD: “ClickHouse only has hash join”
In mid-2022, ClickHouse had exactly one production join algorithm: hash join. The criticism was valid. Then, in roughly six months, three new algorithms landed.
Full Sorting Merge Join: Memory-Bounded Joins (July 2022)
PR #35796 introduced full sorting merge join, a classical sort-merge algorithm integrated into ClickHouse’s pipeline. Both sides sort by join keys (with external sorting if needed), then merge in streaming fashion. Memory is bounded by the sort buffer, not by hash table size.
This mattered for two reasons. First, it was the first non-memory-bound join algorithm in ClickHouse, so you could join tables larger than RAM without crashing. Second, it skips sorting entirely when physical row order already matches join keys, which makes it faster than hash join for pre-sorted data.
A follow-up optimization (PR #39418) builds an in-memory key set from the smaller table to pre-filter the larger table before sorting. That made full sorting merge competitive with hash join on general workloads, not just pre-sorted ones.
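To make the mechanics concrete, here is a minimal Python sketch of the sort-merge idea: sort both sides by join key, then merge them in one streaming pass. This illustrates the classical algorithm only, not ClickHouse’s vectorized C++ implementation; the function name and row shapes are ours.

```python
def sort_merge_join(left, right):
    """Inner-join two lists of (key, value) pairs by sorting, then merging."""
    left = sorted(left)
    right = sorted(right)
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Emit the cross product of the two runs sharing this key.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == rk:
                j_end += 1
            for a in range(i, i_end):
                for b in range(j, j_end):
                    out.append((lk, left[a][1], right[b][1]))
            i, j = i_end, j_end
    return out
```

The merge pass reads each input exactly once, so peak memory is governed by the sort, which is what lets ClickHouse bound it with external sorting, and what makes the algorithm nearly free when the data is already key-ordered.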
Grace Hash Join: Disk-Spilling for Out-of-Memory JOINs (November 2022)
PR #38191 was arguably the most important foundational change of this era. Grace hash join partitions both inputs into buckets via a secondary hash. Only one bucket pair is processed at a time, and inactive buckets spill to disk.
Before this PR, a join where the right table exceeded available memory crashed with OOM. After it, the join completed. It just took longer.
Grace hash initially supported only INNER and LEFT joins. FULL and RIGHT support arrived in July 2023 (PR #51013), and a cache locality optimization (PR #72237) delivered a ~24% speedup in late 2024. It graduated to GA in v24.3, which closed issue #11596, the most upvoted join-related issue in ClickHouse’s history, open since June 2020.
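The partitioning trick is easy to sketch. In this simplified Python illustration the inactive buckets just sit in lists; in ClickHouse they spill to disk. Names and bucket counts are ours:

```python
def grace_hash_join(left, right, n_buckets=4):
    """Partition both inputs by hash(key) % n_buckets, then join one
    bucket pair at a time, so only one pair's hash table is in memory."""
    def partition(rows):
        buckets = [[] for _ in range(n_buckets)]
        for key, val in rows:
            buckets[hash(key) % n_buckets].append((key, val))
        return buckets

    lbuckets, rbuckets = partition(left), partition(right)
    out = []
    for lb, rb in zip(lbuckets, rbuckets):
        # Build an in-memory hash table for just this bucket pair.
        table = {}
        for key, val in rb:
            table.setdefault(key, []).append(val)
        for key, val in lb:
            for rval in table.get(key, []):
                out.append((key, val, rval))
    return out
```

Because both sides use the same partitioning hash, matching keys always land in the same bucket pair, so processing pairs independently loses no matches.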
Direct Join: O(1) Memory Key-Value Lookups (2022)
PR #35363 introduced direct join for EmbeddedRocksDB tables, and PR #38956 extended it to dictionaries with SEMI/ANTI support.
Direct join bypasses hash table construction entirely. It performs O(1) key-value lookups against the storage engine for each left-side row, and memory usage stays constant regardless of right table size.
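The idea in miniature, with a plain Python dict standing in for the RocksDB table or dictionary (the helper name is ours):

```python
def direct_join(left_rows, kv_store):
    """For each probe row, do one point lookup into a key-value store.
    No hash table is built, so memory stays constant regardless of the
    right side's size."""
    out = []
    for key, val in left_rows:
        hit = kv_store.get(key)  # stands in for a RocksDB/dictionary lookup
        if hit is not None:
            out.append((key, val, hit))
    return out
```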
ConcurrentHashJoin: The Foundation for Parallel Hash Join (May 2022)
PR #36415 laid the groundwork for what would become ClickHouse’s most impactful default change. ConcurrentHashJoin creates multiple HashJoin instances, one per thread, and partitions both build and probe sides for concurrent execution. This was the foundation for parallel hash join, which became the default two and a half years later.
By the end of 2022, ClickHouse had five distinct join algorithms where it previously had one. The “only hash join available” criticism had a documented expiration date.
Phase 2 (2023–2024): Does ClickHouse Have a Query Optimizer for JOINs?
The FUD: “ClickHouse has no query optimizer for JOINs”
The catalyst for this phase was the promotion of ClickHouse’s new query Analyzer to production status in v24.9. The Analyzer provides richer semantic information about column relationships than the old parser-based planner did, which enabled a class of optimizations that were previously impossible.
Equivalence-Set Filter Pushdown: 180× Speedup (April 2024)
PR #61216 is the single highest-impact join optimization in this entire four-year period. It introduced equivalence-class-based predicate pushdown across join sides.
The logic is straightforward. When tables are joined on t1.id = t2.id, a filter WHERE t1.id = 5 is equivalent to t2.id = 5. The optimizer recognizes this equivalence and pushes the filter to both sides of the join before execution.
Before this PR, filters were applied only after the join completed, which forced full table scans of both sides. After it, filters propagate to both sides and prune data before it reaches the join. Benchmarks show improvements of 180× or more.
This was later extended to work across chains of multiple INNER JOINs (PR #96596) using a Disjoint Set Union data structure to track transitive equalities. For a query joining t1, t2, and t3 on equivalent keys, a filter WHERE t1.id = 42 now pushes to all three tables.
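The transitive-equality bookkeeping is exactly what a Disjoint Set Union gives you. Here is a small Python sketch of the idea, with column names as DSU elements; this is illustrative, not ClickHouse’s code:

```python
class DSU:
    """Disjoint Set Union with path halving."""
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

def pushdown_targets(join_conditions, filtered_column):
    """Given ON-clause equalities like ('t1.id', 't2.id'), return every
    column equivalent to the filtered one, i.e. everywhere the predicate
    can legally be pushed."""
    dsu = DSU()
    for a, b in join_conditions:
        dsu.union(a, b)
    root = dsu.find(filtered_column)
    return sorted(c for c in dsu.parent if dsu.find(c) == root)
```

With ON conditions t1.id = t2.id and t2.id = t3.id, a filter on t1.id resolves to the full equivalence class {t1.id, t2.id, t3.id}, which is why WHERE t1.id = 42 can prune all three scans.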
Automatic OUTER JOIN to INNER JOIN Conversion (April 2024)
PR #62907 automatically converts OUTER JOINs to INNER JOINs when post-join filter conditions make the outer semantics unnecessary. A LEFT JOIN ... WHERE right_col IS NOT NULL is functionally an INNER JOIN, and the optimizer now recognizes this.
This matters beyond the immediate execution improvement (benchmarks show 32s to 0.006s in some cases) because it enables cascading optimizations. INNER JOINs allow predicate pushdown and join reordering that are structurally impossible for OUTER JOINs. Converting the join type first unlocks the full optimization pipeline downstream.
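You can verify the equivalence in a few lines of Python. The left join pads unmatched rows with None (standing in for SQL NULL); filtering those pads away yields exactly the inner join. The data here is made up for illustration:

```python
def left_join(left, right):
    """LEFT JOIN on (key, value) rows; unmatched rows padded with None."""
    rmap = {}
    for k, v in right:
        rmap.setdefault(k, []).append(v)
    out = []
    for k, v in left:
        matches = rmap.get(k)
        if matches:
            out.extend((k, v, m) for m in matches)
        else:
            out.append((k, v, None))
    return out

def inner_join(left, right):
    return [row for row in left_join(left, right) if row[2] is not None]

# A post-join filter "right_col IS NOT NULL" discards exactly the
# NULL-padded rows, so LEFT JOIN + that filter == INNER JOIN:
left = [(1, 'a'), (2, 'b')]
right = [(1, 'x')]
assert [r for r in left_join(left, right) if r[2] is not None] == inner_join(left, right)
```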
Right-Side Pushdown, OR Conditions, and Common Expression Extraction
The planner intelligence kept accumulating:
PR #50532 extended predicate pushdown to the right side of joins, delivering 27× improvement on applicable queries.
PR #84735 enabled pushdown of OR conditions through joins. Previously only AND conditions could be pushed.
PR #71537 extracted common expressions from WHERE/ON clauses, which reduced redundant hash table instantiation for BI-generated queries with complex OR conditions.
PR #78877 moved equality predicates from WHERE into JOIN ON conditions, enabling more efficient hash table lookups.
Each of these operates automatically. No query hints, and no manual rewriting. The planner just does the right thing.
Phase 3 (Late 2024): Do ClickHouse JOINs Scale Across CPU Cores?
The FUD: “JOINs don’t scale across cores in ClickHouse”
Parallel Hash Join Becomes the Default Algorithm (November 2024)
PR #70788 changed the default join_algorithm from ‘direct,hash’ to ‘direct,parallel_hash,hash’. Every ClickHouse installation now uses parallel hash join by default.
The parallel hash join builds hash tables using multiple threads via hash-based sharding. The probe phase shards the same way for lock-free concurrent execution. No configuration needed, and scaling is near-linear with CPU cores.
This was the most broadly impactful default configuration change in this period. Every hash join query on every ClickHouse installation benefits without any user action.
The path to default status was paved by years of incremental improvements. Hash table size statistics caching (PR #64553) pre-allocates tables on repeat queries. Zero-copy block scattering (PR #67782) eliminated redundant memory copies. An adaptive threshold (PR #76185) falls back to single-threaded hash join for small tables where parallelism would add overhead. Two-level hash maps in v25.1 yielded another ~40% speedup.
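The sharding scheme is easy to sketch. Because both sides shard by the same hash, each shard pair builds and probes independently with no shared state, so no locking is needed. A simplified Python illustration (ClickHouse does this in C++ over vectorized blocks; names are ours):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_hash_join(left, right, n_shards=4):
    """Shard both sides by hash(key) % n_shards; each shard pair is an
    independent hash join that can run on its own thread."""
    def shard(rows):
        shards = [[] for _ in range(n_shards)]
        for key, val in rows:
            shards[hash(key) % n_shards].append((key, val))
        return shards

    lshards, rshards = shard(left), shard(right)

    def join_shard(pair):
        lrows, rrows = pair
        table = {}
        for key, val in rrows:          # build phase, local to the shard
            table.setdefault(key, []).append(val)
        return [(k, v, rv) for k, v in lrows for rv in table.get(k, [])]

    with ThreadPoolExecutor(max_workers=n_shards) as pool:
        results = pool.map(join_shard, zip(lshards, rshards))
    return [row for part in results for row in part]
```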
Parallelizing OUTER JOIN Completion (February 2026)
PR #92068 addressed the last remaining single-threaded bottleneck in parallel hash join. For FULL and RIGHT OUTER joins, the “non-joined rows” (rows from the build side with no match) were previously emitted by a single thread. That created an Amdahl’s Law bottleneck that limited outer join scalability. The fix parallelizes non-joined row emission across all hash table buckets.
Phase 4 (2025–2026): ClickHouse Cost-Based Join Optimization
The FUD: “You have to manually optimize join order in ClickHouse”
Automatic Build-Side Selection for Hash Joins (November 2024)
PR #71577 introduced query_plan_join_swap_table = ‘auto’. The optimizer estimates table sizes and places the smaller table on the build (right) side of hash joins. This was the first step toward automatic join reordering.
Statistics-Driven Global Join Reordering: 1,450× TPC-H Speedup (v25.9)
PR #86822 introduced global join reordering using a greedy algorithm with column statistics. For queries joining three or more tables, the optimizer evaluates estimated cardinalities and selects the join order that minimizes intermediate result sizes.
The numbers on TPC-H SF100: 1,450× speedup and 25× memory reduction compared to syntax-order execution. That kind of improvement turns “don’t use JOINs” into “write whatever join order you want, the optimizer will figure it out.”
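A toy version of the greedy idea, with invented row counts and per-pair join selectivities standing in for column statistics; this sketches the principle, not ClickHouse’s actual cost model:

```python
def greedy_join_order(rows, selectivity):
    """rows: {table: estimated row count}.
    selectivity: {(a, b): fraction of the a x b cross product surviving
    the join}. Start from the smallest table, then repeatedly add the
    table that keeps the estimated intermediate result smallest."""
    def sel(a, b):
        return selectivity.get((a, b), selectivity.get((b, a), 1.0))

    remaining = set(rows)
    current = min(remaining, key=lambda t: rows[t])
    order, inter = [current], rows[current]
    remaining.remove(current)
    while remaining:
        def est(t):
            # Estimated rows after joining t into the current result,
            # using the most selective available join edge.
            s = min(sel(t, o) for o in order)
            return inter * rows[t] * s
        nxt = min(remaining, key=est)
        inter = est(nxt)
        order.append(nxt)
        remaining.remove(nxt)
    return order
```

On a chain like t1 (1,000 rows) joined to t2 (100) joined to t3 (10), the greedy pass starts from t3 and works outward, keeping intermediate results far smaller than syntax-order execution would.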
DPsize Dynamic Programming Join Reordering (v25.12)
PR #91002 added a dynamic programming algorithm (DPsize) for more exhaustive join order search. The greedy algorithm makes locally optimal choices, but DPsize evaluates subsets of joined relations systematically. It produces ~4.7% further improvement over greedy on TPC-H, with bigger gains on complex multi-table queries.
The optimizer tries DPsize first and falls back to greedy if the complexity threshold is exceeded. That’s how mature query planners work.
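DPsize builds best plans bottom-up by subset size: the cheapest plan for every k-table subset is the cheapest way to combine two smaller, already-solved subsets. A compact Python sketch with a pluggable cost function; the cost model in the test is invented for illustration:

```python
from itertools import combinations

def dpsize(tables, cost_of_join):
    """best[frozenset] = (total cost, plan tree). cost_of_join(left, right)
    prices joining two already-planned table sets."""
    best = {frozenset([t]): (0, t) for t in tables}
    for size in range(2, len(tables) + 1):
        for subset in map(frozenset, combinations(tables, size)):
            for lsize in range(1, size):
                for left in map(frozenset, combinations(subset, lsize)):
                    right = subset - left
                    if left not in best or right not in best:
                        continue
                    cost = (best[left][0] + best[right][0]
                            + cost_of_join(left, right))
                    plan = (best[left][1], best[right][1])
                    if subset not in best or cost < best[subset][0]:
                        best[subset] = (cost, plan)
    return best[frozenset(tables)]
```

The enumeration is exponential in the number of tables, which is exactly why the optimizer caps it with a complexity threshold and falls back to greedy beyond it.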
Automatic Statistics Collection for the Join Optimizer
The optimizer is only as good as its statistics. Column statistics moved from manual (ALTER TABLE ADD STATISTICS) to automatic. PR #89332 enabled allow_statistics_optimize by default in v25.10.
Runtime hash table size statistics (PR #93912) close the feedback loop between execution and planning. Actual observed sizes from previous queries inform future optimization decisions.
ClickHouse Runtime Bloom Filters and Star Schema JOIN Performance
The FUD: “ClickHouse can’t handle star/snowflake schemas”
PR #89314, merged February 2026, enabled runtime bloom filters by default. During hash table construction, ClickHouse builds a bloom filter from the build-side join keys and pushes it down to the probe-side scan pipeline. Rows that don’t match the bloom filter are discarded at the storage scan level, before they ever reach the join.
For star-schema workloads where fact tables are orders of magnitude larger than dimension tables, this is transformative. The v25.10 release blog reports a 2.1× overall query speedup and 7× memory reduction.
The implementation was hardened through 10+ follow-up correctness fixes addressing edge cases: Nullable keys (PR #94555), multi-key ANTI joins (PR #98871), const columns, Merge tables, and more. An adaptive mechanism (PR #91578) dynamically disables bloom filters at runtime when they become saturated or aren’t filtering enough rows, which prevents negative ROI on non-selective joins. Coverage was extended to RIGHT OUTER joins (PR #96183).
Runtime filters can also be pushed into PREWHERE (PR #95838), ClickHouse’s storage-layer pre-filtering mechanism, for maximum efficiency.
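Conceptually, the runtime filter is a bloom filter built from build-side keys and consulted during the probe-side scan. A self-contained Python sketch; the bit sizes, hash counts, and hashing scheme here are our illustrative choices, not ClickHouse’s:

```python
import hashlib

class BloomFilter:
    def __init__(self, n_bits=1024, n_hashes=3):
        self.bits = bytearray(n_bits // 8)
        self.n_bits, self.n_hashes = n_bits, n_hashes

    def _positions(self, key):
        for i in range(self.n_hashes):
            h = hashlib.blake2b(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.n_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(key))

def runtime_filtered_scan(probe_rows, build_keys):
    """Build a bloom filter from the build-side join keys, then use it to
    discard probe rows during the scan, before the join runs."""
    bf = BloomFilter()
    for k in build_keys:
        bf.add(k)
    return [(k, v) for k, v in probe_rows if bf.might_contain(k)]
```

False positives only cost wasted work downstream; false negatives are impossible, so the filter can never drop a matching row. That asymmetry is what makes it safe to push all the way into the storage scan.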
Does ClickHouse Support Correlated Subqueries? (2025 Decorrelation)
The FUD: “ClickHouse can’t do correlated subqueries”
This one was true until April 2025. PR #76078 introduced the first correlated subquery decorrelation support, converting EXISTS with correlated references into joins. Scalar subquery support followed (PR #79600), then projection-list subqueries (PR #79925).
PR #85107 promoted correlated subqueries to beta with default enablement in August 2025. That closed issue #6697, one of the longest-standing SQL compatibility gaps in ClickHouse, open since 2019.
Teams migrating from PostgreSQL, MySQL, or Snowflake no longer need to manually rewrite correlated subqueries into explicit joins. The planner does it automatically.
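The transformation is easy to see in miniature. The correlated form re-evaluates the subquery once per outer row; the decorrelated form evaluates it once and turns it into a semi-join. Python stands in for SQL here, and the orders/customers shape is a hypothetical example:

```python
def exists_correlated(orders, customers):
    """Naive correlated evaluation: re-scan customers per order, which is
    what SELECT ... WHERE EXISTS (... WHERE c.id = o.customer_id) means
    read literally."""
    return [o for o in orders
            if any(c["id"] == o["customer_id"] and c["active"]
                   for c in customers)]

def exists_decorrelated(orders, customers):
    """Decorrelated form: the subquery becomes the build side of a
    semi-join, scanned once instead of once per outer row."""
    active_ids = {c["id"] for c in customers if c["active"]}
    return [o for o in orders if o["customer_id"] in active_ids]
```

Both return the same rows; the decorrelated version just replaces an O(outer × inner) re-scan with one build and O(1) membership probes.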
ClickHouse Hash Join Internals: Low-Level Optimizations
Beyond the headline features, ClickHouse’s most-used join algorithm received systematic low-level optimization that compounds across every query:
Main loop specialization (PR #82308): Compile-time elimination of null_map and join_mask checks for single-key joins. No more unnecessary branches on every row.
JoinUsedFlags vector optimization (PR #83043): Replaced hash-based flag tracking with atomic vectors, removing per-access hash computation in FULL/RIGHT joins.
Output size enforcement (PR #56996): max_joined_block_size_rows prevents catastrophic memory spikes from ALL JOIN row replication.
Cache locality improvements (PR #60341): Right-table reranging by join keys for cache-friendly access patterns.
Dynamic dispatch (PR #79573): Optimized ColumnVector::replicate in the hash join hot path, lowering CPU per output row.
None of these individually make a press release. Together, they compound into a materially faster join engine at every level of the stack.
ClickHouse JOIN Limitations and Trade-offs in 2026
Fairness matters. A few things still require awareness:
Denormalization still has value for extreme latency targets. If you need sub-10ms p99 on dashboard queries and you can afford the storage, flat tables remain faster than joins. The optimizer is good, but it isn’t free.
Join reordering depends on statistics. When statistics are missing or stale, the optimizer can pick suboptimal plans. The system increasingly collects statistics automatically, but monitoring is still your responsibility.
Correlated subqueries are beta. They work for common patterns like EXISTS and scalar subqueries, but edge cases exist. For complex correlated logic, explicit join rewrites may still be necessary.
Grace hash join trades speed for completion. Disk-spilling joins complete instead of crashing, but they’re slower than in-memory execution. If you consistently need to spill, you need more memory or a different data model.
Correctness fixes are ongoing. The volume of bug fixes following runtime filter enablement (10+ PRs) shows how complex cross-cutting optimizations get when they’re enabled by default. ClickHouse’s engineering team has been rigorous about correctness, but running the latest stable release matters.
These are real engineering trade-offs, and understanding them is part of making an informed decision.
ClickHouse JOIN Improvements Timeline (2022–2026)
2022: Full sorting merge join (PR #35796), grace hash join (PR #38191), direct join (PR #35363), ConcurrentHashJoin groundwork (PR #36415).
2023–2024: Planner intelligence: equivalence-set filter pushdown (PR #61216), OUTER-to-INNER conversion (PR #62907), right-side pushdown (PR #50532).
Late 2024: Parallel hash join becomes the default (PR #70788); automatic build-side selection (PR #71577).
2025: Correlated subquery decorrelation (PR #76078), greedy cost-based join reordering (PR #86822), automatic statistics collection (PR #89332).
2026: DPsize dynamic programming reordering (PR #91002), runtime bloom filters by default (PR #89314), parallel outer-join completion (PR #92068).
When Should You Use JOINs in ClickHouse?
For most analytical workloads on a recent release, just write the joins: the engine picks the algorithm, the build side, and the join order for you. Reserve denormalized flat tables for extreme latency targets, such as sub-10ms p99 dashboards, where the storage cost is acceptable. And if your joins consistently spill to disk through grace hash, treat that as a capacity or data-model signal, not as proof that joins don’t work.
How to Respond to “Avoid JOINs in ClickHouse”
Run the PR numbers.
When someone tells you ClickHouse can’t do joins in 2026, ask them if they’ve tested against a version that includes parallel hash join (default since v24.12), equivalence-set predicate pushdown (v24.4), grace hash join (GA since v24.3), runtime bloom filters (default since February 2026), or cost-based join reordering (v25.9).
If they’re benchmarking against ClickHouse 23.x or earlier, or repeating 2020-era blog posts, they aren’t evaluating ClickHouse. They’re evaluating a system that no longer exists.
The commit history doesn’t lie. 50+ pull requests. Six algorithms. Cost-based optimization. Runtime filtering. Automatic algorithm selection, build-side selection, join reordering, and predicate pushdown.
ClickHouse’s join subsystem in 2026 bears no resemblance to the one that earned those early warnings. The engineers built a modern join engine, and the evidence is in the PRs.
Test it on your workload. That’s the only benchmark that matters.
ClickHouse JOINs FAQ
Are JOINs production-ready in ClickHouse in 2026?
Yes. ClickHouse’s join subsystem has been transformed since 2022. Six join algorithms, cost-based global join reordering, runtime bloom filters, and parallel execution all ship enabled by default. The “avoid JOINs” advice is outdated by four years and 50+ merged PRs.
What is the most impactful ClickHouse join optimization?
Equivalence-set filter pushdown (PR #61216), which delivers 180×+ speedups by propagating predicates across join sides. For multi-table workloads, global join reordering (PR #86822) with 1,450× speedup on TPC-H SF100 is equally transformative.
Does ClickHouse still crash with OOM on large joins?
No. Grace hash join (PR #38191) introduced disk-spilling in November 2022 and graduated to GA in v24.3. Full sorting merge join (PR #35796) provides a memory-bounded sort-merge alternative. Both algorithms guarantee completion regardless of data size.
How many join algorithms does ClickHouse support?
Six: hash join, parallel hash join (default), grace hash join (disk-spilling), full sorting merge join, direct join (O(1) memory key-value), and paste join (positional). The engine selects the appropriate one automatically.
Does ClickHouse have cost-based join optimization?
Yes, since v25.9. The optimizer uses column statistics for greedy join reordering and a DPsize dynamic programming algorithm (v25.12) for exhaustive search. Statistics collection is automatic since v25.10 (PR #89332). Runtime hash table statistics (PR #93912) feed execution data back into the planner.
Should I denormalize tables in ClickHouse instead of using JOINs?
It depends on your latency requirements. For sub-10ms p99 dashboard queries, flat tables remain the fastest path. For analytical workloads where query readability, data freshness, and storage efficiency matter, normalized star/snowflake schemas with JOINs are now well-optimized. The “always denormalize” advice is no longer a blanket recommendation.
Can ClickHouse handle star and snowflake schemas?
Yes. Runtime bloom filters (default since February 2026, PR #89314) specifically target star-schema patterns by filtering fact table rows at the storage layer using dimension table keys. Combined with cost-based join reordering and predicate pushdown, star/snowflake schemas are a first-class workload.
Does ClickHouse support correlated subqueries?
Yes, since mid-2025. Correlated subquery decorrelation (PR #76078) automatically converts EXISTS, scalar, and projection-list subqueries into joins. Promoted to beta with default enablement in August 2025 (PR #85107). Closes a feature gap that had been open since 2019.
Analysis based on 50+ GitHub pull requests, official ClickHouse changelogs, and release blogs covering the period 2022–2026. Every claim maps to a specific merged PR. Verify the evidence yourself; the commit history is public.