Does anyone use clickhouse in production? I was initially pretty impressed, but when I really put it through its paces I could OOM it as soon as I actually started querying non-trivial amounts of data:
Yep. Clickhouse is absolutely great for tons of production use cases.
Unless you try to join tables in it, in which case it will immediately explode.
More seriously, it's a columnar data store, not a relational database. It'll definitely pretend to be "postgres but faster", but that's a very thin and very leaky facade. You want to do a massively complex set of selects and conditional sums over one table with 3B rows and a TB of data? You'll get a result in tens of seconds without optimization. You want to join two tables that postgres could handle easily? You'll OOM a machine with a TB of memory.
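To make the "good" case concrete, the kind of query it chews through without breaking a sweat is roughly this (table and column names are made up for illustration):

    -- Illustrative only: conditional aggregates over one wide fact table.
    SELECT
        toDate(event_time)                     AS day,
        countIf(status = 'error')              AS errors,
        sumIf(bytes, endpoint = '/api/ingest') AS ingest_bytes,
        uniq(user_id)                          AS approx_users
    FROM events
    WHERE event_time >= now() - INTERVAL 30 DAY
    GROUP BY day
    ORDER BY day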
So: good for very specific use cases. If you have those use cases, it's great! If you don't, use something else. Many large companies have those use cases.
Yeah, I think that's a good summary. For instance, ClickBench comprises >40 queries and there's not a single join in them: https://github.com/ClickHouse/ClickBench/blob/main/clickhous...
There is the "versions benchmark," which includes a lot of queries with JOINs and compares ClickHouse performance on them: https://benchmark.clickhouse.com/versions/
The majority of our queries have joins (plus our core logic often depends on fact-table expansion with `arrayJoin()`s) before aggregations, and we're doing fine. AFAIK whenever we hit memory issues, they're mostly due to high-cardinality aggregations (especially with `uniqExact`), not joins. But I'm sure it can depend on the specifics.
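For what it's worth, the high-cardinality pain is often fixable by trading exactness for bounded memory; a rough sketch with made-up column names:

    -- uniqExact keeps every distinct value in memory, so it grows with cardinality;
    -- uniq uses a fixed-size sketch, so memory stays bounded (at the cost of a small error).
    SELECT
        uniqExact(user_id) AS exact_users,
        uniq(user_id)      AS approx_users
    FROM events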
Definitely agree with this, I think ClickHouse can do a lot with joins if you don't implement them naively. Keeping the server up-to-date is a part of it too.
They've made strides in the last year or two to implement more join algorithms and to re-order your joins automatically (including what's on the "left" and "right" of the join, which affects the performance of the algorithm).
Their release notes cover a lot of the highlights, and they have dedicated documentation on joins[1]. But we've made order-of-magnitude improvements before just by reordering our joins to align with how ClickHouse processes them.
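To give one concrete (and hedged) example of what "aligning with how ClickHouse processes them" means: the right-hand side of a JOIN is what gets built into the in-memory hash table, so keeping the smaller table on the right helps a lot, and newer versions let you pick spill-to-disk join algorithms per query. Table names below are illustrative:

    -- Small dimension table on the right; the right side is what gets materialized.
    SELECT f.user_id, d.country, sum(f.amount) AS total
    FROM fact_events AS f
    INNER JOIN dim_users AS d ON f.user_id = d.id
    GROUP BY f.user_id, d.country
    SETTINGS join_algorithm = 'grace_hash'  -- or 'partial_merge' to spill to disk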
> More seriously, it's a columnar data store, not a relational database.
Could you explain why you don't think ClickHouse is relational? The storage is an implementation detail. It affects how fast queries run but not the query model. Joins have already improved substantially and will continue to do so in the future.
The storage is not just an implementation detail because it affects how fast things run, which affects which tasks it's better or worse for. There's a reason people reach for a columnar datastore for some tasks and something like postgres or mysql for other tasks, even though both are technically capable of nearly the same queries.
It's used in production by many thousands of companies at this point. The ClickHouse Inc numbers are just a fraction of the total users.
p.s., It's also possible to break ClickHouse as you demonstrated. It used to be a lot easier.
I guess I'm curious how; I breathe on it wrong and it OOMs.
One of the tradeoffs for ClickHouse versus databases like Snowflake is that you have to have some knowledge about the internals to use it effectively. For example, Snowflake completely hides partitioning but on the other hand it does not deliver consistent, real-time response the way a well-tuned ClickHouse application can.
When you use INSERT ... SELECT in ClickHouse you do need to pay attention to the generated table partitions, as they coexist in memory before flushing to storage. The usual approach is to break up the insert into chunks so you can control how many parts are generated or to adjust the partitioning in the target table.
It's possible the problem might be somehow related to this behavior but that's just conjecture. It's usually pretty easy to work around. Meanwhile if it's a bug it will probably get fixed quickly.
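For reference, the "break it up into chunks" approach is usually just bounding the source range per INSERT so each statement only touches a few partitions; a sketch with made-up table names:

    -- Backfill one month at a time instead of one giant INSERT ... SELECT.
    INSERT INTO events_by_day
    SELECT toDate(event_time) AS day, user_id, count() AS hits
    FROM raw_events
    WHERE event_time >= '2024-01-01' AND event_time < '2024-02-01'
    GROUP BY day, user_id

    -- max_partitions_per_insert_block also acts as a guard rail here: the insert
    -- fails fast if a single block would fan out into too many partitions.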
You have to have knowledge of the internals of any database you use. Not knowing is going to cost someone a lot of money and/or performance.
One easy way to achieve this is to store really large values, e.g. 10 MB per row. Since ClickHouse operates on large blocks, you'd easily cause an OOM just by trying to read chunks of 8192 rows (the default) at a time, especially during merges, where it needs to read large blocks from several parts at once.
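If you're stuck with huge values, you can at least shrink how many rows ClickHouse pulls into memory at once; a rough sketch (table name and numbers are arbitrary):

    -- Smaller granules and merge blocks for a table with ~10 MB rows: trades some
    -- throughput for a much lower peak memory footprint on reads and merges.
    CREATE TABLE blobs
    (
        id   UInt64,
        body String
    )
    ENGINE = MergeTree
    ORDER BY id
    SETTINGS index_granularity = 256, merge_max_block_size = 256;

    -- And cap the block size on the read side for a specific query:
    SELECT length(body) FROM blobs SETTINGS max_block_size = 1024;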
You don’t need a good product to have a lot of users, just good marketing and salespeople.
Yes (via Clickhouse Cloud, which is pretty reasonably priced).
It’s important to structure your tables and queries in a way that aligns with the ordering keys, in order to minimize how much data needs to be loaded into RAM. You absolutely CANNOT just replicate your existing postgres DB and its primary keys or whatever over to CH. There are tricks like projections and incremental materialized views that can help you get the appropriate “lenses” for your queries. We use incremental MVs to, for example, continuously aggregate all-time stats about tens of billions of records (sketch below). In general, for CH, space is cheap and RAM is expensive, so it’s better to duplicate a table’s data with a different ordering key than to make an inefficient query.
As long as the queries align with the ordering keys, it is insanely fast and can serve analytics queries over truly massive amounts of data. We’ve been very impressed.
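For anyone curious, the incremental MV pattern mentioned above looks roughly like this (the schema is a made-up example, not our actual tables):

    -- Target table holds pre-aggregated state, keyed by what we query on.
    CREATE TABLE stats_by_user
    (
        user_id   UInt64,
        events    AggregateFunction(count),
        last_seen AggregateFunction(max, DateTime)
    )
    ENGINE = AggregatingMergeTree
    ORDER BY user_id;

    -- Incremental MV: runs on every insert into the raw table.
    CREATE MATERIALIZED VIEW stats_by_user_mv TO stats_by_user AS
    SELECT
        user_id,
        countState()         AS events,
        maxState(event_time) AS last_seen
    FROM raw_events
    GROUP BY user_id;

    -- Query side merges the partial states.
    SELECT user_id, countMerge(events), maxMerge(last_seen)
    FROM stats_by_user
    GROUP BY user_id;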
I find Clickhouse fascinating, really good, and also really tough to run. It's a non-linear memory hog. It probably needs 32 GB of RAM for the basics to run; otherwise it will OOM on a minimal amount of data. That said, it won't "OOM", as in crash. It will just report that the query would use too much memory and abort it.
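That abort-instead-of-crash behavior comes from the per-query memory limits, which you can tune; a sketch with arbitrary numbers:

    -- Cap a single query at ~4 GB; exceeding it raises a MEMORY_LIMIT_EXCEEDED
    -- exception for that query instead of taking the server down.
    SET max_memory_usage = 4000000000;

    -- Or let heavy GROUP BYs spill to disk instead of failing outright:
    SET max_bytes_before_external_group_by = 2000000000;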
Thanks for creating this issue, it is worth investigating!
I see you also created similar issues in Polars: https://github.com/pola-rs/polars/issues/17932 and DuckDB: https://github.com/duckdb/duckdb/issues/17066
ClickHouse has a built-in memory tracker, so even if there is not enough memory, it will stop the query and send an exception to the client, instead of crashing. It also allows fair sharing of memory between different workloads.
You need to provide more info on the issue for reproduction, e.g., how to fill the tables. 16 GB of memory should be enough even for a CROSS JOIN between a 10 billion-row and a 100-row table, because it is processed in a streaming fashion without accumulating a large amount of data in memory. The same should be true for a merge join.
However, there are places where a large buffer might be needed. For example, if you insert data into a table backed by S3 storage, it requires a buffer that can be on the order of 500 MB.
There is a possibility that your machine has 16 GB of memory, but most of it is consumed by Chrome, Slack, or Safari, and not much is left for ClickHouse server.
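If you want to see where the memory actually went, the query log records peak usage per query; something along these lines (exact columns can vary a bit between versions):

    SELECT
        event_time,
        query_duration_ms,
        formatReadableSize(memory_usage) AS peak_memory,
        substring(query, 1, 80)          AS query_head
    FROM system.query_log
    WHERE type = 'QueryFinish'
    ORDER BY memory_usage DESC
    LIMIT 10;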
Heard through the grapevine that Cloudflare uses it for their analytics.
They don't make a secret of it: https://blog.cloudflare.com/log-analytics-using-clickhouse/
Clickhouse is great, but like any database if you run it at scale someone must tend to it.