OLAP on Tap: You're just my type
I've been learning a bunch about OLAP fundamentals. I will be trying to encode that learning into blog posts and agent guides. Let me know your thoughts!
When I was prototyping schemas in PostgreSQL, I'd often be fast an loose with my typing. Storage was cheap, and the flexibility afforded to me in not thinking about the types was more important to me then saving those few cents.
What was important to me was retrieval, and in OLTP, the important thing were the indices. The type efficiency wasn't that material.
When you move to OLAP, you move to a different paradigm. It isn't about finding the one record, using an index. It's about retrieving thousands or millions of records out of potentially billions, and doing aggregations over them.
So, like PostreSQL's indices, we have order by fields. But they help us find batches of data.
Then, you need to extract from each of those potentially millions of rows in that batch. The size of each row therefore becomes critical; with inefficient types potentially doubling or more the size of the data, bloating storage, CPU usage, and query latency.
That’s why typing matters in ClickHouse. Use UInt16 instead of UInt32 when you can.Prefer Float32 over Float64 unless you need the precision.And for small sets of repeating strings (states, manufacturers, etc.),wrap them in LowCardinality(String) — ClickHouse will store them as a dictionary under the hood.
ClickHouse can help you out here a bit too, with low cardinality tags allowing you to decrease the size of a column dramatically by mapping the data in that column (even if they are typed like a string) into a dictionary. So if you only have a few strings (like states, manufacturers, etc), this becomes an optimal, easy solution. More on that later.
The downsides to strict typing are:
- don't be too strict, it is not trivial to change a type of a column in ClickHouse
- it takes some thought to get the right types
That second point becomes moot with LLM assisted development. And, to help your agents with typing correctly, I've created a guide you can import into your project, ready to be used by your copilot of choice 👇
🐙 https://github.com/oatsandsugar/olap-agent-ref/blob/main/1-type-rules.md
References
-
🔗 Selecting Types: https://clickhouse.com/docs/best-practices/select-data-types
-
🔗 Low Cardinality: https://clickhouse.com/docs/sql-reference/data-types/lowcardinality
Appendix: Example with UInt and Float
Use UInt16 instead of UInt32 when you can If your IDs, counts, or measurements never exceed 65,535, UInt16 is enough. Each value takes half the space. For example, if you’re tracking sensor states, small category codes, or regional IDs, you don’t need 4 billion possibilities.
Prefer Float32 over Float64 unless you need extra precision In analytics, you usually care about trends and aggregates, not 17 decimal places. Float32 is accurate to ~7 digits — plenty for metrics like latitude/longitude, temperatures, percentages, and financials rounded to cents. Reserve Float64 for scientific or high-precision math.
Interested in learning more?
Sign up for our newsletter — we only send one when we have something actually worth saying.
Related posts
All Blog Posts
OLAP
OLAP migration complexity is the cost of fast reads
In transactional systems, migrations are logical and predictable. In analytical systems, every schema change is a physical event—rewrites, recomputations, and dependency cascades. This post explains why OLAP migrations are inherently complex, the architectural trade-offs behind them, and how to manage that complexity.

OLAP, Product
Ship your data with Moose APIs
You’ve modeled your OLAP data and set up CDC—now it’s time to ship it. Moose makes it effortless to expose your ClickHouse models through typed, validated APIs. Whether you use Moose’s built-in Api class or integrate with Express or FastAPI, you’ll get OpenAPI specs, auth, and runtime validation out of the box.