Derive Moose OLAP Models from SQLModel in Python

TL;DR

SQLModel keeps OLTP ergonomic, but Python’s runtime type system means MooseStack can’t pull OLAP shapes from it the way the TypeScript/TypeORM example does. Every schema change still means touching app/main.py (and its Moose models) by hand.

Code gen is a possibility, but would require investment (let us know if this would be something that would make interacting with MooseOLAP and your OLTP ORM easier, or if you are OK with manual mappings).

Quick links: MooseStack Repo | Demo Repo | Docs.

This is a follow-up to our previous blog that walked through deriving OLAP data models from OLTP ORMs with TypeScript. There, everything stayed in one strongly typed world: the ORM definitions drove the OLTP data model, Moose could import those same types directly, and the compiler guarded every hop from the OLTP database to ClickHouse with almost no extra effort.

Today we're retracing the path with Python, but this time using SQLModel. The end goal is identical, but we lose some of that "for free" flow: type safety has to be regenerated explicitly, and we have to bridge Python's models into OLAP.

SQLModel collapses SQLAlchemy's ORM and Pydantic's validation into a single class hierarchy, so the OLTP API and database stay in sync without code generation. Unlike the TypeScript walkthrough, which could reuse TypeORM-generated types end to end, MooseStack still needs explicit payload definitions here, so we restate the SQLModel fields into an OLAP-friendly shape manually.

There are some prototype helpers in-flight to generate Moose definitions directly from the OLTP ORM metadata, but this walkthrough keeps the seams manual so you understand (and can later automate) the touchpoints.

This is a four-step process:

Model your OLTP schema with SQLModel (one class per table).
Project those definitions into Moose-friendly Pydantic models (one class per OLAP payload).
Add CDC metadata (lsn, is_deleted, CdcMetadata) so Moose can deduplicate and replay safely.
Declare Moose OLAP primitives (tables + streams) that land those payloads in ClickHouse.

FastAPI still serves CRUD traffic in the demo app, but the API layer is orthogonal to this OLTP → OLAP story, so we keep the focus on the Moose-facing pieces here.

Why SQLModel helps but OLAP still needs intent

SQLModel was built to remove the "ORM + Pydantic" duplication. An SQLModel subclass is simultaneously:

a SQLAlchemy table definition,
a Pydantic model for requests/responses, and
a type-safe object you can pass around your Python app.

That covers OLTP ergonomics, but OLAP has different expectations, for example: ClickHouse prefers strict types like unsigned ints, deterministic defaults, and denormalised records.

So we keep SQLModel as the transactional source of truth and layer Moose on top to express OLAP-specific choices.

Step 1: Define your OLTP schema with SQLModel

Every entity lives in a single SQLModel class that powers both the database and FastAPI. Example customer model (apps/sqlmodel-example/src/db/models.py:10):

The same file defines products, orders, and order items with their relationships (apps/sqlmodel-example/src/db/models.py:42). Relationships stay in SQLModel because they describe the OLTP graph and are ignored once we cross into OLAP.

Step 2: Manually rewrite SQLModel classes into Moose Pydantic models

In TypeScript, we were able to do this step automatically, relying on TypeScript types. In Python, this isn’t the case.

SQLModel already describes the fields we need, but Moose expects plain Pydantic models that also capture CDC metadata. We mirror the SQLModel attributes and embed the shared CdcFields mixin so every payload includes lsn and is_deleted (apps/sqlmodel-example/app/main.py:36):

That mirroring step is the manual bridge: when a SQLModel field changes, the Moose-facing Pydantic class must change as well. (FastAPI continues to use the SQLModel types directly, but that's outside the Moose workflow.)

Step 3: Layer CDC metadata onto the Moose models

Moose needs a bit more intent than the OLTP ORM provides. Every OLAP record must be versioned (lsn) and carry a soft-delete flag (is_deleted) so ClickHouse’s ReplacingMergeTree engine can deduplicate correctly. We capture those rules in CdcFields and have each Moose model inherit from it.

The helper that converts SQLModel-shaped dicts into Moose payloads (apps/sqlmodel-example/app/main.py:195) is where the CDC plumbing happens:

Because the Moose models mirror SQLModel exactly (and just mix in CDC fields), Pydantic validation fails fast whenever the OLTP schema drifts. CDC metadata stays centralized, and Moose gets the deterministic columns it needs to manage ClickHouse tables.

Step 4: Declare ClickHouse tables with Moose

Once the payloads are typed, Moose can land them in ClickHouse with OLAP-aware defaults (apps/sqlmodel-example/app/main.py:109):

Repeating that pattern for products, orders, and order items gives you dimension and fact tables with deterministic deduplication. ReplacingMergeTree handles CDC replay, while the order_by_fields make ClickHouse merges predictable.

The same module wires Kafka streams to those tables and registers a dead-letter queue for unexpected tables (apps/sqlmodel-example/app/main.py:151), keeping the entire OLAP pipeline in Python.

Where TypeScript still wins

With TypeORM and Moose TypeScript, schema changes flow automatically: Moose can reuse the TypeORM-derived types because the TypeScript compiler surfaces rich type metadata at build time. Python erases most of that information once code is running, so Moose can’t read SQLModel classes the same way, and the safety net disappears:

No type reuse: Moose can't leverage SQLModel classes the way it reuses TypeORM-generated generics, so the OLAP models (app/main.py:36) must be kept in lockstep by hand.
Transform logic drifts easily: When you add or rename a column, the routing and payload shaping (app/main.py:195) need explicit edits, and there is no code generator to nudge you.
ClickHouse config stays bespoke: OlapTable declarations (app/main.py:109) repeat field names and CDC metadata. Forgetting to update them leads to runtime errors rather than compile-time hints.

If you're coming from the TypeScript example, expect a bumpier ride. Python gets you ergonomic OLTP models, but the OLAP crossover is still mostly manual.

Where automation fits

SQLModel already holds the schema metadata we need, but its metaclass bundles SQLAlchemy table machinery with Pydantic models, which makes the combined classes incompatible with Moose. We’ve been prototyping a generator that inspects the SQLModel metadata and emits plain Pydantic classes plus ClickHouse configs, but until that lands you should treat the Moose update as a checklist item whenever you touch src/db/models.py.

We'd love your input

These manual steps were more painful than we expected, so we'd love to compare notes.

Would SQLModel-to-Moose code generation make this workflow viable for your team? Tell us which pieces you'd want automated first, whether that is model mirroring, transformation scaffolding, ClickHouse table config, or something else entirely. And if you've seen Python-to-Python patterns (framework-to-framework sync, codegen, tests) that keep data models aligned, we'd be grateful if you could point us at the examples.

Just OLAP It (Python Edition): Derive Moose OLAP Models from SQLModel

TL;DR

Why SQLModel helps but OLAP still needs intent

Step 1: Define your OLTP schema with SQLModel

Step 2: Manually rewrite SQLModel classes into Moose Pydantic models

Step 3: Layer CDC metadata onto the Moose models

Step 4: Declare ClickHouse tables with Moose

Where TypeScript still wins

Where automation fits

We'd love your input

Interested in learning more?

Related posts

Analytical dashboards and AI chat: from parquet in S3 to deployed production app

OLAP migration complexity is the cost of fast reads

We value your privacy