MetaHouse
DWG. MH-001 · Pre-release
The governed context layer

AI builds on top.
It can't dig through.

AI is only as good as the context you give it. MetaHouse is the metadata foundation your models stand on — catalogued, governed, and traceable — so AI works with what your organization actually knows instead of guessing at it.

[Diagram MH-001, Section A–A, governed. Above grade (what AI can see): AI · models · agents, which read, reason and answer. Beneath them, the MetaHouse layers: L1 Catalog · Metadata; L2 Access · Policy; L3 Semantics · Definitions; L4 Lineage · Provenance. At the bottom: raw data, ungoverned and unindexed. AI builds on top; it can't dig through.]

Section A–A · the context an AI stands on

Premise

Context is the input.

A model is only as good as the context you give it. Feed it a tangle of half-named tables and it returns confident nonsense. It can't reason about meaning it was never shown.

Constraint

It builds, it doesn't excavate.

AI works on top of what it's handed. It won't reach down through undocumented systems to recover the meaning your teams lost years ago. Hand it structure, not a haystack.

Thesis

Governance is the future.

Data governance stops being paperwork and becomes infrastructure — the layer that decides what AI is allowed to know and proves where every answer came from.

Open source · Built on ClickHouse

The open-source framework built on ClickHouse.

MetaHouse builds on top of ClickHouse — the same move it asks of every model downstream. A catalog isn't a side table in a transactional database; it's an analytical workload, and that's exactly what a column store is built for. Frankly, we're huge fans of ClickHouse — if you haven't used it yet, check it out.

Columnar · vectorized · ordered

Reads at catalog scale.

A real catalog grows to billions of rows: every column, table, lineage edge and profiling stat you track. ClickHouse's columnar, vectorized engine reads only the columns a query touches and uses the table's sort order to skip data it doesn't need, so search and lineage feel instant instead of overnight.

Temporal · MergeTree · point-in-time

Metadata is a timeline, not a snapshot.

MergeTree is built for append-heavy, time-ordered data. Schema drift, ownership changes and lineage stay queryable all the way back to day one — so you can ask what a field meant last quarter, not just today.
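To make "timeline, not a snapshot" concrete, here is a minimal sketch in plain Python. It is not MetaHouse's API or ClickHouse's storage engine; `FieldHistory`, `record` and `as_of` are hypothetical names, and the example definitions are made up. It only illustrates the idea: versions are appended in time order, never overwritten, and a point-in-time lookup finds the last version in effect on a given date.

```python
from bisect import bisect_right
from datetime import date
from typing import List, Optional


class FieldHistory:
    """Append-only timeline of definitions for one catalog field (hypothetical)."""

    def __init__(self) -> None:
        self._dates: List[date] = []
        self._definitions: List[str] = []

    def record(self, effective: date, definition: str) -> None:
        # Append-only: versions arrive in time order and nothing is overwritten.
        if self._dates and effective < self._dates[-1]:
            raise ValueError("history is append-only and time-ordered")
        self._dates.append(effective)
        self._definitions.append(definition)

    def as_of(self, when: date) -> Optional[str]:
        # Binary search for the last version effective on or before `when`.
        i = bisect_right(self._dates, when)
        return self._definitions[i - 1] if i else None


revenue = FieldHistory()
revenue.record(date(2024, 1, 1), "gross bookings")
revenue.record(date(2024, 7, 1), "net of refunds")

last_quarter = revenue.as_of(date(2024, 5, 15))     # definition in force in May
today = revenue.as_of(date(2024, 9, 1))             # definition in force in September
before_tracking = revenue.as_of(date(2023, 12, 31)) # no history yet
```

Because old versions are kept rather than updated in place, the question "what did this field mean last quarter?" is an ordinary lookup, not a restore from backup.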

Materialized views · streaming

Freshness computed on write.

Profiling, quality checks and rollups update through materialized views as metadata streams in: computed once when it lands, never re-crunched on every query. The catalog is current because the work is already done by the time anyone asks.
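The compute-on-write pattern can be sketched in a few lines of plain Python. This is an illustration of the idea only, not ClickHouse's materialized-view machinery or MetaHouse code; `ProfileRollup`, `ingest` and the `orders.email` column are hypothetical. Each incoming batch updates a running aggregate once, so reading a statistic never rescans the raw values.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ColumnStats:
    rows: int = 0
    nulls: int = 0


class ProfileRollup:
    """Running per-column null statistics, updated as batches arrive (hypothetical)."""

    def __init__(self) -> None:
        self._stats: Dict[str, ColumnStats] = defaultdict(ColumnStats)

    def ingest(self, column: str, values: List[Optional[str]]) -> None:
        # Work happens here, once per batch, at write time.
        stats = self._stats[column]
        stats.rows += len(values)
        stats.nulls += sum(v is None for v in values)

    def null_rate(self, column: str) -> float:
        # Reads touch only the small aggregate, never the raw batches.
        stats = self._stats.get(column)
        if stats is None or stats.rows == 0:
            return 0.0
        return stats.nulls / stats.rows


rollup = ProfileRollup()
rollup.ingest("orders.email", ["a@x.io", None, "b@x.io", None])
rollup.ingest("orders.email", ["c@x.io", "d@x.io", "e@x.io", "f@x.io"])

rate = rollup.null_rate("orders.email")  # 2 nulls out of 8 rows
```

The trade-off is that each write does a little extra work so that every later read does almost none.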

Open · apache-2.0 · self-host

Apache-2.0, all the way down.

No proprietary core, no vendor lock. Run MetaHouse on your own metal and audit governance down to the storage engine — because the layer everything builds on shouldn't be a black box.

Data governance is the future.

MetaHouse is being drawn up now.
Put your name on the plans.