Toward a Universal Catalog System
Over the last few weeks, something interesting crystallized while we were working on images, holidays, and discovery feeds in our projects:
We accidentally built a general-purpose catalog system.
Not just for products—but for anything that can be indexed, filtered, governed, and discovered: products, menus, images, documents, media, services, even AI-digested artifacts like PDFs and transcripts.
This post explains what that system is, why it’s different, and why we believe it scales better than existing catalog platforms like Shopify, Square, Magento, or Amazon.
The Core Idea: Catalogs as Relations, Not Tables
Most catalog systems start with a fixed schema:
- categories table
- products table
- attributes table
- join tables everywhere
- migrations forever
We take a different approach:
- Everything is a Stream
- Categories are Streams
- Attributes are Relations between streams
- Indexing is done via typed relations, not columns
This lets us express catalogs declaratively using relations like:
attribute/price=19.99
attribute/startDate=2025-01-01
attribute/confidence=0.82
Those relations are indexable, composable, queryable via ranges, attachable by templates, and enforceable by policy. No database schema migrations are required. The model also gives structure to the politics around product attributes: who may define them, require them, or change them.
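To make "indexable and queryable via ranges" concrete, here is a minimal sketch in plain PHP. It is not the framework's API: the function names (parseAttributeRelation, indexRelations, streamsInRange), the in-memory index, and the kettle and toaster streams are all hypothetical, chosen only to show how a relation string can be split into a type and a value and then range-filtered.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: split "attribute/price=19.99" into a relation type
 * ("attribute/price") and a value. Numeric values become floats so they can
 * be range-compared; everything else (such as ISO dates) stays a string.
 */
function parseAttributeRelation(string $relation): array
{
    [$type, $raw] = array_pad(explode('=', $relation, 2), 2, '');
    $value = is_numeric($raw) ? (float) $raw : $raw;
    return ['type' => $type, 'value' => $value];
}

/** Hypothetical in-memory index: relation type => [stream name => value]. */
function indexRelations(array $streams): array
{
    $index = [];
    foreach ($streams as $name => $relations) {
        foreach ($relations as $relation) {
            $parsed = parseAttributeRelation($relation);
            $index[$parsed['type']][$name] = $parsed['value'];
        }
    }
    return $index;
}

/** Names of streams whose value for $type lies in the closed range [$min, $max]. */
function streamsInRange(array $index, string $type, $min, $max): array
{
    $matches = [];
    foreach ($index[$type] ?? [] as $name => $value) {
        if ($value >= $min && $value <= $max) {
            $matches[] = $name;
        }
    }
    return $matches;
}

$index = indexRelations([
    'kettle'  => ['attribute/price=19.99', 'attribute/confidence=0.82'],
    'toaster' => ['attribute/price=34.50', 'attribute/startDate=2025-01-01'],
]);

// Only the kettle is priced between 0 and 25.
$cheap = streamsInRange($index, 'attribute/price', 0.0, 25.0);

// ISO dates compare correctly as plain strings; only the toaster has a startDate.
$early = streamsInRange($index, 'attribute/startDate', '2025-01-01', '2025-06-30');
```

Adding a new attribute here means adding a new key to the index, not a new column, which is the property the relation model relies on.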
Single Inheritance (on Purpose)
In our framework, a stream has one primary type.
Example:
Streams/product/electronics/phones/smartphones
This is intentional.
Why?
Because single inheritance gives us:
- deterministic indexing
- predictable query plans
- one-shot lookups by prefix
- admin ownership per level
Multiple inheritance is simulated via relations, not types.
So instead of:
“Ketchup is both a condiment and a vegetable”
We do:
product → relatesTo → category/condiments
product → relatesTo → category/vegetables
That distinction matters a lot for performance and governance.
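The difference between the two lookups can be sketched in a few lines of plain PHP. Everything here is an assumption made for illustration: the stream names, the "feature" and "grocery" type segments, and both function names are hypothetical. A real index would answer the prefix query with a single range scan rather than a loop.

```php
<?php
declare(strict_types=1);

// Hypothetical data: stream name => its single primary type.
$primaryType = [
    'pixel-9'       => 'Streams/product/electronics/phones/smartphones',
    'nokia-3310'    => 'Streams/product/electronics/phones/feature',
    'heinz-ketchup' => 'Streams/product/grocery/condiments',
];

// Hypothetical relatesTo edges: stream name => categories it relates to.
$relatesTo = [
    'heinz-ketchup' => ['category/condiments', 'category/vegetables'],
];

/** Streams whose primary type equals $prefix or sits anywhere below it. */
function findByTypePrefix(array $primaryType, string $prefix): array
{
    $prefix = rtrim($prefix, '/');
    $matches = [];
    foreach ($primaryType as $name => $type) {
        // Compare against "$prefix/" so that ".../phone" does not match ".../phones".
        if ($type === $prefix || strncmp($type, $prefix . '/', strlen($prefix) + 1) === 0) {
            $matches[] = $name;
        }
    }
    return $matches;
}

/** Streams related to $category, regardless of their primary type. */
function findByCategory(array $relatesTo, string $category): array
{
    $matches = [];
    foreach ($relatesTo as $name => $categories) {
        if (in_array($category, $categories, true)) {
            $matches[] = $name;
        }
    }
    return $matches;
}

// One type per stream: a single prefix finds every phone.
$phones = findByTypePrefix($primaryType, 'Streams/product/electronics/phones/');

// Many categories per stream: ketchup shows up under vegetables via a relation.
$vegetables = findByCategory($relatesTo, 'category/vegetables');
```

The primary type stays a single path, so the prefix lookup never has to consider more than one branch of the hierarchy per stream.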
Admin-Managed Prefixes = Decentralized Governance
Here’s where things get interesting.
Each prefix level can be managed by different admins:
Streams/product/
Streams/product/electronics/
Streams/product/electronics/phones/
Admins at each level can define:
- which attributes exist
- which attributes are required
- which attributes are optional
- how attributes are indexed
- how changes propagate
They do this by defining relation templates.
Example
An admin at:
Streams/product/electronics/
can say:
“All electronics should optionally have a wattage attribute.”
Later, a higher-level admin might say:
“Actually, wattage is now required for cross-market discovery. We should strive to become compatible with these other, larger marketplaces and aggregators.”
No migrations, no central authority, no breaking existing data. Just relations.
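As a rough sketch of how layered templates could resolve, assume each prefix maps attribute names to either 'optional' or 'required', and that 'required' at any governing level wins. The data layout and the function names (effectiveTemplate, missingRequired) are hypothetical and are not taken from the framework.

```php
<?php
declare(strict_types=1);

// Hypothetical relation templates keyed by admin-managed prefix.
// Each template maps an attribute name to 'optional' or 'required'.
$templates = [
    'Streams/product/'             => ['price' => 'required'],
    'Streams/product/electronics/' => ['wattage' => 'optional'],
];

/**
 * Merge every template whose prefix governs $type. In this sketch a rule can
 * only be tightened: 'required' at any level wins over 'optional'.
 */
function effectiveTemplate(array $templates, string $type): array
{
    $rules = [];
    foreach ($templates as $prefix => $attributes) {
        if (strncmp($type . '/', $prefix, strlen($prefix)) !== 0) {
            continue; // this prefix does not govern the stream
        }
        foreach ($attributes as $name => $rule) {
            if (($rules[$name] ?? 'optional') !== 'required') {
                $rules[$name] = $rule;
            }
        }
    }
    return $rules;
}

/** Required attributes that $attributes does not provide. */
function missingRequired(array $rules, array $attributes): array
{
    $missing = [];
    foreach ($rules as $name => $rule) {
        if ($rule === 'required' && !array_key_exists($name, $attributes)) {
            $missing[] = $name;
        }
    }
    return $missing;
}

$type  = 'Streams/product/electronics/phones/smartphones';
$phone = ['price' => 499.00];

// Wattage is optional, so nothing is missing yet.
$before = missingRequired(effectiveTemplate($templates, $type), $phone);

// A higher-level admin tightens the rule; no stored data is rewritten.
$templates['Streams/product/']['wattage'] = 'required';
$after = missingRequired(effectiveTemplate($templates, $type), $phone);
```

The sketch only reports the gap. What happens to a stream with a missing attribute is a policy decision, and the stored stream itself is never touched.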
Automatic Indexing via Attribute Relations
When a product (or image, or document) is created or updated:
- Attributes are written to the stream
- syncRelations() runs
- Attribute relations are added or removed automatically
- Indexes stay consistent
This means:
- Upload a product 4 levels down → attributes get indexed automatically
- Change a rule upstream → new items follow it
- Old items remain valid but less discoverable
This mirrors how real organizations work.
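The bookkeeping behind that sync step can be sketched as a diff between what a stream says now and what the index already holds. This is not the body of syncRelations(): planRelationSync is a hypothetical stand-in, and the way it receives the list of indexable attributes is an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for the sync bookkeeping: compare the attributes now
 * on a stream with the attribute relations already indexed, and report which
 * relations to add and which to remove.
 *
 * @param array $attributes attribute name => current value (as a string)
 * @param array $indexed    relation strings currently in the index
 * @param array $indexable  attribute names the governing templates index
 */
function planRelationSync(array $attributes, array $indexed, array $indexable): array
{
    $wanted = [];
    foreach ($indexable as $name) {
        if (array_key_exists($name, $attributes)) {
            $wanted[] = "attribute/$name=" . $attributes[$name];
        }
    }
    return [
        'add'    => array_values(array_diff($wanted, $indexed)),
        'remove' => array_values(array_diff($indexed, $wanted)),
    ];
}

$plan = planRelationSync(
    // The price changed, wattage is new, startDate was dropped,
    // and color is not indexed by any template.
    ['price' => '24.99', 'wattage' => '1500', 'color' => 'red'],
    ['attribute/price=19.99', 'attribute/startDate=2025-01-01'],
    ['price', 'wattage', 'startDate']
);
// $plan['add'] holds the new price and wattage relations;
// $plan['remove'] holds the stale price and startDate relations.
```

Running the same plan again after applying it yields empty add and remove lists, which is one way to read "indexes stay consistent".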
Querying Becomes Expressive (and Honest)
Here’s an actual example we’re already using:
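'criteria' => array(
    // Each range reads (min, includeMin, includeMax, max); null means unbounded.
    // startDate < $tomorrow: the item has already started
    array('attribute/startDate' => array(
        null, false, false, $tomorrow
    )),
    // endDate > $yesterday: the item has not ended yet
    array('attribute/endDate' => array(
        $yesterday, false, false, null
    )),
    // 0 <= obscene < 3.99
    array('attribute/obscene' => array(
        0, true, false, 3.99
    )),
    // 0 <= controversial < 5.99
    array('attribute/controversial' => array(
        0, true, false, 5.99
    )),
    // 0.60 <= confidence <= 10.00
    array('attribute/confidence' => array(
        0.60, true, true, 10.00
    ))
)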
Read that again.
It’s not SQL.
It’s not magic.
It’s policy expressed as data.
Each block:
- becomes a JOIN
- stays index-friendly
- scales linearly
- remains explainable to humans
Even 10 joins are fine here—they’re narrow, indexed, and deterministic.
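To show what "each block becomes a JOIN" can look like, here is a minimal sketch that turns range criteria into one self-join per criterion with bound parameters. It reads each range as (min, includeMin, includeMax, max). The table and column names (stream, relation, stream_name, type, value) and the function buildCriteriaQuery are hypothetical, and a real schema would need typed value columns so that dates and numbers are not compared in one column.

```php
<?php
declare(strict_types=1);

/**
 * Build one JOIN per criterion. Every comparison is a bound parameter,
 * so the query text depends only on the shape of the criteria.
 * Returns [sql, params].
 */
function buildCriteriaQuery(array $criteria): array
{
    $joins = [];
    $params = [];
    foreach ($criteria as $criterion) {
        foreach ($criterion as $type => [$min, $includeMin, $includeMax, $max]) {
            $alias = 'r' . count($joins); // r0, r1, ... one alias per join
            $on = ["$alias.stream_name = s.name", "$alias.type = ?"];
            $params[] = $type;
            if ($min !== null) {
                $on[] = "$alias.value " . ($includeMin ? '>=' : '>') . ' ?';
                $params[] = $min;
            }
            if ($max !== null) {
                $on[] = "$alias.value " . ($includeMax ? '<=' : '<') . ' ?';
                $params[] = $max;
            }
            $joins[] = "JOIN relation $alias ON " . implode(' AND ', $on);
        }
    }
    $sql = "SELECT s.name FROM stream s\n" . implode("\n", $joins);
    return [$sql, $params];
}

[$sql, $params] = buildCriteriaQuery([
    ['attribute/startDate'  => [null, false, false, '2025-03-02']],
    ['attribute/confidence' => [0.60, true, true, 10.00]],
]);

echo $sql, "\n";
```

Each join touches only rows of one relation type within one value range, which is why an index on (type, value) keeps every join narrow.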
Why This Works for Menus, Products, Media, and More
This model applies cleanly to:
Restaurant menus (Toast, Square)
- single inheritance by menu section
- modifiers as relations
- dietary flags as attributes
- seasonal availability via date ranges
E-commerce (Shopify, Magento)
- products as streams
- variants as streams
- attributes indexed via relations
- categories as admin-managed prefixes
Media catalogs
- images, PDFs, audio, video
- attributes extracted via LLMs
- discovery driven by confidence + relevance
- policy enforced at ingestion time
AI-digested content
- PDFs
- product manuals
- transcripts
- catalogs scraped from vendors
Anything that can be observed can be cataloged.
Where AI Fits (Naturally)
We already use LLMs to:
- extract attributes
- score confidence
- flag safety issues
- normalize keywords
- ingest multimodal data
But crucially:
- AI does not decide policy
- AI produces observations
- Humans define governance
- Relations enforce structure
This keeps the system deterministic and auditable.
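The split between observation and policy can be sketched as follows. The observation format, the minConfidence key, and the applyPolicy function are all hypothetical; the point is only that the model's output is data, and a human-defined threshold decides what gets indexed.

```php
<?php
declare(strict_types=1);

// Hypothetical observations, as an extraction model might emit them.
$observations = [
    ['attribute' => 'wattage', 'value' => '1500', 'confidence' => 0.91],
    ['attribute' => 'color',   'value' => 'red',  'confidence' => 0.42],
];

// Human-defined policy: minimum confidence before an observation is indexed.
$policy = ['minConfidence' => 0.60];

/**
 * Split observations into those the policy accepts and those it holds back.
 * The observations are never modified; only the policy decides the outcome,
 * so the same inputs always produce the same decision.
 */
function applyPolicy(array $observations, array $policy): array
{
    $result = ['indexed' => [], 'held' => []];
    foreach ($observations as $observation) {
        $accepted = $observation['confidence'] >= $policy['minConfidence'];
        $result[$accepted ? 'indexed' : 'held'][] = $observation['attribute'];
    }
    return $result;
}

$decision = applyPolicy($observations, $policy);
// wattage clears the threshold and is indexed; color is held back.
```

Changing the threshold changes the outcome without re-running the model, which is what makes the decision auditable.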
Toward a Decentralized Amazon
Put it all together and you get something powerful:
- organizations control their own catalogs
- categories evolve independently
- standards emerge voluntarily
- discovery works across orgs
- governance is layered, not centralized
Over time, this allows:
- guild-like coordination on standards
- shared attribute vocabularies
- cross-market discovery
- decentralized commerce
Not a DAO. Not a blockchain buzzword. Just actually good software architecture.
Why We’re Excited
This system we’re building:
- avoids schema lock-in
- avoids central bottlenecks
- avoids political gridlock
- scales with complexity
- works today, not “after a protocol”
We didn’t set out to build a universal catalog system. But as we kept building, things just… fell into place.
More to come.