Distributed Workloads
ParadeDB is designed to scale vertically on a single Postgres node with potentially many read replicas, and many production deployments comfortably operate in the 1–10TB range. The largest single ParadeDB database we’ve seen in production is 10TB. For datasets that significantly exceed this scale, ParadeDB supports partitioned tables and can be deployed in sharded Postgres configurations. If you’re working with very large datasets, please reach out to us. We’d be happy to provide guidance and share our roadmap for future distributed query support.
Join Support
ParadeDB supports all PostgreSQL JOINs:
- INNER JOIN
- LEFT / RIGHT / FULL OUTER JOIN
- CROSS JOIN
- LATERAL
- Semi and Anti JOINs
JOINs do incur some performance tradeoffs. See the joins guide for more details.
Covering Index
The BM25 index in ParadeDB is a covering index, which means it stores all indexed columns inside a single index per table. This decision is intentional: by colocating all the relevant data, ParadeDB optimizes for fast reads and boolean conditions. However, this means that all columns must be defined up front at index creation time. Adding or removing columns requires a REINDEX.
DDL Replication
A commonly known limitation of Postgres logical replication is that DDL (Data Definition Language) statements are not replicated. This includes operations like CREATE TABLE or CREATE INDEX.
If ParadeDB is running as a logical replica of a primary Postgres, DDL statements from the primary must be executed manually on the replica.
We recommend version-controlling your schema changes and applying them in a coordinated, repeatable way — either through a migration tool or deployment automation — to keep source and target databases in sync.
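As a rough illustration of the "coordinated, repeatable" approach, the sketch below applies one ordered list of migrations to two databases and records what has been applied, so re-running it is a no-op. It is only a sketch under stated assumptions: two in-memory SQLite databases stand in for the primary Postgres and the ParadeDB logical replica, and the migration names, the `schema_migrations` table, and the `apply_migrations` helper are all hypothetical. A real deployment would more likely use an existing migration tool and a Postgres driver.

```python
import sqlite3

# Hypothetical, ordered schema migrations kept under version control.
MIGRATIONS = [
    ("001_create_items",
     "CREATE TABLE items (id INTEGER PRIMARY KEY, description TEXT)"),
    ("002_index_description",
     "CREATE INDEX items_description_idx ON items (description)"),
]


def apply_migrations(conn, migrations):
    """Apply migrations not yet recorded on this connection; return their versions."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations (version TEXT PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_migrations")}
    newly_applied = []
    for version, ddl in migrations:
        if version in applied:
            continue  # already applied on this target, so skip it
        conn.execute(ddl)
        conn.execute("INSERT INTO schema_migrations (version) VALUES (?)", (version,))
        newly_applied.append(version)
    conn.commit()
    return newly_applied


def schema_objects(conn):
    """List user-defined tables and indexes, used to compare two targets."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type IN ('table', 'index') AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    )
    return [row[0] for row in rows]


# Assumption: SQLite stands in for the primary and the logical replica.
primary = sqlite3.connect(":memory:")
replica = sqlite3.connect(":memory:")

first_run = [apply_migrations(conn, MIGRATIONS) for conn in (primary, replica)]
second_run = [apply_migrations(conn, MIGRATIONS) for conn in (primary, replica)]
in_sync = schema_objects(primary) == schema_objects(replica)
```

Because every target receives the same ordered list and tracks its own applied versions, a replica that falls behind can be caught up by running the same script again.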