Schema Design & Naming Conventions Guide (2026)

1Tell us where to send it
Your name and work email — nothing more.
2Check your inbox
Your guide arrives in seconds, not days.
3Use it with your team
Editable and ready to share — make it your own.

A peek inside

See exactly what you're getting

Free PDF

Spotsaas · 2026

Schema Design & Naming Conventions Guide

✓ Why conventions beat cleverness

✓ Naming conventions

✓ Per-table design checklist

✓ Decisions to make once, up front

Get the guide →

What Are Schema Design & Naming Conventions?

The Schema Design & Naming Conventions Guide is the set of decisions a senior data engineer locks in before the first table ships — naming, keys, types, and constraints — so the schema stays readable and refactorable at a hundred tables instead of ten. It is part rationale, part naming-conventions reference table, part per-table design checklist, and part 'decide these once, up front' Q&A. Adopt it once and every new table, column, and migration follows the same predictable shape, which is what keeps a growing schema from becoming a thicket of one-off choices.

The guide opens with why conventions beat cleverness: consistency compounds. When every table has a surrogate primary key, every foreign key is a real constraint, and every timestamp is UTC timestamptz, a developer can predict the shape of a table they have never seen. The per-table checklist makes that concrete — a BIGINT identity or UUID primary key never relying on a natural key alone, declared FK constraints rather than conventions, an index on every foreign-key column (the database does not add it for you), created_at and updated_at on every table, and the narrowest correct type for each column.

Where it earns its keep is the type and constraint discipline. NOT NULL by default, so nullability is a deliberate decision; CHECK constraints for enums and ranges; numeric or decimal for money but never float; text over varchar(n) unless a real limit exists; and domain-driven names — a column is total_amount_cents, not the label shown on a screen. The 'decide once' Q&A confronts the genuinely hard calls — UUID versus auto-increment BIGINT, integer cents versus NUMERIC for money, enum type versus lookup table versus CHECK — with the trade-offs spelled out so the team chooses deliberately and then stays consistent.

What Schema Design & Naming Conventions Are Used For

Teams use the conventions guide to standardize schema decisions before they fossilize, so a database stays coherent as it grows. It is a reference for new tables and a rubric for reviewing migrations. The concrete jobs it does:

✓ Establishing naming conventions — consistent casing, pluralization, and column naming so a developer can predict a table's shape from its name without reading the DDL.
✓ Mandating surrogate primary keys — a BIGINT identity or UUID on every table, never relying on a natural key alone, so the PK is stable even as business attributes change.
✓ Enforcing real foreign-key constraints and indexing every FK column — because the database enforces referential integrity only when the constraint is declared, and unindexed FKs cripple joins and cascade deletes.
✓ Standardizing timestamps — created_at and updated_at on every table, stored in UTC as timestamptz, with updated_at maintained by trigger or app.
✓ Choosing the narrowest correct type — integer keys over text, numeric/decimal for money (never float), text over varchar(n) unless a real limit exists — and defaulting columns to NOT NULL.
✓ Adding CHECK constraints for enums and ranges, and deciding deliberately between an enum type, a lookup table, or a CHECK for fixed value sets.
✓ Keeping names domain-driven rather than UI-driven — total_amount_cents, not the on-screen label — so the schema survives interface redesigns.

Who Uses Schema Design & Naming Conventions

Schema conventions are a team agreement, so the guide is written for everyone who creates, reviews, or evolves the data model. Its value is consistency, which only holds if the whole team follows it.

Data engineersThey define and enforce the conventions — naming, key strategy, type discipline — and use the guide as the standard against which new tables and migrations are reviewed.

Backend engineersThey create most of the day-to-day tables and columns, and the guide gives them a predictable template so they are not re-litigating UUID-versus-BIGINT or money types on every feature.

Database administrators (DBAs)They care that every foreign key is a real constraint and indexed, that types are narrow, and that NOT NULL and CHECK constraints protect integrity — all of which the guide mandates.

Tech leads and architectsThey make the 'decide once' calls — UUID versus BIGINT, integer cents versus NUMERIC, enum versus lookup table — and document them so the whole team stays consistent.

Code reviewersThey use the per-table checklist as a migration-review rubric, catching missing FK indexes, nullable-by-accident columns, and UI-driven names before they ship.

Schema Design & Naming Conventions: Context & Good to Know

The cost of inconsistent schema design is invisible early and crippling later. With ten tables, a few naming quirks and a missing FK index are tolerable; at a hundred tables, every inconsistency multiplies the cognitive load of working with the data, and every missing constraint is a latent integrity bug. This guide front-loads the decisions precisely because they are cheap to make before the first table ships and expensive to retrofit once a hundred tables and their migrations depend on the choice.

The primary-key question is the canonical 'decide once' call. BIGINT identity keys are smaller, faster to index, and naturally ordered for good locality; UUIDs hide row counts, allow client-side generation, and avoid cross-shard collisions but bloat indexes unless you use a time-ordered variant like UUIDv7 or ULID. There is no universally right answer, which is exactly why the guide insists the team pick one deliberately and apply it consistently rather than mixing strategies table by table.

Type discipline is where data integrity is quietly won. Storing money as FLOAT introduces rounding errors that surface as accounting discrepancies; defaulting columns to nullable makes every query defend against NULLs that should never exist; choosing varchar(n) with an arbitrary limit creates painful migrations when the limit proves wrong. The guide's defaults — NUMERIC or integer cents for money, NOT NULL by default, text over bounded varchar, CHECK constraints for ranges — encode hard-won lessons that apply across PostgreSQL, MySQL, and Oracle Database alike.

Spotsaas includes this guide in its database-management resources because schema quality is a durable, often underappreciated differentiator between teams — and the conventions translate across engines. Whether a team standardizes on PostgreSQL, MariaDB, or a document model in MongoDB, the underlying questions of naming, keys, types, and constraints recur, and a guide that answers them once helps a team keep its data model clean as the product and the table count grow.

✓ Independent · vendors can't pay to rank

Built on verified data, not vendor spin

Every Spotsaas resource draws on the SpotScore — a blend of verified review ratings, review volume, and feature depth across 89 database management software tools. Refreshed regularly; data as of June 2026.

FAQ

Questions, answered

Why use a surrogate primary key instead of a natural key?

A surrogate key — a BIGINT identity or a UUID — is stable even when business attributes change, whereas a natural key (an email, a SKU) can change and then cascade through every foreign key referencing it. Surrogate keys are also simpler to join on and index. The guide recommends never relying on a natural key alone as the primary key, while you can still add a unique constraint on the natural key for integrity.

Should you use UUID or auto-increment BIGINT for primary keys?

BIGINT is smaller, faster to index, and sequential, which gives good storage locality. UUID hides the row count, allows client-side generation, and avoids collisions across shards, but it bloats indexes unless you use a time-ordered variant like UUIDv7 or ULID. There is no universal winner — the guide's point is to choose deliberately based on your sharding and generation needs, then apply it consistently.

How should money be stored in a database?

Use either integer cents or NUMERIC/DECIMAL — never FLOAT or DOUBLE, which introduce rounding errors. NUMERIC(19,4) is self-documenting and supports fractional currencies; integer cents is faster but pushes scaling logic into the application. The guide's firm rule is simply that money must avoid floating-point types entirely.

Why must every foreign-key column be indexed?

The database enforces the FK constraint but does not automatically create an index on the referencing column. Without that index, joins on the foreign key and cascade deletes do full scans, which cripples performance as tables grow. The guide makes indexing every FK column a per-table requirement precisely because it is an easy step to forget and a costly one to omit.

What does NOT NULL by default mean?

It means you treat every column as NOT NULL unless there is a deliberate reason for it to allow NULL. Accidental nullability forces every query and every piece of application code to defend against NULLs that should never occur, hiding bugs. Making nullability a conscious decision — and adding CHECK constraints for enums and ranges — keeps the schema's intent explicit and its data valid.

Should fixed value sets use an enum, a lookup table, or a CHECK constraint?

A lookup table is the most flexible — you can add values without a migration and attach metadata. A native ENUM is fastest but painful to alter. A CHECK constraint is simplest for truly stable sets like status flags. The guide frames it as a deliberate trade-off: pick based on how often the set changes and whether you need metadata, not by default.

Why store timestamps in UTC as timestamptz?

Storing created_at and updated_at in UTC as timestamptz avoids timezone ambiguity and the bugs that come from mixing local times across regions and daylight-saving changes. Putting these two columns on every table, with updated_at maintained by a trigger or the application, gives you consistent auditability and makes time-based queries reliable.

What is an example of a database software these conventions apply to?

The conventions apply across relational engines — PostgreSQL, MySQL, MariaDB, and Oracle Database — where naming, surrogate keys, FK constraints, type choice, and CHECK constraints all behave similarly. The underlying questions also translate to document stores like MongoDB, even though the mechanisms differ. The guide is engine-agnostic in principle and PostgreSQL-leaning in its specific examples like timestamptz.

Why name columns for the domain instead of the UI?

A column named for the screen label it currently feeds — say balance_label — breaks conceptually the moment the interface is redesigned, and it obscures the column's real meaning. Naming for the domain — total_amount_cents — keeps the schema stable across UI changes and makes its intent clear to anyone querying it directly, independent of any particular interface.

Keep exploring