Back to Playbook
PlaybookDatabase

Schema Design

A schema is the blueprint for your data. It defines what information you store, how it's organized, and how different pieces connect. Get this right, and everything else becomes easier.

What is a Schema?

Think of a schema like the floor plan for a building. Before you build, you need to know where the rooms go, how big they are, and how they connect. A database schema defines:

  • Tables — The "rooms" that hold your data (users, posts, orders)
  • Columns — The "features" of each room (name, email, createdAt)
  • Types — What kind of data each column holds (text, number, date)
  • Relations — How rooms connect (a user has many posts)

Why This Matters

A well-designed schema makes your app fast and your code simple. A messy schema means constant workarounds, slow queries, and bugs. Spend time here—it pays off.

Core Concepts

Tables

Tables hold collections of similar items. Think of them as spreadsheets—each row is one item, each column is a property.

// users table

idnameemailcreated_at
1Alicealice@example.com2024-01-15
2Bobbob@example.com2024-01-16

Columns & Data Types

Each column has a type that defines what kind of data it can hold.

Common Types

  • text — Strings of any length
  • varchar(255) — String with max length
  • integer — Whole numbers
  • boolean — true/false
  • timestamp — Date and time
  • uuid — Unique identifier

Column Modifiers

  • NOT NULL — Required field
  • DEFAULT — Auto-fill value
  • UNIQUE — No duplicates
  • PRIMARY KEY — Row identifier

Primary Keys

Every table needs a primary key—a unique identifier for each row. Most common options:

Auto-increment ID

id SERIAL PRIMARY KEY

Simple, sequential. Good for most cases.

UUID

id UUID DEFAULT gen_random_uuid()

Random, unguessable. Better for public URLs.

Foreign Keys (Relations)

Foreign keys connect tables together. A user_id in a posts table points back to the users table.

// posts table with foreign key to users

CREATE TABLE posts (

id SERIAL PRIMARY KEY,

title TEXT NOT NULL,

content TEXT,

user_id INTEGER REFERENCES users(id),

created_at TIMESTAMP DEFAULT NOW()

);

Cascade Deletes

Add ON DELETE CASCADE to automatically delete posts when a user is deleted. Be careful with this!

Common Relation Types

One-to-Many

One user has many posts. One order has many items. The "many" side stores the foreign key.

users posts (posts.user_id → users.id)

One-to-One

One user has one profile. The foreign key goes on either side with a UNIQUE constraint.

users profiles (profiles.user_id UNIQUE → users.id)

Many-to-Many

Users can belong to many teams. Teams have many users. Requires a junction table.

users user_teams teams

// Junction table

CREATE TABLE user_teams (

user_id INTEGER REFERENCES users(id),

team_id INTEGER REFERENCES teams(id),

PRIMARY KEY (user_id, team_id)

);

Indexes (Speed Up Queries)

An index is like a book's table of contents. Instead of scanning every row, the database jumps directly to what it needs.

// Create an index on email for faster lookups

CREATE INDEX users_email_idx ON users(email);

// Composite index for queries filtering by both

CREATE INDEX posts_user_created_idx ON posts(user_id, created_at);

When to Index

Index columns you filter by (WHERE), sort by (ORDER BY), or join on. Don't over-index—each index slows down writes.

Common Pitfalls

Storing JSON blobs for everything

It's tempting to dump everything in a JSON column. But you can't index, query, or validate JSON fields easily.

Fix: Use proper columns for data you'll query. Reserve JSON for truly unstructured data like user preferences.

Missing indexes on foreign keys

Foreign keys aren't automatically indexed. Joins without indexes are slow.

Fix: Add an index for every foreign key column: CREATE INDEX posts_user_id_idx ON posts(user_id).

Over-normalization

Splitting everything into tiny tables means complex joins for simple queries.

Fix: It's okay to denormalize for read performance. Store author_name on posts if you always display it.

No timestamps

Forgetting created_at and updated_at means no audit trail.

Fix: Add both to every table. Set defaults: created_at TIMESTAMP DEFAULT NOW().

Ready to write queries?

Now that you understand schema design, learn Drizzle ORM for type-safe database access.

Drizzle ORM Guide →