Beginner → Intermediate30 min read· Topic 2.1

Relational databases (SQL)

ACID properties, transactions, indexes, query optimization, schema design, joins at scale

🗃️Key Takeaways

  • 1
    ACID guarantees (Atomicity, Consistency, Isolation, Durability) ensure reliable transactions
  • 2
    Indexes dramatically speed up reads but slow down writes — choose carefully
  • 3
    Normalization reduces redundancy; denormalization improves read performance
  • 4
    SQL databases scale vertically well but horizontal scaling (sharding) is complex

Why Relational Databases Dominate

Relational databases (PostgreSQL, MySQL, Oracle) have been the backbone of software systems for 40+ years. They store data in tables with predefined schemas, enforce relationships through foreign keys, and guarantee ACID transactions. When in doubt about your database choice, SQL is almost always a safe default.

In system design interviews, understanding SQL deeply is essential — roughly 50% of all design questions involve relational data modeling decisions.

ACID Properties Explained

A transaction is all-or-nothing. If a bank transfer debits account A but the credit to account B fails, the entire transaction rolls back. No partial state changes.

Implementation: Write-ahead log (WAL) records all changes before applying them. On failure, replay or rollback from the WAL.

A transaction moves the database from one valid state to another. All constraints (foreign keys, unique indexes, check constraints) must be satisfied.

Example: You can't insert an order referencing a product_id that doesn't exist if there's a foreign key constraint.

Concurrent transactions don't interfere with each other. Isolation levels trade off between correctness and performance:

READ UNCOMMITTED → READ COMMITTED → REPEATABLE READ → SERIALIZABLE. Most production systems use READ COMMITTED or REPEATABLE READ.

Once a transaction commits, its changes survive power failures and crashes. Ensured by flushing the WAL to disk before returning success to the client.

fsync() is the key system call — it forces data from OS buffers to persistent storage.

Index Types and When to Use Them
-- B-Tree Index (default, most common)
-- Best for: equality lookups, range queries, sorting
CREATE INDEX idx_users_email ON users(email);
SELECT * FROM users WHERE email = 'user@example.com';  -- O(log n)

-- Composite Index (multi-column)
-- Best for: queries filtering on multiple columns
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);
-- Follows leftmost prefix rule: works for (user_id) and (user_id, created_at)
-- Does NOT work for queries only on created_at

-- Hash Index (equality only, no range)
CREATE INDEX idx_users_email_hash ON users USING hash(email);

-- GIN Index (full-text search, JSONB, arrays)
CREATE INDEX idx_posts_content ON posts USING gin(to_tsvector('english', content));

-- Partial Index (index only matching rows)
CREATE INDEX idx_active_users ON users(email) WHERE active = true;

Popular SQL Databases

DatabaseBest ForMax ScaleKey Feature
PostgreSQLComplex queries, extensibility~50K QPS per nodeJSONB, extensions ecosystem
MySQLWeb applications, reads~50K QPS per nodeInnoDB, widely deployed
SQL ServerEnterprise, WindowsVery high with EnterpriseT-SQL, SSRS integration
CockroachDBDistributed SQLHorizontally scalableSpanner-inspired, serializable
AuroraAWS cloud-native5x MySQL throughputAuto-scaling storage, 6-way replication
When SQL is the Right Choice
Use SQL when you need: ACID transactions (payments, inventory), complex joins, strict schema enforcement, or you're not sure (SQL is the safe default). Avoid SQL when: you need >100K write QPS, your data is mostly key-value lookups, or your schema changes constantly.

Advantages

  • ACID guarantees prevent data corruption
  • Mature tooling and ecosystem
  • SQL is a universal query language
  • Complex joins and aggregations are built-in

Disadvantages

  • Horizontal scaling (sharding) is complex
  • Schema changes require migrations
  • Write-heavy workloads can be bottlenecked
  • Over-normalization leads to expensive joins

🧪 Test Your Understanding

Knowledge Check1/2

What does the 'I' in ACID stand for?