# System Design — From Requirements to Architecture

## The System Design Interview Framework

A structured approach to designing scalable systems:

1. **Requirements** — Clarify functional (what) and non-functional (scale, latency, availability)
2. **Estimation** — Back-of-envelope (DAU, QPS, storage) to bound the problem
3. **High-level design** — Sketch boxes and arrows (API, services, data flow)
4. **Deep dive** — Choose 2–3 components to detail (DB schema, caching, bottlenecks)
5. **Bottlenecks** — Identify single points of failure, scale limits, mitigation

```mermaid
flowchart LR
    A[Requirements] --> B[Estimation]
    B --> C[High-Level Design]
    C --> D[Deep Dive]
    D --> E[Bottlenecks]
```
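
The estimation step can be sketched as a few lines of arithmetic. This is a minimal sketch, not a standard formula: the function name `estimate`, its parameters, the peak factor of 2, and the one-year retention are all illustrative assumptions.

```python
SECONDS_PER_DAY = 86_400


def estimate(dau, reads_per_user, writes_per_user, bytes_per_write,
             peak_factor=2, retention_days=365):
    """Rough traffic and storage estimate; every input is a guess to refine."""
    requests_per_day = dau * (reads_per_user + writes_per_user)
    avg_qps = requests_per_day / SECONDS_PER_DAY
    return {
        "avg_qps": avg_qps,
        # Assumed: peak traffic is a fixed multiple of the daily average.
        "peak_qps": avg_qps * peak_factor,
        # Assumed: every write stores bytes_per_write and is kept for retention_days.
        "storage_bytes": dau * writes_per_user * bytes_per_write * retention_days,
    }


if __name__ == "__main__":
    est = estimate(dau=1_000_000, reads_per_user=18, writes_per_user=2,
                   bytes_per_write=1_000)
    print(f"avg QPS:  {est['avg_qps']:.0f}")
    print(f"peak QPS: {est['peak_qps']:.0f}")
    print(f"storage:  {est['storage_bytes'] / 1e9:.0f} GB")
```

With these made-up inputs (1M DAU, 20 requests per user per day, 1 KB per write), the average load is about 231 QPS and a year of writes is about 730 GB. Numbers at that scale suggest a single primary database is still plausible, which is the kind of bound this step is meant to produce.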

## CAP Theorem

A distributed data store cannot guarantee all three of the following properties at once; the usual shorthand is "pick **two of three**":

| Property | Meaning |
|----------|---------|
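| **Consistency** | Every read returns the most recent write (or an error) |
| **Availability** | Every request to a working node receives a non-error response, though not necessarily the latest data |
| **Partition tolerance** | The system keeps operating when messages between nodes are lost or delayed |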

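In practice, network partitions happen, so partition tolerance is not optional. The real choice is how to behave during a partition: **CP** rejects or delays requests to stay consistent (e.g., banking), while **AP** keeps answering and may serve stale data (e.g., social feeds).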
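
The CP/AP choice can be illustrated with a toy replica. This is a deliberately simplified sketch: the `Replica` class, the `Unavailable` exception, and the `partitioned` flag are hypothetical names for illustration, and real systems detect partitions through timeouts and quorums rather than a boolean.

```python
class Unavailable(Exception):
    """Raised when a CP replica refuses to answer during a partition."""


class Replica:
    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.value = None
        self.partitioned = False  # True while cut off from the primary

    def replicate(self, value):
        """Apply a write sent by the primary; it is lost while partitioned."""
        if not self.partitioned:
            self.value = value

    def read(self):
        if self.partitioned and self.mode == "CP":
            # CP: give up availability rather than risk a stale answer.
            raise Unavailable("cannot confirm the latest write")
        # AP: always answer, even if the value may be stale.
        return self.value


if __name__ == "__main__":
    for mode in ("CP", "AP"):
        replica = Replica(mode)
        replica.replicate("v1")
        replica.partitioned = True
        replica.replicate("v2")  # never arrives
        try:
            print(mode, "read ->", replica.read())
        except Unavailable as exc:
            print(mode, "read -> unavailable:", exc)
```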

## Architecture: Typical Web App

```mermaid
flowchart TB
    subgraph Client
        C[Client / Browser]
    end
    subgraph Edge
        CDN[CDN / Static Assets]
    end
    subgraph App
        LB[Load Balancer]
        A1[App Server 1]
        A2[App Server 2]
        A3[App Server N]
    end
    subgraph Data
        Cache[Cache Redis/Memcached]
        DB[(Primary DB)]
        Replica[(Replica)]
        MQ[Message Queue]
    end
    C --> CDN
    C --> LB
    LB --> A1 & A2 & A3
    A1 & A2 & A3 --> Cache
    A1 & A2 & A3 --> DB
    DB --> Replica
    A1 & A2 & A3 --> MQ
```

## Load Balancing

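Distribute traffic across servers using **round-robin** (rotate through the servers in order), **least connections** (pick the server with the fewest active requests), or **consistent hashing** (route the same key to the same server, for stateful affinity).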

```nginx
# Example: nginx upstream block (goes in the http context); round-robin is the default method
upstream app_servers {
    server 10.0.0.1:3000;
    server 10.0.0.2:3000;
    server 10.0.0.3:3000;
}
```
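
The first two selection policies can be sketched in a few lines. This is an in-memory illustration only; the class names `RoundRobin` and `LeastConnections` and their methods are assumptions for the example, not an nginx API.

```python
import itertools


class RoundRobin:
    """Hand out servers in a fixed rotating order."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Pick the server with the fewest in-flight requests."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # Ties go to the server listed first (dicts keep insertion order).
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when the request finishes so the count stays accurate.
        self.active[server] -= 1


if __name__ == "__main__":
    servers = ["10.0.0.1:3000", "10.0.0.2:3000", "10.0.0.3:3000"]
    rr = RoundRobin(servers)
    print([rr.pick() for _ in range(4)])
    lc = LeastConnections(servers)
    first = lc.acquire()
    lc.acquire()
    lc.release(first)
    print(lc.acquire())
```

Round-robin needs no state about the servers, while least connections has to be told when each request completes. That bookkeeping is the price for handling uneven request durations better.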

## Caching Strategies

| Layer | Use Case | Example |
|-------|----------|---------|
| **CDN** | Static assets, edge caching | CloudFront, Cloudflare |
| **App cache** | Hot data, session | Redis, Memcached |
| **DB cache** | Cached results of expensive queries | Redis in front of the DB, materialized views |

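**Cache-aside:** The app checks the cache first; on a miss, it reads from the DB and populates the cache. **Write-through:** Every write updates the DB and the cache in the same operation, so reads stay fresh at the cost of slower writes.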
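
Both patterns can be sketched with plain dictionaries standing in for the database and the cache. The `CachedStore` class and its method names are hypothetical; a real deployment would use a client for Redis or Memcached and would also need TTLs and error handling.

```python
class CachedStore:
    def __init__(self, db):
        self.db = db        # dict standing in for the database
        self.cache = {}     # dict standing in for Redis/Memcached
        self.db_reads = 0   # counter to make cache hits visible

    def get(self, key):
        """Cache-aside read: try the cache, fall back to the DB, then populate."""
        if key in self.cache:
            return self.cache[key]
        self.db_reads += 1
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def put_write_through(self, key, value):
        """Write-through: update the DB and the cache together."""
        self.db[key] = value
        self.cache[key] = value

    def put_and_invalidate(self, key, value):
        """Cache-aside write: update the DB and drop the stale cache entry."""
        self.db[key] = value
        self.cache.pop(key, None)


if __name__ == "__main__":
    store = CachedStore({"user:1": "Ada"})
    store.get("user:1")  # miss, reads the DB
    store.get("user:1")  # hit, served from the cache
    print("DB reads:", store.db_reads)
```

The sketch is single-process. With several app servers, the gap between updating the DB and updating or invalidating the cache is where stale reads come from.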

## Database Choices

| Type | When to Use |
|------|-------------|
| **SQL** | ACID, relational, complex joins |
| **NoSQL (document)** | Flexible schema, high write throughput |
| **NoSQL (key-value)** | Simple lookups, session storage |
| **NoSQL (wide-column / columnar)** | Analytics, time-series |

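**Sharding:** Partition data by key (e.g., `user_id`) across DB instances so each instance holds a subset of the rows. **Replication:** Send writes to the primary and serve reads from replicas to scale read traffic; replicas can lag behind the primary.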
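
Routing by shard key can be sketched as a hash followed by a modulo. The function name `shard_for` and the choice of SHA-256 are assumptions for illustration; any stable hash works, but Python's built-in `hash()` is randomized per process for strings and is not suitable.

```python
import hashlib


def shard_for(user_id: str, num_shards: int) -> int:
    """Map a user_id to a shard index in [0, num_shards)."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then take the modulo.
    return int.from_bytes(digest[:8], "big") % num_shards


if __name__ == "__main__":
    counts = [0, 0, 0, 0]
    for i in range(10_000):
        counts[shard_for(f"user-{i}", 4)] += 1
    print(counts)  # roughly even spread across the four shards
```

Plain modulo routing remaps most keys whenever `num_shards` changes, which is the problem the Consistent Hashing section addresses.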

## Message Queues

Decouple producers and consumers with a broker such as **RabbitMQ**, **Kafka**, or **SQS**. Use queues for asynchronous processing, event sourcing, and buffering traffic spikes.
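
The producer/consumer decoupling can be sketched in-process with the standard library. This is only a stand-in: `queue.Queue` is not a network broker like the ones named above, and `run_pipeline`, `handler`, and `max_buffered` are hypothetical names for the example.

```python
import queue
import threading


def run_pipeline(jobs, handler, num_workers=2, max_buffered=100):
    """Push jobs through a bounded queue and process them on worker threads."""
    q = queue.Queue(maxsize=max_buffered)  # bounded: producers block when full
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            job = q.get()
            if job is None:  # sentinel: no more work for this worker
                q.task_done()
                return
            outcome = handler(job)
            with lock:
                results.append(outcome)
            q.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for job in jobs:          # producer side: enqueue and move on
        q.put(job)
    for _ in threads:         # one sentinel per worker
        q.put(None)
    q.join()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(sorted(run_pipeline(range(5), handler=lambda n: n * n)))
```

The bounded queue is what absorbs spikes: producers can run ahead of consumers by up to `max_buffered` jobs, then slow down instead of overwhelming the workers. Results arrive in completion order, not submission order.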

## Microservices vs Monolith

| Monolith | Microservices |
|----------|---------------|
| Single deployable unit | Independently deployable services |
| Simpler ops, shared DB | Per-service DB, network boundaries |
| Good for small teams | Scales org and team size |

Start with a monolith; extract services when boundaries are clear.

## Rate Limiting

Protect APIs with a **token bucket**, **sliding window**, or **fixed window** limiter. When a client exceeds its limit, return `429 Too Many Requests` with a `Retry-After` header.
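
A token bucket can be sketched as follows. The `TokenBucket` class, its parameters, and the injectable `clock` are assumptions for illustration; a production limiter would keep one bucket per client and usually store the state in a shared cache.

```python
import math
import time


class TokenBucket:
    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate              # tokens added per second
        self.capacity = capacity      # maximum burst size
        self.clock = clock            # injectable for testing
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        """Return (allowed, retry_after_seconds)."""
        now = self.clock()
        # Refill based on elapsed time, capped at the bucket capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True, 0
        # Whole seconds until one token is available, for a Retry-After header.
        return False, math.ceil((1 - self.tokens) / self.rate)


if __name__ == "__main__":
    bucket = TokenBucket(rate=5, capacity=10)
    for i in range(12):
        allowed, retry_after = bucket.allow()
        status = "200" if allowed else f"429 (Retry-After: {retry_after})"
        print(i, status)
```

The capacity sets how large a burst is tolerated and the rate sets the sustained limit, which is why this scheme handles bursty clients more gracefully than a fixed window.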

## Consistent Hashing

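Consistent hashing minimizes the number of keys that move when nodes are added or removed. Keys and nodes are hashed onto a ring, and each node owns the range of keys up to its position, so a membership change only affects the neighboring range. It is used in CDNs, distributed caches, and sharded databases.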
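
The ring can be sketched with a sorted list and binary search. The `HashRing` class is a hypothetical illustration; the use of SHA-256 and of 100 virtual nodes per server (a common refinement to even out the ranges) are assumptions, not something the text above prescribes.

```python
import bisect
import hashlib


def _position(key: str) -> int:
    """Stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes   # virtual nodes per server (assumed default)
        self._points = []      # sorted ring positions
        self._owner = {}       # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.vnodes):
            point = _position(f"{node}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def get(self, key):
        """Return the node owning key: the first point clockwise from its hash."""
        if not self._points:
            raise LookupError("ring is empty")
        i = bisect.bisect_right(self._points, _position(key)) % len(self._points)
        return self._owner[self._points[i]]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])
    keys = [f"user-{i}" for i in range(1000)]
    before = {k: ring.get(k) for k in keys}
    ring.add("node-d")
    moved = sum(1 for k in keys if ring.get(k) != before[k])
    print(f"{moved} of {len(keys)} keys moved after adding a node")
```

When a node joins, the only keys that move are the ones it takes over, so with four nodes roughly a quarter of the keys are expected to move rather than nearly all of them.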

## Microservices Layout

```mermaid
flowchart LR
    subgraph Gateway
        API[API Gateway]
    end
    subgraph Services
        S1[Auth Service]
        S2[User Service]
        S3[Order Service]
        S4[Payment Service]
    end
    subgraph Shared
        MQ[Message Bus]
        E[Event Store]
    end
    API --> S1 & S2 & S3 & S4
    S2 & S3 & S4 --> MQ
    MQ --> E
    S1 -.-> S2
    S3 --> S4
```

## Key Takeaways

- **Requirements first** — Don't over-engineer for scale you don't need
- **Bottlenecks** — Identify SPOFs and plan for failure
- **Cache wisely** — Invalidation is hard; design for it
- **Async where it helps** — Message queues for decoupling and buffering