# System Design Walkthrough — Learn by Doing

## Before We Begin

<!-- hint:slides topic="System design process: requirements, estimation, high-level architecture, database choice, scaling, and CAP theorem" slides="6" -->

**Diagnostic Question:** When someone says "design a system," what do you think they mean first — the boxes and arrows, the data model, the APIs, or something else? Why might different people answer differently?

**Checkpoint:** You recognize that system design starts with understanding the problem before drawing anything. The "right" starting point depends on context.

---

## Step 1: Clarify Requirements

<!-- hint:progress -->

**Task:** You're asked to "design a URL shortener." Write down 3 functional and 3 non-functional requirements you'd ask the interviewer to clarify.

**Question:** Why does it matter whether short codes are 6 or 8 characters long? How does the read-to-write ratio shape the design?

**Checkpoint:** You have a clear list of assumptions (scale, TTL, analytics, etc.).
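
One way to reason about code length is to count how many distinct codes each length allows. The sketch below assumes a base62 alphabet (digits plus upper- and lower-case letters), which the exercise does not specify; `keyspace` and `encode` are hypothetical helper names.

```python
import string

# Assumed alphabet: 0-9, a-z, A-Z (62 symbols). The exercise does not fix one.
ALPHABET = string.digits + string.ascii_letters


def keyspace(length: int, alphabet_size: int = len(ALPHABET)) -> int:
    """Number of distinct codes of exactly `length` characters."""
    return alphabet_size ** length


def encode(n: int) -> str:
    """Encode a non-negative integer ID as a base62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, remainder = divmod(n, len(ALPHABET))
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


print(f"6 chars: {keyspace(6):,} codes")
print(f"8 chars: {keyspace(8):,} codes")
```

Six characters give roughly 56.8 billion codes and eight give roughly 218 trillion, so the choice depends on how many links you expect to store and how sparse you want the keyspace to be.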

---

## Step 2: Back-of-Envelope Estimation

**Task:** Assume 100M new URLs are created per month. Estimate: (a) writes per second, (b) reads per second (assume a 10:1 read:write ratio), (c) total storage after 5 years (assume 50 bytes per record).

**Question:** What rounding and approximations did you use? How accurate do estimates need to be at this stage?

**Checkpoint:** You have rough QPS and storage numbers that bound the problem.
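
If you want to check your arithmetic, the estimate can be worked through in a few lines. This sketch uses the numbers from the task and assumes a 30-day month and decimal gigabytes; both are simplifications chosen here, not part of the exercise.

```python
# Inputs taken from the task statement.
urls_per_month = 100_000_000
read_write_ratio = 10
bytes_per_record = 50
years = 5

# Assumption: a 30-day month, which is close enough for a rough estimate.
SECONDS_PER_MONTH = 30 * 24 * 3600  # 2,592,000

writes_per_sec = urls_per_month / SECONDS_PER_MONTH
reads_per_sec = writes_per_sec * read_write_ratio
total_records = urls_per_month * 12 * years
storage_gb = total_records * bytes_per_record / 1e9  # decimal GB

print(f"writes/s ~ {writes_per_sec:.0f}")
print(f"reads/s  ~ {reads_per_sec:.0f}")
print(f"records  = {total_records:,}")
print(f"storage  ~ {storage_gb:.0f} GB")
```

That works out to about 40 writes per second, about 400 reads per second, and about 300 GB over 5 years. These are averages; peak traffic can be several times higher, which is one reason order-of-magnitude accuracy is usually enough at this stage.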

---

## Step 3: High-Level Diagram

<!-- hint:diagram mermaid-type="flowchart" topic="URL shortener architecture" -->

**Embed:** https://excalidraw.com/

**Task:** Draw a high-level architecture for the URL shortener: client, API, app servers, cache, database. Include the flow for "create short URL" and "resolve short URL."

**Question:** Where would a cache sit, and what keys would it use? What happens on a cache miss?

**Checkpoint:** You have a boxes-and-arrows diagram with clear data flow.
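
The "resolve short URL" flow with a cache in front of the database is often built as a cache-aside read. The sketch below is one possible shape, not the only one: plain dictionaries stand in for the cache and the database, and `UrlStore` and its fields are hypothetical names.

```python
from typing import Dict, Optional


class UrlStore:
    """Toy resolve path: check the cache first, fall back to the database."""

    def __init__(self, database: Dict[str, str]) -> None:
        self.database = database          # stand-in for the real database
        self.cache: Dict[str, str] = {}   # stand-in for Redis/Memcached, keyed by short code
        self.db_reads = 0                 # counts how often we fall through to the database

    def resolve(self, short_code: str) -> Optional[str]:
        long_url = self.cache.get(short_code)
        if long_url is not None:
            return long_url               # cache hit: no database work

        # Cache miss: read from the database and populate the cache.
        self.db_reads += 1
        long_url = self.database.get(short_code)
        if long_url is not None:
            self.cache[short_code] = long_url
        return long_url                   # None means "no such code" (HTTP 404)


store = UrlStore({"abc123": "https://example.com/a/very/long/path"})
store.resolve("abc123")   # miss: reads the database, fills the cache
store.resolve("abc123")   # hit: served from the cache
print(store.db_reads)
```

Here the cache key is the short code and the value is the long URL. A real cache would also need a TTL or eviction policy, which this sketch leaves out.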

---

## Step 4: Choose a Database

<!-- hint:buttons type="single" prompt="Which database would you choose for a URL shortener?" options="SQL for consistency,NoSQL for scale,Depends on requirements" -->

**Task:** Would you use SQL or NoSQL for the URL shortener? Justify. Consider: schema, scale, consistency needs, operational simplicity.

**Question:** What if the requirement was "we need to list all URLs created by user X"? Does that change your choice?

**Checkpoint:** You can articulate tradeoffs between SQL and NoSQL for this use case.
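
To make the "list all URLs created by user X" question concrete, here is a minimal relational sketch using an in-memory SQLite database. The table name, columns, and index are assumptions made for illustration; a key-value store would need a separate secondary index or a second table to answer the same query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per short URL, owned by a user.
conn.execute(
    """
    CREATE TABLE urls (
        short_code TEXT PRIMARY KEY,
        long_url   TEXT NOT NULL,
        user_id    INTEGER NOT NULL,
        created_at TEXT NOT NULL
    )
    """
)
# A secondary index makes the per-user listing cheap.
conn.execute("CREATE INDEX idx_urls_user ON urls (user_id, created_at)")

conn.executemany(
    "INSERT INTO urls VALUES (?, ?, ?, ?)",
    [
        ("a1", "https://example.com/one", 1, "2024-01-01"),
        ("b2", "https://example.com/two", 1, "2024-02-01"),
        ("c3", "https://example.com/three", 2, "2024-03-01"),
    ],
)


def urls_for_user(connection: sqlite3.Connection, user_id: int) -> list:
    """Return a user's short codes, newest first."""
    rows = connection.execute(
        "SELECT short_code FROM urls WHERE user_id = ? ORDER BY created_at DESC",
        (user_id,),
    )
    return [row[0] for row in rows]


print(urls_for_user(conn, 1))
```

The redirect path only needs a lookup by `short_code`, which either kind of database handles well. The per-user listing is what introduces a second access pattern.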

---

## Step 5: Handle Scale and Bottlenecks

**Task:** Consider two likely bottlenecks: (1) database write throughput, and (2) a cache stampede when a viral link gets massive traffic. Propose a mitigation for each.

**Question:** What is a cache stampede? How would you prevent one (e.g., per-key locking or probabilistic early expiration)?

**Checkpoint:** You can name concrete techniques (sharding, queuing, cache locking) for each bottleneck.
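
Cache locking can be sketched in-process with one lock per key: the first request that misses loads the value, and concurrent requests for the same key wait and then reuse it. This is a single-process illustration under assumed names (`SingleFlightCache`, `slow_db_lookup`); a distributed cache would need a distributed lock or a similar mechanism, and the sketch omits expiry.

```python
import threading
import time
from typing import Callable, Dict, List


class SingleFlightCache:
    """Lets only one caller per key run the expensive loader on a miss."""

    def __init__(self, loader: Callable[[str], str]) -> None:
        self._loader = loader
        self._values: Dict[str, str] = {}
        self._key_locks: Dict[str, threading.Lock] = {}
        self._guard = threading.Lock()  # protects the lock table itself

    def _lock_for(self, key: str) -> threading.Lock:
        with self._guard:
            return self._key_locks.setdefault(key, threading.Lock())

    def get(self, key: str) -> str:
        value = self._values.get(key)
        if value is not None:
            return value                      # fast path: already cached
        with self._lock_for(key):
            value = self._values.get(key)     # re-check: another thread may have loaded it
            if value is None:
                value = self._loader(key)
                self._values[key] = value
            return value


load_calls: List[str] = []


def slow_db_lookup(short_code: str) -> str:
    time.sleep(0.05)                          # simulate a slow database read
    load_calls.append(short_code)
    return "https://example.com/" + short_code


cache = SingleFlightCache(slow_db_lookup)
threads = [threading.Thread(target=cache.get, args=("viral",)) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(load_calls))
```

With 20 threads requesting the same cold key, the loader runs once and the script prints `1`. Without the per-key lock, each thread that missed would hit the database.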

---

## Step 6: CAP and Consistency

<!-- hint:card type="concept" title="CAP Theorem" -->
<!-- hint:celebrate -->

**Task:** For the URL shortener, would you choose CP or AP? Why? What would "eventual consistency" mean for a redirect?

**Question:** Can a user tolerate seeing a 404 for a few seconds after creating a link? Would the same delay be acceptable for a bank transfer?

**Checkpoint:** You can explain why this system likely chooses AP and what tradeoffs that implies.

---

## Step 7: Rate Limiting

**Task:** Design a rate limiter for the "create short URL" endpoint that allows 10 requests per minute per IP. Describe the algorithm (fixed window, sliding window, or token bucket) and which response headers to return.

**Question:** What happens if a user hits the limit? Should you return 429 immediately or queue the request?

**Checkpoint:** You have a concrete rate-limiting strategy and response format.
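
As one possible answer, here is a minimal in-memory token bucket for the 10 requests per minute per IP limit. It is a sketch under assumptions: state lives in a single process, `TokenBucketLimiter` and `check` are hypothetical names, and the `X-RateLimit-*` headers are a widely used convention rather than a standard (`Retry-After` and the 429 status code are standard HTTP).

```python
import math
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """Per-IP token bucket: `capacity` requests per `period_sec`, refilled continuously."""

    def __init__(
        self,
        capacity: int = 10,
        period_sec: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity
        self.period_sec = period_sec
        self.clock = clock
        # ip -> (tokens left, time of last update)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def check(self, ip: str) -> Tuple[int, Dict[str, str]]:
        """Return an HTTP status (200 or 429) and rate-limit headers for this request."""
        now = self.clock()
        tokens, last = self._buckets.get(ip, (float(self.capacity), now))
        # Refill in proportion to elapsed time, never above capacity.
        refill = (now - last) * self.capacity / self.period_sec
        tokens = min(float(self.capacity), tokens + refill)

        headers = {"X-RateLimit-Limit": str(self.capacity)}
        if tokens >= 1:
            tokens -= 1
            status = 200
        else:
            status = 429
            # Seconds until one whole token is available again.
            wait = (1 - tokens) * self.period_sec / self.capacity
            headers["Retry-After"] = str(math.ceil(wait))
        headers["X-RateLimit-Remaining"] = str(int(tokens))

        self._buckets[ip] = (tokens, now)
        return status, headers


# Usage with a fake clock so the behaviour is deterministic.
fake_now = [0.0]
limiter = TokenBucketLimiter(clock=lambda: fake_now[0])
statuses = [limiter.check("203.0.113.7")[0] for _ in range(11)]
print(statuses.count(200), statuses.count(429))
```

In this sketch the eleventh request within the same instant is rejected immediately with 429 rather than queued, which keeps the endpoint simple and pushes the retry decision to the client. A token bucket permits short bursts up to the bucket size; a fixed or sliding window would trade that off differently.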