# System Design Exercises

## Exercise 1: Design a News Feed

**Task:** Design a news feed (e.g., Twitter-like) for 10M users. Document: (1) functional and non-functional requirements, (2) data model (users, posts, follows), (3) high-level architecture with cache and DB, (4) how you'd generate "feed for user X" (fan-out on write vs read).

**Validation:**
- [ ] Requirements are clear and bounded
- [ ] Data model supports follows and posts
- [ ] Architecture includes load balancer, app tier, cache, DB
- [ ] Tradeoffs between fan-out on write vs read are explained

**Hints:**
1. Fan-out on write: precompute feeds when a post is created; fast reads, slower writes
2. Fan-out on read: fetch from followed users at read time; simpler writes, heavier reads
3. Hybrid: fan-out for normal users, on-read for celebrities

---

## Exercise 2: Add Caching to an API

**Task:** You have a REST API: `GET /users/:id` returns user profile. Add a cache layer. Define: (1) cache key format, (2) TTL, (3) invalidation strategy when user updates profile, (4) what to do on cache miss.

**Validation:**
- [ ] Cache key is deterministic and unique per user
- [ ] TTL chosen with reasoning (e.g., 5 min vs 1 hour)
- [ ] Invalidation on write is specified (invalidate key, or write-through)
- [ ] Cache-aside flow on miss is correct

**Hints:**
1. Key: `user:{id}` or `user:profile:{id}`
2. Invalidation: delete key on PUT/PATCH, or use write-through
3. Cache stampede: consider locking or probabilistic early refresh

---

## Exercise 3: Shard a Database

**Task:** You have a `orders` table with 1B rows. Design a sharding strategy. Specify: (1) shard key, (2) number of shards, (3) how to route a query for "orders by user_id", (4) how to handle "orders by date range" (cross-shard query).

**Validation:**
- [ ] Shard key chosen (user_id typical for orders)
- [ ] Routing is deterministic (e.g., hash(user_id) % N)
- [ ] Cross-shard query strategy described (query each shard, merge; or use a separate analytics DB)

**Hints:**
1. user_id as shard key: all orders for a user on same shard
2. date range: either scatter-gather or maintain a read replica for analytics
3. Consistent hashing helps when adding/removing shards

---

## Exercise 4: Message Queue for Async Jobs

**Task:** Design a "send email" flow: user signs up → send welcome email. Use a message queue. Draw the flow: API → queue → worker. What happens if the worker fails mid-processing? How do you ensure at-least-once delivery?

**Validation:**
- [ ] Flow is API → enqueue → worker consumes
- [ ] Failure handling: retry, dead-letter queue
- [ ] At-least-once: acknowledge only after successful send; redeliver on timeout
- [ ] Idempotency considered (don't send duplicate emails)

**Hints:**
1. Worker acks after email sent; if crash before ack, message redelivered
2. Idempotency: store sent_email_ids or use idempotency key
3. Dead-letter queue for messages that fail repeatedly

---

## Exercise 5: Rate Limiting Design

**Task:** Implement a simple rate limiter: max 100 requests per minute per API key. Describe the data structure and algorithm. What would you use in Redis? Write pseudocode for "is request allowed?".

**Validation:**
- [ ] Algorithm chosen (fixed window, sliding window, or token bucket)
- [ ] Data structure supports per-key counters
- [ ] Redis INCR + EXPIRE or similar is correctly described
- [ ] Edge cases: what if the clock is at minute boundary?

**Hints:**
1. Fixed window: `KEY = "ratelimit:{api_key}:{minute}"`, INCR, EXPIRE 60
2. Sliding window: sorted set, remove expired, count, add current
3. Token bucket: more complex but smoother limiting
