# Test Strategy Walkthrough — Learn by Doing

## Before We Begin

**Diagnostic Question:** When you hear "test pyramid," what do you picture? Where would unit tests, integration tests, and e2e tests sit — and why does that shape matter?

**Checkpoint:** You have a rough mental model of the test pyramid (many unit tests at the base, fewer e2e at the top). You're ready to think about what belongs at each layer.

---

## Step 1: Identify Test Layers for a Feature

You're building a "Forgot Password" flow: user enters email → receives reset link → clicks link → sets new password → logs in.

<!-- hint:diagram mermaid-type="flowchart" topic="test pyramid layers" -->
<!-- hint:buttons type="single" prompt="Which layer verifies the full user flow?" options="Unit,Integration,E2E" -->

**Task:** For this flow, list one example of (a) a unit test, (b) an integration test, and (c) an e2e test. What does each layer verify that the others don't?

**Question:** Why would you want more unit tests than e2e tests for this flow? What happens if you only have e2e tests?

**Checkpoint:** User identifies: unit = email format validation, password hashing logic; integration = email service sending the reset link, token storage and retrieval; e2e = full click-through flow. User understands that unit tests give fast feedback, while e2e tests give more confidence but run slowly.
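
As a rough illustration of the unit layer, here is a minimal sketch of an email-format test. The `isValidEmail` helper and its deliberately loose regex are assumptions made for this example, not part of any real "Forgot Password" implementation.

```javascript
const { strictEqual } = require('node:assert');

// Hypothetical validator: a deliberately loose check, not a full RFC 5322 parser.
function isValidEmail(input) {
  if (typeof input !== 'string') return false;
  // Exactly one "@", no whitespace, and at least one dot in the domain part.
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(input.trim());
}

// Unit tests: no network and no database, so they run in milliseconds.
strictEqual(isValidEmail('ada@example.com'), true);
strictEqual(isValidEmail('  ada@example.com  '), true); // surrounding whitespace is trimmed
strictEqual(isValidEmail('ada@example'), false);        // no dot in the domain
strictEqual(isValidEmail('ada example@x.com'), false);  // whitespace inside the address
strictEqual(isValidEmail(''), false);
strictEqual(isValidEmail(null), false);
```

Nothing here proves that an email is actually delivered or that the reset link works. Those questions belong to the integration and e2e layers.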

---

## Step 2: What to Test vs What Not to Test

<!-- hint:card type="concept" title="Worth testing" -->

Consider this function:

```javascript
function formatCurrency(amount) {
  return new Intl.NumberFormat('en-US', {
    style: 'currency',
    currency: 'USD'
  }).format(amount);
}
```

**Task:** Should you test this? If yes, what would you assert? If no, why not? Now consider a function that computes tax: `function computeTax(subtotal, state) { ... }`. What would you test there?

**Question:** What makes one function "worth testing" and another "skip it"?

**Checkpoint:** User sees `formatCurrency` as a thin wrapper over `Intl`, so one smoke test is probably enough. `computeTax` has business logic (rates, edge cases), so it definitely needs tests. Worth testing = non-trivial logic, business rules, bug-prone code.
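
To make the contrast concrete, here is one possible sketch: a single smoke test for `formatCurrency`, next to several edge-case tests for `computeTax`. The tax rates, the integer-cents convention, and the error behavior are all assumptions invented for this example. The walkthrough leaves the body of `computeTax` unspecified.

```javascript
const { strictEqual, throws } = require('node:assert');

function formatCurrency(amount) {
  return new Intl.NumberFormat('en-US', {
    style: 'currency',
    currency: 'USD'
  }).format(amount);
}

// Hypothetical rates for illustration only; these are not real tax rates.
const TAX_RATES = new Map([['CA', 0.0725], ['OR', 0], ['TX', 0.0625]]);

// Assumption: amounts are integer cents, which avoids floating-point drift.
function computeTax(subtotalCents, state) {
  if (!Number.isInteger(subtotalCents) || subtotalCents < 0) {
    throw new RangeError('subtotal must be a non-negative integer number of cents');
  }
  if (!TAX_RATES.has(state)) {
    throw new RangeError(`unknown state: ${state}`);
  }
  return Math.round(subtotalCents * TAX_RATES.get(state));
}

// Thin wrapper: one smoke test confirms the wiring.
strictEqual(formatCurrency(1234.5), '$1,234.50');

// Business logic: cover rates, rounding, and invalid input.
strictEqual(computeTax(10000, 'CA'), 725);
strictEqual(computeTax(10000, 'OR'), 0);  // zero-rate state
strictEqual(computeTax(0, 'TX'), 0);      // empty cart
strictEqual(computeTax(999, 'TX'), 62);   // 62.4375 rounds down to 62
throws(() => computeTax(-1, 'CA'), RangeError);
throws(() => computeTax(100, 'ZZ'), RangeError);
```

The number of assertions follows the risk: the wrapper can only fail in ways `Intl` itself would fail, while each tax rule is a place where a bug could reach a customer.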

---

## Step 3: Choose a Test Double

<!-- hint:list style="cards" -->

You're testing a function `processOrder(order)` that calls `paymentGateway.charge()` and `inventoryService.reserve()`. You want to verify the order is saved correctly when payment succeeds.

**Task:** Would you use a stub, mock, or spy for `paymentGateway` and `inventoryService`? Why? What would you verify in your test?

**Question:** When does "verify it was called" matter vs "just make it return success"?

**Checkpoint:** User chooses a stub for both if they only care about order persistence (the stubs return fake success responses). A mock fits if they need to verify that `charge` was called with the correct amount, with the expectation set up before the call. A spy fits if they want to record calls and assert call count or arguments after the fact.
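
The difference is easier to see with hand-rolled doubles. This is a minimal sketch under several assumptions: `processOrder` receives its dependencies as a second argument, an `orderRepo` with a `save` method exists, and `charge` takes the order total. None of these details come from the scenario above.

```javascript
const { deepStrictEqual } = require('node:assert');

// Hypothetical implementation: dependencies are injected so tests can swap them.
async function processOrder(order, { paymentGateway, inventoryService, orderRepo }) {
  await inventoryService.reserve(order.items);
  const receipt = await paymentGateway.charge(order.total);
  const saved = { ...order, status: 'paid', paymentId: receipt.id };
  await orderRepo.save(saved);
  return saved;
}

// Stub: returns a canned success response and records nothing.
const inventoryStub = { reserve: async () => ({ reserved: true }) };

// Spy: returns a canned response and also records each call's arguments.
function createSpy(result) {
  const calls = [];
  const spy = async (...args) => {
    calls.push(args);
    return result;
  };
  spy.calls = calls;
  return spy;
}

async function persistsPaidOrder() {
  const charge = createSpy({ id: 'pay_123' });
  const savedOrders = [];
  const orderRepo = { save: async (o) => { savedOrders.push(o); } };
  const order = { id: 'o1', items: ['sku-1'], total: 4200 };

  await processOrder(order, {
    paymentGateway: { charge },
    inventoryService: inventoryStub,
    orderRepo,
  });

  // State-based check: the order was saved as paid.
  deepStrictEqual(savedOrders, [
    { id: 'o1', items: ['sku-1'], total: 4200, status: 'paid', paymentId: 'pay_123' },
  ]);
  // Interaction check: charge was called once, with the order total.
  deepStrictEqual(charge.calls, [[4200]]);
}

persistsPaidOrder().then(() => console.log('order test passed'));
```

If the test only cared about persistence, the interaction check on `charge.calls` could be dropped and a plain stub would do for the gateway as well.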

---

## Step 4: Design a Test Pyramid for Your Project

<!-- hint:diagram mermaid-type="flowchart" topic="test pyramid distribution" -->

Imagine a small API: user registration, login, and a "list my orders" endpoint. You have limited time.

**Task:** Sketch a test pyramid: how many tests at each layer? What would be your first 5 tests to write?

**Question:** If you could only have 3 tests total, which would you choose and why?

**Checkpoint:** User proposes a pyramid: many unit tests for validation, auth logic, and order filtering; some integration tests for database access; 1–2 e2e tests for critical flows. If limited to 3 tests: perhaps 1 e2e (login + list orders), 1 integration (auth), and 1 unit (password validation). The choice prioritizes the highest-risk paths.
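
One of the first unit tests might target order filtering, since a bug there could leak one user's orders to another. The sketch below assumes a pure helper named `ordersForUser` and an order shape with `userId` and `createdAt` fields. Both are hypothetical.

```javascript
const { deepStrictEqual } = require('node:assert');

// Hypothetical pure helper behind the "list my orders" endpoint.
function ordersForUser(orders, userId) {
  return orders
    .filter((order) => order.userId === userId)   // filter() returns a new array
    .sort((a, b) => b.createdAt - a.createdAt);   // newest first
}

const orders = [
  { id: 'a', userId: 1, createdAt: 100 },
  { id: 'b', userId: 2, createdAt: 200 },
  { id: 'c', userId: 1, createdAt: 300 },
];

// A user sees only their own orders, newest first.
deepStrictEqual(ordersForUser(orders, 1).map((o) => o.id), ['c', 'a']);
// A user with no orders gets an empty list, not an error.
deepStrictEqual(ordersForUser(orders, 3), []);
// The input array is left in its original order.
deepStrictEqual(orders.map((o) => o.id), ['a', 'b', 'c']);
```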

---

## Step 5: Avoid Implementation-Detail Assertions

A component fetches users and displays them. Someone writes: "Assert that `useEffect` was called twice."

**Task:** Why is that a bad assertion? What would you assert instead?

**Question:** How do you know if an assertion is testing implementation vs behavior?

**Checkpoint:** User recognizes that the `useEffect` call count is an implementation detail: a refactor (e.g., to a different hook) would break the test without changing behavior. Better: assert on rendered output (user names on screen) or that the correct API was called. Behavior = what the user sees or what the system does; implementation = how it's built.
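
A behavior-focused test can be sketched without any UI framework. Here `renderUserList` is a hypothetical stand-in for the component, and `/api/users` is an assumed endpoint. A real React component would typically be exercised through a rendering library instead.

```javascript
const { strictEqual, deepStrictEqual } = require('node:assert');

// Hypothetical framework-free stand-in for the component.
// HTML escaping is omitted to keep the sketch short.
async function renderUserList(fetchUsers) {
  const users = await fetchUsers('/api/users');
  if (users.length === 0) return '<p>No users found</p>';
  return `<ul>${users.map((u) => `<li>${u.name}</li>`).join('')}</ul>`;
}

async function showsUserNames() {
  const requestedUrls = [];
  const fakeFetch = async (url) => {
    requestedUrls.push(url);
    return [{ name: 'Ada' }, { name: 'Grace' }];
  };

  const html = await renderUserList(fakeFetch);

  // Behavior: what the user sees.
  strictEqual(html, '<ul><li>Ada</li><li>Grace</li></ul>');
  // Behavior: which API the system called.
  deepStrictEqual(requestedUrls, ['/api/users']);
  // Nothing here depends on how many internal hooks or helpers ran.
}

showsUserNames().then(() => console.log('user list test passed'));
```

A quick check for any assertion: if the internals were rewritten while the output stayed identical, would the test still pass? Both assertions above would.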

---

## Step 6: Place Tests in CI/CD

Your pipeline runs: lint → build → tests → deploy. Tests currently take 12 minutes (unit 2 min, integration 5 min, e2e 5 min).

**Task:** How would you structure the pipeline so developers get fast feedback? What would block a merge vs what could run after?

**Question:** Why might you run e2e only on main, not on every PR?

**Checkpoint:** User suggests running unit + integration tests on every PR (7 min) and e2e on merge to main or nightly. PRs get fast feedback on most issues, while e2e still catches cross-component bugs, just later and less often. This is a cost/speed vs. confidence tradeoff.
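
The timing tradeoff can be worked through as simple arithmetic. The stage durations below come from the scenario. The `blocksMerge` flag and the option of running the PR suites in parallel are assumptions added for illustration.

```javascript
const { strictEqual } = require('node:assert');

// Durations in minutes, taken from the scenario above.
const stages = [
  { name: 'unit', minutes: 2, blocksMerge: true },
  { name: 'integration', minutes: 5, blocksMerge: true },
  { name: 'e2e', minutes: 5, blocksMerge: false }, // assumed to run on main or nightly
];

function totalMinutes(list) {
  return list.reduce((sum, stage) => sum + stage.minutes, 0);
}

const blocking = stages.filter((stage) => stage.blocksMerge);

// Everything on every PR, run sequentially.
strictEqual(totalMinutes(stages), 12);

// Only unit + integration block the merge, run sequentially.
const prFeedbackMinutes = totalMinutes(blocking);
strictEqual(prFeedbackMinutes, 7);

// If the two PR suites can run in parallel, the slowest one sets the wait.
const parallelFeedbackMinutes = Math.max(...blocking.map((stage) => stage.minutes));
strictEqual(parallelFeedbackMinutes, 5);
```

Moving e2e out of the PR path cuts the wait from 12 to 7 minutes. Parallelizing the remaining suites could bring it down to about 5, at the cost of extra CI capacity.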