# AI-Assisted Test Generation Walkthrough — Learn by Doing

## Before We Begin

**Diagnostic Question:** When AI generates tests, what could go wrong? What would you still need to verify or add yourself?

**Checkpoint:** You recognize that AI can draft tests but may miss edge cases, use the wrong assertions, or assume a different framework. Human review is essential.

---

## Step 1: Provide Context for a Test Prompt

You want AI to generate tests for this function:

<!-- hint:code language="javascript" highlight="1,5" -->

```javascript
function parsePrice(input) {
  if (input == null || input === '') return null;
  const num = parseFloat(String(input).replace(/[^0-9.-]/g, ''));
  return isNaN(num) ? null : Math.round(num * 100) / 100;
}
```

**Task:** Write a prompt you'd send to Claude (or another AI) to generate unit tests. Include: the function, the framework (e.g., Vitest), what cases you want, and one example of a test style you prefer.

**Question:** What would happen if you only said "Generate tests for this function" without context?

**Checkpoint:** User includes: function code, Vitest, requested cases (happy, empty, null, invalid string, decimals), and maybe one example test. They understand minimal prompts produce generic output; context yields better tests.
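
One way to keep that context consistent is to assemble the prompt from named parts. The sketch below is only an illustration: `buildTestPrompt` and its field names are hypothetical, not part of any tool, and the wording of the prompt is one option among many.

```javascript
// Function under test, copied from Step 1.
function parsePrice(input) {
  if (input == null || input === '') return null;
  const num = parseFloat(String(input).replace(/[^0-9.-]/g, ''));
  return isNaN(num) ? null : Math.round(num * 100) / 100;
}

// Hypothetical helper: joins the pieces of context into one prompt string.
function buildTestPrompt({ source, framework, cases, styleExample }) {
  return [
    `Write unit tests in ${framework} for the function below.`,
    '',
    'Function under test:',
    source,
    '',
    'Cover these cases:',
    ...cases.map((c) => `- ${c}`),
    '',
    'Match the style of this example test:',
    styleExample,
  ].join('\n');
}

const prompt = buildTestPrompt({
  source: parsePrice.toString(), // the function's own source text
  framework: 'Vitest',
  cases: ['happy path', 'empty string', 'null', 'invalid string', 'decimals'],
  styleExample: [
    "test('returns null for empty string', () => {",
    "  expect(parsePrice('')).toBeNull();",
    '});',
  ].join('\n'),
});

console.log(prompt);
```

Dropping any one field (for example `framework`) shows quickly what the AI would have to guess.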

---

## Step 2: Review AI-Generated Tests

<!-- hint:card type="warning" title="AI test pitfalls" -->

Suppose AI produced:

```javascript
test('parses price', () => {
  const result = parsePrice('$19.99');
  expect(result).toBeDefined();
});
```

**Task:** Why is this test weak? Rewrite it (or describe what to change) so it actually verifies behavior.

**Question:** What's the difference between "the code ran" and "the code did the right thing"?

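**Checkpoint:** User identifies: `toBeDefined()` only checks that the result is not `undefined`, so the test would still pass if `parsePrice` returned `null` or the wrong number. It should assert the exact value, e.g. `expect(result).toBe(19.99)`. They understand assertions must verify expected output, not just "it didn't crash."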
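
The gap between the two assertions can be shown without a test runner. This is a rough sketch in plain JavaScript: `weakCheck` and `strongCheck` are hypothetical stand-ins that mimic what `toBeDefined()` and `toBe(19.99)` check, and `brokenParsePrice` is a deliberately wrong implementation added for the demonstration.

```javascript
// Function under test, copied from Step 1.
function parsePrice(input) {
  if (input == null || input === '') return null;
  const num = parseFloat(String(input).replace(/[^0-9.-]/g, ''));
  return isNaN(num) ? null : Math.round(num * 100) / 100;
}

// Hypothetical broken implementation: always returns null.
function brokenParsePrice(input) {
  return null;
}

// Mimics toBeDefined(): passes for anything except undefined.
const weakCheck = (fn) => fn('$19.99') !== undefined;

// Mimics toBe(19.99): passes only for the exact expected value.
const strongCheck = (fn) => Object.is(fn('$19.99'), 19.99);

console.log(weakCheck(parsePrice), weakCheck(brokenParsePrice));     // true true
console.log(strongCheck(parsePrice), strongCheck(brokenParsePrice)); // true false
```

The weak check cannot tell the working function from the broken one; the strong check can.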

---

## Step 3: Generate Tests with Claude

**Task:** Pick a small function from your codebase (or use `parsePrice` above). Use Claude to generate unit tests. Provide: the function, framework, and at least 3 case types you want. Run the tests. What passed? What did you have to fix or improve?

**Question:** What did you learn about prompting from the first attempt? What would you add to the prompt next time?

**Checkpoint:** User completes the task, runs the tests, and reflects. They note prompt additions that would improve the next output, such as "use `toBe` for numbers, `toBeNull` for null" or "include inputs containing `$` and `,`".
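
For `parsePrice`, a reviewed set of cases might end up looking like the table below. This is a sketch, not the output of any particular AI run: the inputs are chosen for illustration, and the loop is a plain-JavaScript stand-in for a test runner. In Vitest the same table could feed `test.each`.

```javascript
// Function under test, copied from Step 1.
function parsePrice(input) {
  if (input == null || input === '') return null;
  const num = parseFloat(String(input).replace(/[^0-9.-]/g, ''));
  return isNaN(num) ? null : Math.round(num * 100) / 100;
}

// Illustrative cases: happy path, formatting characters, and invalid input.
const cases = [
  { input: '$19.99', expected: 19.99 },
  { input: '$1,234.50', expected: 1234.5 },
  { input: '-5.5', expected: -5.5 },
  { input: 42, expected: 42 },
  { input: '', expected: null },
  { input: null, expected: null },
  { input: undefined, expected: null },
  { input: 'abc', expected: null },
];

// Collect every case whose actual result differs from the expected one.
const failures = cases.filter(
  ({ input, expected }) => !Object.is(parsePrice(input), expected)
);

console.log(failures.length === 0 ? 'all cases pass' : failures);
```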

---

## Step 4: Use AI for Edge Cases

<!-- hint:buttons type="single" prompt="Which edge case would you add first?" options="17 and 121,18.5,eighteen" -->

You have a function that validates a user's age (must be 18–120, inclusive). You've written happy path tests.

**Task:** Prompt AI: "Suggest 5 edge case inputs for a function that validates age 18–120. Include boundary and invalid cases." List what AI suggests. Then evaluate: which would you actually add to your test suite? Why or why not?

**Question:** When would you reject an AI-suggested edge case?

**Checkpoint:** User gets suggestions (e.g., 17, 18, 120, 121, -1, "eighteen", 18.5). They evaluate: 18 and 120 are must-haves because they sit on the boundaries; 17 and 121 test the values just outside them; -1 and "eighteen" test invalid input; 18.5 might be optional. They reject cases that don't match real usage (e.g., a decimal age if age is always an integer).
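
The suggested inputs can be written down as a table once you decide what the function should do with each one. The validator below is hypothetical: `isValidAge` is not given in this walkthrough, and the sketch assumes ages must be integers, which is why 18.5 is expected to be rejected.

```javascript
// Hypothetical validator, assuming age must be an integer from 18 to 120 inclusive.
function isValidAge(age) {
  return Number.isInteger(age) && age >= 18 && age <= 120;
}

const edgeCases = [
  { input: 17, expected: false },         // just below the lower bound
  { input: 18, expected: true },          // lower bound
  { input: 120, expected: true },         // upper bound
  { input: 121, expected: false },        // just above the upper bound
  { input: -1, expected: false },         // negative number
  { input: 'eighteen', expected: false }, // wrong type
  { input: 18.5, expected: false },       // non-integer, rejected under the assumption above
];

// Collect every case where the validator disagrees with the expectation.
const mismatches = edgeCases.filter(
  ({ input, expected }) => isValidAge(input) !== expected
);

console.log(mismatches.length === 0 ? 'all edge cases behave as expected' : mismatches);
```

If your real validator accepts decimals, the expectation for 18.5 flips, which is exactly the kind of decision the AI cannot make for you.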

---

## Step 5: AI for Test Data

You need to test an API that accepts `{ name, email, createdAt }`. You want 3 valid and 2 invalid payloads.

**Task:** Write a prompt for AI to generate these payloads. What specific requirements would you include? (e.g., email format, date format, invalid examples)

**Question:** Why might AI-generated test data still need your review?

**Checkpoint:** User writes a clear prompt with constraints (valid email, ISO date, name length). They note: AI-generated data may all pass validation yet miss real-world edge cases (Unicode names, SQL injection attempts); human review catches those gaps.
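
A result worth keeping might look like the following. Everything here is an assumption made for illustration: the validation rules in `isValidPayload` are hypothetical (a real API's rules may differ), and the names, addresses, and timestamps are made-up sample data.

```javascript
// Hypothetical rules: non-empty name, simple email shape, ISO 8601 UTC timestamp.
const EMAIL_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
const ISO_UTC_PATTERN = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z$/;

function isValidPayload({ name, email, createdAt }) {
  return (
    typeof name === 'string' && name.trim().length > 0 &&
    typeof email === 'string' && EMAIL_PATTERN.test(email) &&
    typeof createdAt === 'string' && ISO_UTC_PATTERN.test(createdAt) &&
    !Number.isNaN(Date.parse(createdAt))
  );
}

const validPayloads = [
  { name: 'Ada Lovelace', email: 'ada@example.com', createdAt: '2024-01-15T09:30:00Z' },
  // Unicode name added during review; AI output often sticks to ASCII.
  { name: 'Zoë Müller', email: 'zoe.muller@example.org', createdAt: '2023-12-31T23:59:59Z' },
  { name: "O'Brien", email: 'obrien+test@example.co.uk', createdAt: '2024-02-29T00:00:00.000Z' },
];

const invalidPayloads = [
  { name: 'No At Sign', email: 'not-an-email', createdAt: '2024-01-15T09:30:00Z' },
  { name: 'Bad Date', email: 'user@example.com', createdAt: 'yesterday' },
];

console.log(validPayloads.every(isValidPayload));  // true
console.log(invalidPayloads.some(isValidPayload)); // false
```

Checking the payloads against a validator, even a rough one, confirms that the "valid" and "invalid" labels are actually true before the data goes into a test suite.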

---

## Step 6: Iterate on a Failed Prompt

<!-- hint:celebrate -->

Your first prompt produced tests that used `jest.fn()` but your project uses Vitest (`vi.fn()`). The tests also didn't handle async.

**Task:** Write a refined prompt that would fix these issues. What would you add or change?

**Question:** How do you build a "prompt library" for test generation over time?

**Checkpoint:** User adds: "Use Vitest (`vi.fn()`, `vi.mock()`)" and "This function is async; use async/await in tests". They see value in saving successful prompts for reuse (framework, style, common patterns).
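
Alongside a better prompt, a small automated check can catch the wrong framework before you run anything. The sketch below is one possible approach, not a standard tool: `findJestLeftovers` is a hypothetical helper, and the `generated` string is made-up sample output.

```javascript
// Hypothetical review helper: lists Jest-only calls (jest.fn, jest.mock, ...)
// found in generated test source for a project that uses Vitest.
function findJestLeftovers(source) {
  const matches = source.match(/\bjest\.\w+/g) ?? [];
  return [...new Set(matches)]; // de-duplicate repeated calls
}

// Made-up example of AI output that assumed Jest.
const generated = `
test('saves the user', async () => {
  const save = jest.fn().mockResolvedValue({ id: 1 });
  await save({ name: 'Ada' });
  expect(save).toHaveBeenCalledTimes(1);
});
`;

console.log(findJestLeftovers(generated)); // [ 'jest.fn' ]

// Naive rename, shown only to demonstrate the check passing afterwards;
// jest and vi are not interchangeable in every case.
const renamed = generated.replace(/\bjest\./g, 'vi.');
console.log(findJestLeftovers(renamed)); // []
```

A check like this does not replace the refined prompt; it only tells you quickly when the prompt was ignored.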