---
name: Systematic Debugging
description: Use this skill when encountering bugs, errors, or unexpected behavior. Enforces a 4-phase root cause analysis process instead of random fix attempts. Invoke when something breaks.
version: 1.0.0
estimated_tokens: 300
---

# Systematic Debugging

> Stop guessing. Start diagnosing. If 3+ fixes fail, question the architecture.

> Adapted from obra/superpowers systematic debugging discipline.

---

## When To Use This Skill

- Any error, exception, or unexpected behavior
- Tests failing for unknown reasons
- Production incidents
- "It works on my machine" situations
- Performance issues with unclear cause

## When NOT To Use This Skill

- Typos or obvious syntax errors (just fix them)
- Known issues with documented solutions
- Feature requests disguised as bugs

---

## The 4-Phase Debugging Process

```
Phase 1: OBSERVE     -- What exactly is happening?
    |
Phase 2: HYPOTHESIZE -- What could cause this?
    |
Phase 3: TEST        -- Prove or disprove each hypothesis
    |
Phase 4: FIX         -- Apply the verified fix
```

**Critical Rule: Do NOT skip to Phase 4.** The most common debugging mistake is jumping straight to "fixing" before understanding the root cause. This creates new bugs.

---

## Phase 1: OBSERVE

Gather facts. No assumptions. No theories yet.

### Required Information

```
1. EXACT error message (copy-paste, don't paraphrase)
2. Stack trace (full, not truncated)
3. When did it start? (What changed?)
4. Reproduction steps (minimum path to trigger)
5. Environment details (OS, runtime version, config)
6. What works? (Narrow the scope)
```

### Observation Commands

```bash
# Read the actual error (don't paraphrase)
cat logs/error.log | tail -50

# Check recent changes
git log --oneline -10
git diff HEAD~3

# Check environment
node --version
npm list --depth=0

# Check process state
ps aux | grep node
lsof -i :3000
```

### Observation Rules

| Do | Don't |
|----|-------|
| Copy the exact error message | Paraphrase "something about authentication" |
| Note the full stack trace | Only look at the first line |
| Check git log for recent changes | Assume nothing changed |
| Reproduce the bug yourself | Trust secondhand reports |
| Note what DOES work | Only focus on what's broken |

---

## Phase 2: HYPOTHESIZE

Generate ranked hypotheses based on observations.

### Hypothesis Template

```
Hypothesis: [What you think is wrong]
Evidence for: [What observations support this]
Evidence against: [What observations contradict this]
Test: [How to prove/disprove this in <5 minutes]
Likelihood: [HIGH / MEDIUM / LOW]
```

### Hypothesis Ranking Rules

| Rank by | Why |
|---------|-----|
| **Recency** | What changed most recently is the most likely cause |
| **Simplicity** | Simple explanations before complex ones (Occam's razor) |
| **Evidence** | More supporting observations = higher rank |
| **Scope** | Narrower scope hypotheses first (one file vs. whole system) |

### Generate at Least 3 Hypotheses

Even if you're "sure" you know the cause, generate at least 3 hypotheses:

```
H1 (HIGH):  Recent config change broke the database connection
            Evidence: git log shows config changed yesterday
            Test: Revert config, check if error disappears

H2 (MEDIUM): Database connection pool exhausted
             Evidence: Error happens under load, not immediately
             Test: Check active connections count

H3 (LOW):   DNS resolution failing for DB host
            Evidence: Error mentions connection timeout
            Test: nslookup db-host
```

---

## Phase 3: TEST

Prove or disprove each hypothesis, starting from highest ranked.

### Testing Rules

1. **Test ONE variable at a time** -- Never change two things at once
2. **Record every result** -- Even negative results are data
3. **Revert failed attempts** -- Don't leave debugging artifacts in the code
4. **Time-box each test** -- Max 10 minutes per hypothesis test

### Testing Methods

| Method | When to Use | Example |
|--------|------------|---------|
| **Bisect** | "It worked before" | `git bisect` to find the breaking commit |
| **Isolate** | Complex system | Test the component in isolation |
| **Reduce** | Large input triggers bug | Minimize the input that reproduces it |
| **Compare** | Works in one env, not another | Diff the two environments |
| **Log** | Intermittent bugs | Add logging at key points, reproduce |
| **Revert** | Recent change suspected | Revert and test |

### Record Results

```
H1: Revert config → Error PERSISTS → DISPROVED
H2: Check connections → Pool at 95% capacity → LIKELY CAUSE
H3: DNS lookup → Resolves correctly → DISPROVED
```

---

## Phase 4: FIX

Apply the fix for the VERIFIED root cause.

### Fix Protocol

```
1. Write a test that reproduces the bug (see test-driven-development skill)
2. Apply the minimum fix
3. Run the reproducing test → PASSES
4. Run ALL tests → Still pass (no regressions)
5. Verify in the original failing environment
6. Document what caused it and why the fix works
```

### Fix Quality Check

| Question | If NO |
|----------|-------|
| Does the fix address the ROOT CAUSE (not just the symptom)? | Find the real root cause |
| Is this the MINIMUM change needed? | Reduce the change |
| Could this fix break anything else? | Add regression tests |
| Is the fix documented? | Add a code comment explaining WHY |
| Would this bug be caught by existing tests? | Add the test |

---

## The 3-Strike Rule

> **If 3+ fix attempts fail, STOP fixing and question the architecture.**

When you've tried 3 fixes and none work:

```
ESCALATION CHECKLIST:
1. Am I fixing the right thing? (Re-examine the root cause)
2. Is my mental model of the system correct? (Read the actual code, not what you think it does)
3. Is the architecture fundamentally flawed for this use case? (Maybe the fix is a redesign)
4. Am I making the same category of mistake? (Step back and try a completely different approach)
5. Should I ask for a second pair of eyes? (Fresh perspective finds what you're blind to)
```

**Common 3-strike situations:**

| Pattern | What It Usually Means |
|---------|----------------------|
| Fix A breaks B, fix B breaks C | Circular dependency or tight coupling |
| Fix works locally, fails in CI/prod | Environment assumption baked into code |
| Fix works for case 1, breaks case 2 | Missing abstraction or wrong data model |
| "I don't understand why this doesn't work" | Read the code more carefully -- your model is wrong |

---

## Debugging Anti-Patterns

| Anti-Pattern | Why It's Harmful | Do Instead |
|-------------|-----------------|-----------|
| **Shotgun debugging** (change random things) | Creates new bugs, wastes time | Follow the 4-phase process |
| **Blame-driven debugging** ("the library is broken") | 99% of the time it's your code | Assume your code is wrong until proven otherwise |
| **Print-statement-only debugging** | Slow, incomplete, leaves artifacts | Use debugger + structured logging |
| **"It works now, I don't know why"** | The bug WILL come back | Understand the root cause or you haven't fixed it |
| **Fixing the symptom** ("add a try-catch around it") | Masks the real problem | Fix the root cause |
| **Debugging in production** | Risk of making things worse | Reproduce locally first |

---

## Rationalizations to Reject

| Rationalization | Why It's Wrong | Required Action |
|----------------|---------------|----------------|
| "Let me just try this quick fix" | Skipping observation and hypothesis phases | Follow the 4-phase process |
| "I know what the problem is" | Confirmation bias -- your first guess is wrong 60% of the time | Generate 3 hypotheses minimum |
| "It's probably a library bug" | It's almost certainly your code | Prove it's the library with a minimal reproduction |
| "I'll just restart the service" | Restarts mask the root cause | Find and fix the actual bug |
| "It only happens sometimes" | Intermittent bugs are the most important to fix | Add logging, find the race condition or state dependency |
| "We can't reproduce it" | You haven't tried hard enough | Match the exact environment, data, and timing |
| "The fix is too risky to deploy" | The bug is already in production | Assess both risks, not just the fix risk |

---

## Integration with Other Skills

- **verification/SKILL.md**: Iron Law applies -- verify the fix by executing and reading output
- **test-driven-development/SKILL.md**: Bug fix starts with a failing test
- **code-review/SKILL.md**: Review the fix for regression risk

---

## Pre-Completion Checklist

- [ ] Root cause identified and documented (not just the symptom)
- [ ] Fix verified by running tests (Iron Law: execute and read output)
- [ ] Regression test added for this specific bug
- [ ] All existing tests still pass
- [ ] No debugging artifacts left in code (console.log, TODO-REMOVE, etc.)
- [ ] Code comment added explaining the fix if non-obvious

---

*The best debuggers are not the fastest coders. They are the most disciplined observers.*
