# Risk Management — Identifying and Mitigating Project Risks

<!-- hint:slides topic="Risk management: identification techniques, probability-impact matrix, response strategies, and risk register" slides="5" -->

## What Is a Risk?

A **risk** is an uncertain event or condition that, if it occurs, has a positive or negative effect on project objectives. In practice, we usually focus on *threats* (negative risks).

**Risk = Uncertainty × Impact**

Something is a risk when:
1. **It might happen** (uncertainty)
2. **It would matter if it did** (impact)
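
The two conditions can be sketched in a few lines of Python. This is only an illustration: the function names `is_risk` and `exposure` and the figures used are assumptions, not part of any standard.

```python
def is_risk(probability: float) -> bool:
    """A risk is uncertain: it might happen, but it is not guaranteed."""
    return 0.0 < probability < 1.0


def exposure(probability: float, impact: float) -> float:
    """Probability-weighted impact, in whatever unit `impact` is measured."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return probability * impact


# Hypothetical figures: a 30% chance of an event that would cost 10 working days.
print(is_risk(0.30))       # uncertain, so it is a risk
print(is_risk(1.0))        # certain, so it is a problem to solve now
print(exposure(0.30, 10))  # probability-weighted days at stake
```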

"I'm 100% sure we'll miss the deadline" is not a risk—it's a problem. "There's a 30% chance the API vendor delays their release" is a risk.

## Risk Identification Techniques

### Brainstorming

Gather the team and ask: "What could go wrong?" Use prompts:
- Technical: integrations, performance, security
- People: key person dependency, skill gaps
- External: vendors, regulators, market
- Process: coordination, handoffs, tooling

### Pre-Mortem

Imagine the project has failed. Ask: "What went wrong?" Treating the failure as a fact makes it easier to name concrete causes than asking what might go wrong in the abstract. List causes without judging: aim for quantity first and prioritize later.

### SWOT Analysis

| Strengths | Weaknesses |
|-----------|------------|
| Opportunities | Threats |

Threats are risks. Weaknesses can become risks if they're exposed (e.g., "no automated tests" → risk of regression).

### Dependency Mapping

Draw a dependency graph: who and what do we depend on? Each dependency is a potential risk. Typical candidates are external APIs, third-party services, other teams, and legacy systems.

## Risk Matrix (Probability × Impact)

Plot risks on a 2×2 or 3×3 grid:

```mermaid
quadrantChart
  title Risk Matrix
  x-axis Low Impact --> High Impact
  y-axis Low Probability --> High Probability
  quadrant-1 Avoid or Transfer
  quadrant-2 Monitor
  quadrant-3 Accept
  quadrant-4 Monitor
  "Vendor delay": [0.3, 0.8]
  "Key person leaves": [0.7, 0.6]
  "API rate limits": [0.5, 0.4]
  "Scope creep": [0.8, 0.7]
```

| Probability | Low Impact | Medium Impact | High Impact |
|-------------|------------|---------------|-------------|
| **High** | Monitor | Mitigate | Avoid or Transfer |
| **Medium** | Accept | Mitigate | Mitigate |
| **Low** | Accept | Accept | Monitor |

**High probability + High impact** = Address first. Avoid, transfer, or aggressively mitigate.

**Low probability + Low impact** = Accept. Document and move on.
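
The 3×3 table can be expressed as a simple lookup. The sketch below copies the cells from the table above; the helper names and the 1/3 and 2/3 cut-offs used to bucket a 0-1 score into L/M/H are assumptions you would tune for your own project.

```python
# Suggested response per (probability, impact) cell, copied from the table above.
MATRIX = {
    ("H", "L"): "Monitor", ("H", "M"): "Mitigate", ("H", "H"): "Avoid or Transfer",
    ("M", "L"): "Accept",  ("M", "M"): "Mitigate", ("M", "H"): "Mitigate",
    ("L", "L"): "Accept",  ("L", "M"): "Accept",   ("L", "H"): "Monitor",
}


def level(score: float) -> str:
    """Bucket a 0-1 score into L/M/H. The 1/3 and 2/3 cut-offs are assumptions."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score < 1 / 3:
        return "L"
    if score < 2 / 3:
        return "M"
    return "H"


def suggested_response(probability: float, impact: float) -> str:
    """Look up the table cell for a (probability, impact) pair."""
    return MATRIX[(level(probability), level(impact))]


print(suggested_response(0.9, 0.9))  # Avoid or Transfer
print(suggested_response(0.5, 0.9))  # Mitigate
print(suggested_response(0.1, 0.2))  # Accept
```

The lookup only suggests a starting point; the team still decides the actual response.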

## Risk Response Strategies

| Strategy | Meaning | Example |
|----------|---------|---------|
| **Avoid** | Eliminate the risk by changing the approach | Don't use the risky vendor; build in-house |
| **Mitigate** | Reduce probability or impact | Add integration tests; have a fallback API |
| **Transfer** | Shift impact to someone else | Insurance, warranties, contracts, SLA penalties |
| **Accept** | Acknowledge and plan for it | Document; have a contingency budget or buffer |

You don't eliminate all risks. You choose which to address and which to accept.
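
One way to decide between mitigating and accepting is to compare what a mitigation costs with how much exposure it removes. The sketch below is a rough heuristic, not a formal method, and every figure in it is hypothetical.

```python
def exposure(probability: float, impact_cost: float) -> float:
    """Probability-weighted cost of a risk."""
    return probability * impact_cost


def mitigation_pays_off(before: float, after: float, mitigation_cost: float) -> bool:
    """True when the reduction in exposure exceeds what the mitigation costs."""
    return (before - after) > mitigation_cost


# Hypothetical: a 40% chance of an outage that would cost 50,000.
before = exposure(0.40, 50_000)

# Assumed mitigation: a fallback API that leaves the probability unchanged
# but cuts the impact to 10,000, and costs 6,000 to build.
after = exposure(0.40, 10_000)

print(mitigation_pays_off(before, after, mitigation_cost=6_000))   # worth mitigating
print(mitigation_pays_off(before, after, mitigation_cost=20_000))  # cheaper to accept
```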

## Risk Register

A **risk register** tracks identified risks, their assessment, and response.

| ID | Risk | Prob | Impact | Response | Owner |
|----|------|------|--------|----------|-------|
| R1 | API vendor delays v2 | M | H | Mitigate: parallel prototype | Dev Lead |
| R2 | Key architect on vacation during integration | L | H | Mitigate: knowledge-sharing sessions beforehand | PM |
| R3 | Performance regression in legacy module | M | M | Mitigate: load tests in CI | QA |

Template:

```markdown
## Risk Register

| ID | Risk Description | Probability | Impact | Response | Owner |
|----|------------------|-------------|--------|----------|-------|
| R1 | [What could go wrong] | L/M/H | L/M/H | Avoid/Mitigate/Transfer/Accept | [Name] |
```
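
If you would rather keep the register as data and generate the table, a small dataclass is enough. This is a minimal sketch: the `Risk` class and `to_markdown` helper are illustrative names, and the rows are taken from the example register above.

```python
from dataclasses import dataclass

LEVELS = ("L", "M", "H")


@dataclass
class Risk:
    id: str
    description: str
    probability: str  # "L", "M", or "H"
    impact: str       # "L", "M", or "H"
    response: str
    owner: str

    def __post_init__(self) -> None:
        # Reject anything outside the L/M/H scale used in the template.
        for name in ("probability", "impact"):
            if getattr(self, name) not in LEVELS:
                raise ValueError(f"{name} must be one of {LEVELS}")


def to_markdown(risks: list[Risk]) -> str:
    """Render the register in the same column layout as the template."""
    lines = [
        "| ID | Risk Description | Probability | Impact | Response | Owner |",
        "|----|------------------|-------------|--------|----------|-------|",
    ]
    for r in risks:
        lines.append(
            f"| {r.id} | {r.description} | {r.probability} | {r.impact} "
            f"| {r.response} | {r.owner} |"
        )
    return "\n".join(lines)


register = [
    Risk("R1", "API vendor delays v2", "M", "H", "Mitigate: parallel prototype", "Dev Lead"),
    Risk("R3", "Performance regression in legacy module", "M", "M", "Mitigate: load tests in CI", "QA"),
]
table = to_markdown(register)
print(table)
```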

## Dependency Mapping in Practice

Map dependencies to surface risks:

```mermaid
flowchart LR
  A[Our Service] --> B[Auth API]
  A --> C[Payment Gateway]
  A --> D[Legacy DB]
  C --> E[Stripe]
  D --> F[Ops Team]
```

Each arrow is a dependency and a potential failure point. For each one, document: What happens if the Auth API, Payment Gateway, or Legacy DB fails? Who owns it?
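
The same graph can be written as a dictionary and queried for the blast radius of a failure. The sketch below mirrors the diagram above; `DEPENDS_ON` and `affected_by` are illustrative names, not an established tool.

```python
# Each key depends on the components in its list (same edges as the diagram).
DEPENDS_ON = {
    "Our Service": ["Auth API", "Payment Gateway", "Legacy DB"],
    "Payment Gateway": ["Stripe"],
    "Legacy DB": ["Ops Team"],
}


def affected_by(failed: str, depends_on: dict[str, list[str]]) -> set[str]:
    """Return every component that depends on `failed`, directly or transitively."""
    affected: set[str] = set()
    changed = True
    while changed:  # repeat until no new component is pulled in
        changed = False
        for component, deps in depends_on.items():
            if component in affected:
                continue
            if any(dep == failed or dep in affected for dep in deps):
                affected.add(component)
                changed = True
    return affected


print(sorted(affected_by("Stripe", DEPENDS_ON)))    # ['Our Service', 'Payment Gateway']
print(sorted(affected_by("Auth API", DEPENDS_ON)))  # ['Our Service']
```

A dependency with a large affected set and no clear owner is a good candidate for the risk register.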

## Technical Risks

| Category | Example Risks |
|----------|---------------|
| **Integration** | API changes, rate limits, auth failures, version mismatch |
| **Performance** | Load, latency, scalability under peak |
| **Security** | Vulnerabilities, credential exposure, compliance |
| **Legacy** | Unknown behavior, no tests, single point of failure |
| **Data** | Migration failures, corruption, GDPR/retention |

For each, ask: What's the probability? What's the impact? What's our response?

## Communicating Risks Without Being the "Doom Person"

1. **Lead with context** — "Here are the risks we're tracking and how we're addressing them."
2. **Pair risks with responses** — Don't just list problems; show the plan.
3. **Use RAG status** — Red/Amber/Green. Amber = "we're watching it."
4. **Avoid surprise** — Surface risks early. "I've been tracking this; here's the update."
5. **Frame as ownership** — "We're on it" vs "Something bad might happen."

The goal is *transparency*, not alarm. Stakeholders trust you when you show you're managing risk, not ignoring it.

## Monitoring and Updating Risks

Risks change. Review the register:

- **Weekly** — In team sync; any new risks? Any change in probability/impact?
- **At milestones** — Major phase gates; re-assess top risks.
- **When triggers fire** — "If X happens, we re-evaluate." Define triggers.

Close risks when they're no longer relevant (avoided, passed, or occurred and handled).
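
The review cadence can be checked mechanically. The sketch below assumes each risk records when it was last reviewed and whether one of its triggers has fired; the function name and the fixed dates are illustrative only.

```python
from datetime import date, timedelta

REVIEW_INTERVAL = timedelta(days=7)  # weekly cadence, as in the list above


def needs_review(last_reviewed: date, today: date, trigger_fired: bool = False) -> bool:
    """A risk is due when a trigger fired or the weekly review window has elapsed."""
    return trigger_fired or (today - last_reviewed) >= REVIEW_INTERVAL


today = date(2024, 3, 15)  # fixed date so the example is reproducible
print(needs_review(date(2024, 3, 12), today))                      # False: reviewed 3 days ago
print(needs_review(date(2024, 3, 1), today))                       # True: 14 days old
print(needs_review(date(2024, 3, 12), today, trigger_fired=True))  # True: trigger overrides
```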

## Risk Management Flow

```mermaid
flowchart TD
  A[Identify] --> B[Assess]
  B --> C[Prioritize]
  C --> D[Plan Response]
  D --> E[Implement]
  E --> F[Monitor]
  F -.->|Update| A
```

**Identify** → **Assess** (prob × impact) → **Prioritize** (matrix) → **Plan response** (avoid/mitigate/transfer/accept) → **Implement** → **Monitor** (and loop).
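
The Assess and Prioritize steps can be sketched as a score and a sort. Mapping L/M/H to 1/2/3 and multiplying is a common shortcut rather than a rigorous measure, and the scale here is an assumption; the rows reuse the example register.

```python
SCALE = {"L": 1, "M": 2, "H": 3}  # assumed ordinal scale


def score(probability: str, impact: str) -> int:
    """Assess: probability times impact on the assumed 1-3 scale."""
    return SCALE[probability] * SCALE[impact]


# (ID, probability, impact) from the example register.
register = [("R1", "M", "H"), ("R2", "L", "H"), ("R3", "M", "M")]

# Prioritize: highest score first.
ranked = sorted(register, key=lambda r: score(r[1], r[2]), reverse=True)
print([risk_id for risk_id, _, _ in ranked])  # ['R1', 'R3', 'R2']
```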
